Category: Hadoop

The Graph Database Model

For too long, relational database management has been the de facto standard for most of us looking to store data and retrieve values as and when desired. However,...

Apache TinkerPop

The Gremlin Graph Traversal Machine and Language: Apache TinkerPop™ is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP). Gremlin is the graph traversal language of Apache TinkerPop....

What is Hadoop?

Hadoop is a complete ecosystem of open-source frameworks from Apache. It is used to store, process, and analyze data that is very large in...

Big Data

As the name implies, “Big Data” is a huge amount of data with a complex structure that grows exponentially with time. Such data is so large and complex that none of...

MapReduce Introduction

To begin, let's start with the most basic question: what is Hadoop's MapReduce? MapReduce is a simple programming model for data processing. It is inherently parallel and essentially consists of...

Scheduling and types of scheduler in YARN

In an ideal world, the requests that a YARN application makes would be granted immediately. In the real world, however, resources are limited, and on a busy cluster, an application will often need to...

YARN Comparison to MapReduce

The major difference between MapReduce 1 and MapReduce 2, which runs on YARN, is as follows. In MapReduce 1, there are two types of daemon that control the job execution process: a jobtracker and one or more tasktrackers. The jobtracker...

YARN Introduction

Apache YARN is short for Yet Another Resource Negotiator. As the name indicates, it is Hadoop’s cluster resource management system. YARN was introduced in Hadoop version 2 to improve the MapReduce...

Hadoop File System and Operations

Hadoop has an abstract notion of filesystems, of which HDFS is just one implementation. First there is Local, a filesystem for a locally connected disk with client-side checksums. Then there is HDFS, i.e. Hadoop’s...