Hadoop 1 vs. Hadoop 2


| Sl No | Hadoop 1 | Hadoop 2 |
|---|---|---|
| 1 | Supports only the MapReduce (MR) processing model. Does not support non-MR tools. | Supports MR as well as other distributed computing models such as Spark, Hama, Giraph, Message Passing Interface (MPI), and HBase coprocessors. |
| 2 | MR handles both processing and cluster resource management. | YARN (Yet Another Resource Negotiator) handles cluster resource management; processing is done by separate processing models. |
| 3 | Limited scalability: up to 4,000 nodes per cluster. | Better scalability: up to 10,000 nodes per cluster. |
| 4 | Works on the concept of slots. A slot can run either a Map task or a Reduce task only. | Works on the concept of containers. A container can run generic tasks. |
| 5 | A single Namenode manages the entire namespace. | Multiple Namenode servers manage multiple namespaces. |
| 6 | The single Namenode is a single point of failure (SPOF). A Namenode failure needs manual intervention to recover. | Overcomes the SPOF with a standby Namenode, which can be configured for automatic recovery when the active Namenode fails. |
| 7 | MR API is compatible with Hadoop 1.x. A program written for Hadoop 1 runs on Hadoop 1.x without any additional files. | A program written for Hadoop 1.x requires additional files to run on Hadoop 2.x. |
| 8 | Limited as a platform for event processing, streaming, and real-time operations. | Can serve as a platform for a wide variety of data analytics, including event processing, streaming, and real-time operations. |
| 9 | A Namenode failure affects the whole stack. | The Hadoop stack (Hive, Pig, HBase, etc.) is equipped to handle Namenode failure. |
| 10 | Does not support Microsoft Windows. | Adds support for Microsoft Windows. |
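
The difference between slots and containers (point 4) can be illustrated with a toy model. This is only a minimal sketch and not Hadoop or YARN code: the class names SlotNode and ContainerNode, the slot counts, and the memory sizes are all assumptions made up for the example. It shows how a node with fixed Map and Reduce slots can leave capacity idle when the workload is map-heavy, while a node that hands out generic containers does not care about the task type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlotNode:
    """Hypothetical Hadoop 1 style node with fixed map and reduce slots."""
    map_slots: int
    reduce_slots: int

    def schedule(self, map_tasks: int, reduce_tasks: int) -> int:
        # A map task can only use a map slot and a reduce task only a
        # reduce slot, so unused slots of one kind cannot help the other.
        return min(map_tasks, self.map_slots) + min(reduce_tasks, self.reduce_slots)


@dataclass
class ContainerNode:
    """Hypothetical Hadoop 2 style node that grants generic containers."""
    memory_mb: int

    def schedule(self, requests_mb: List[int]) -> int:
        # Grant each request in order while enough memory is free.
        # The kind of task running in the container is irrelevant.
        free = self.memory_mb
        granted = 0
        for mb in requests_mb:
            if mb <= free:
                free -= mb
                granted += 1
        return granted


if __name__ == "__main__":
    # Map-heavy workload: 6 map tasks, 0 reduce tasks.
    slot_node = SlotNode(map_slots=4, reduce_slots=2)
    print(slot_node.schedule(map_tasks=6, reduce_tasks=0))   # 4, two reduce slots stay idle

    # Same capacity expressed as memory: six tasks of 1024 MB each.
    container_node = ContainerNode(memory_mb=6144)
    print(container_node.schedule([1024] * 6))               # 6, all tasks run
```

In this model both nodes have room for six tasks, but the slot-based node runs only four of the six map tasks because its two reduce slots cannot be reused.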
