Hadoop 1 v/s 2

Sl. No.	Hadoop 1	Hadoop 2
1	Supports the MapReduce (MR) processing model only. Does not support non-MR tools.	Supports MR as well as other distributed computing models such as Spark, Hama, Giraph, Message Passing Interface (MPI), and HBase coprocessors.
2	MR does both processing and cluster resource management.	YARN (Yet Another Resource Negotiator) does cluster resource management, and processing is handled separately by different processing models.
3	Has limited scalability: up to 4,000 nodes per cluster.	Has better scalability: up to 10,000 nodes per cluster.
4	Works on the concept of slots: a slot can run either a Map task or a Reduce task only.	Works on the concept of containers: a container can run generic tasks.
5	A single Namenode manages the entire namespace.	Multiple Namenode servers manage multiple namespaces.
6	Has a Single Point of Failure (SPOF) because of the single Namenode; a Namenode failure needs manual intervention to overcome.	Has a feature to overcome the SPOF: a standby Namenode. In the case of Namenode failure, it can be configured for automatic recovery.
7	The MR API is compatible with Hadoop 1.x. A program written for Hadoop 1 runs on Hadoop 1.x without any additional files.	The MR API requires additional files for a program written for Hadoop 1.x to run on Hadoop 2.x.
8	Is limited as a platform for event processing, streaming, and real-time operations.	Can serve as a platform for a wide variety of data analytics: event processing, streaming, and real-time operations are all possible.
9	A Namenode failure affects the stack.	The Hadoop stack (Hive, Pig, HBase, etc.) is equipped to handle Namenode failure.
10	Does not support Microsoft Windows.	Adds support for Microsoft Windows.
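
The slots-versus-containers difference in row 4 can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the function names `schedule_with_slots` and `schedule_with_containers` are hypothetical, every task is assumed to need exactly one slot or one container, and none of this is Hadoop or YARN API code.

```python
def schedule_with_slots(map_tasks, reduce_tasks, map_slots, reduce_slots):
    """Hadoop 1 style (toy model): slots are typed, so a pending Map task
    cannot use a free Reduce slot, and vice versa."""
    running_maps = min(map_tasks, map_slots)
    running_reduces = min(reduce_tasks, reduce_slots)
    idle = (map_slots - running_maps) + (reduce_slots - running_reduces)
    return {"running": running_maps + running_reduces, "idle": idle}


def schedule_with_containers(map_tasks, reduce_tasks, containers):
    """Hadoop 2 style (toy model): containers are generic, so any pending
    task can use any free container."""
    running = min(map_tasks + reduce_tasks, containers)
    return {"running": running, "idle": containers - running}


if __name__ == "__main__":
    # 10 pending Map tasks and no Reduce tasks, with the same total capacity (8).
    print(schedule_with_slots(10, 0, map_slots=4, reduce_slots=4))
    print(schedule_with_containers(10, 0, containers=8))
```

In this toy workload the slot model runs 4 tasks and leaves the 4 Reduce slots idle, while the container model runs 8 tasks with nothing idle. Real YARN containers are also sized by memory and CPU, which this sketch ignores.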
