What's the difference between Hadoop2 and Hadoop3?

Updated 2025-04-04. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01.

This article explains the differences between Hadoop 2 and Hadoop 3 and is intended as a reference for readers who work with either version.

Comparison between Hadoop 2.x and Hadoop 3.x

This section covers 22 points of comparison between Hadoop 2.x and Hadoop 3.x, discussed one by one below.

2.1 License

Hadoop 2.x - Apache License 2.0, open source

Hadoop 3.x - Apache License 2.0, open source

2.2 Minimum supported Java version

Hadoop 2.x - the minimum supported Java version is Java 7.

Hadoop 3.x - the minimum supported Java version is Java 8.

2.3 Fault tolerance

Hadoop 2.x - fault tolerance is handled by replication, which wastes storage space.

Hadoop 3.x - fault tolerance can be handled by erasure coding, in addition to replication.
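
To give a feel for how erasure coding recovers lost data without keeping full copies, here is a minimal sketch that uses a single XOR parity block. This is a deliberate simplification: HDFS erasure coding policies are typically based on Reed-Solomon codes, which tolerate more than one lost block. The function names below are hypothetical and are not part of any Hadoop API.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity.

    XOR-ing the parity with every surviving block cancels them out,
    leaving only the bytes of the block that was lost.
    """
    return xor_parity(list(surviving_blocks) + [parity])


# Three toy "blocks" of equal length (illustrative data only).
data = [b"had", b"oop", b"3.x"]
parity = xor_parity(data)

# Simulate losing the second block and rebuilding it.
rebuilt = recover_missing([data[0], data[2]], parity)
print(rebuilt == data[1])
```

With three data blocks and one parity block, this toy scheme stores 4 blocks instead of the 9 that 3x replication would need, at the cost of tolerating only one lost block.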

2.4 Data balancing

Hadoop 2.x - uses the HDFS balancer to balance data across DataNodes.

Hadoop 3.x - adds an intra-DataNode balancer, which balances data across the disks within a single DataNode and is invoked through the hdfs diskbalancer CLI.

2.5 Storage scheme

Hadoop 2.x - uses a 3x replication scheme.

Hadoop 3.x - supports erasure coding in HDFS.

2.6 Storage overhead

Hadoop 2.x - HDFS has 200% storage overhead with 3x replication.

Hadoop 3.x - with erasure coding, the storage overhead is only 50%.

2.7 Example of storage overhead

Hadoop 2.x - if there are 6 blocks, 18 blocks of space are occupied because of the replication scheme.

Hadoop 3.x - if there are 6 blocks, 9 blocks of space are occupied: 6 data blocks and 3 parity blocks.
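
The arithmetic behind the 200% and 50% figures can be checked with a short sketch. It assumes 3x replication on one side and a layout of 6 data units plus 3 parity units on the other, matching the example above. The function names are hypothetical, and the model counts whole blocks only, which simplifies how HDFS actually stripes data.

```python
import math


def replication_footprint(data_blocks, replicas=3):
    """Blocks stored when every data block is kept `replicas` times."""
    return data_blocks * replicas


def erasure_coding_footprint(data_blocks, data_units=6, parity_units=3):
    """Blocks stored when each group of `data_units` blocks gets `parity_units` parity blocks."""
    groups = math.ceil(data_blocks / data_units)
    return data_blocks + groups * parity_units


def overhead_percent(data_blocks, stored_blocks):
    """Extra space used, as a percentage of the original data."""
    return (stored_blocks - data_blocks) / data_blocks * 100


blocks = 6
replicated = replication_footprint(blocks)
erasure_coded = erasure_coding_footprint(blocks)

print("3x replication:", replicated, "blocks,", overhead_percent(blocks, replicated), "% overhead")
print("6+3 erasure coding:", erasure_coded, "blocks,", overhead_percent(blocks, erasure_coded), "% overhead")
```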

2.8 YARN Timeline Service

Hadoop 2.x - uses the old Timeline Service, which has scalability issues.

Hadoop 3.x - introduces Timeline Service v2, which improves the scalability and reliability of the Timeline Service.

2.9 Default port range

Hadoop 2.x - in Hadoop 2.0, some default ports fall within the Linux ephemeral port range, so a service can fail to bind at startup if another process already holds the port.

Hadoop 3.x - in Hadoop 3.0, these ports have been moved out of the ephemeral port range.
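
The port conflict can be illustrated with a small check. The sketch below assumes the common Linux default ephemeral range of 32768 to 60999 (the actual range is set by net.ipv4.ip_local_port_range and may differ on your system) and uses a few default ports as commonly documented for Hadoop 2.x and 3.0. Treat the port values as illustrative and verify them against the configuration of your own release.

```python
# Assumed Linux default ephemeral range; check net.ipv4.ip_local_port_range.
EPHEMERAL_RANGE = (32768, 60999)

# A few default ports as commonly documented (assumption: verify per release).
HADOOP2_PORTS = {
    "NameNode HTTP": 50070,
    "DataNode data transfer": 50010,
    "Secondary NameNode HTTP": 50090,
}
HADOOP3_PORTS = {
    "NameNode HTTP": 9870,
    "DataNode data transfer": 9866,
    "Secondary NameNode HTTP": 9868,
}


def in_ephemeral_range(port, port_range=EPHEMERAL_RANGE):
    """Return True if the port could be handed out by the OS as an ephemeral port."""
    low, high = port_range
    return low <= port <= high


for version, ports in (("Hadoop 2.x", HADOOP2_PORTS), ("Hadoop 3.x", HADOOP3_PORTS)):
    for service, port in ports.items():
        status = "inside" if in_ephemeral_range(port) else "outside"
        print(f"{version} {service} ({port}): {status} the ephemeral range")
```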

2.10 Tools

Hadoop 2.x - Hive, Pig, Tez, Hama, Giraph, and other Hadoop tools can be used.

Hadoop 3.x - Hive, Pig, Tez, Hama, Giraph, and other Hadoop tools are also available.

2.11 Compatible file systems

Hadoop 2.x - HDFS (the default file system), the FTP file system (which stores all data on a remotely accessible FTP server), the Amazon S3 (Simple Storage Service) file system, and the Windows Azure Storage Blob (WASB) file system.

Hadoop 3.x - supports all of the above, plus the Microsoft Azure Data Lake file system.

2.12 DataNode resources

Hadoop 2.x - DataNode resources are not dedicated to MapReduce; they can be used for other applications.

Hadoop 3.x - DataNode resources are likewise available to other applications.

2.13 MR API compatibility

Hadoop 2.x - the MR API is compatible with Hadoop 1.x programs, so they can run on Hadoop 2.x.

Hadoop 3.x - the MR API is likewise compatible with Hadoop 1.x programs, so they can run on Hadoop 3.x.

2.14 Microsoft Windows support

Hadoop 2.x - can be deployed on Windows.

Hadoop 3.x - also supports Windows.

2.15 Slots / containers

Hadoop 2.x - Hadoop 1 is based on the concept of slots, whereas Hadoop 2.x is based on the concept of containers. Containers can run generic tasks.

Hadoop 3.x - is also based on the concept of containers.

2.16 Single point of failure

Hadoop 2.x - includes features to overcome the NameNode single point of failure (SPOF), so if the NameNode fails, recovery is automatic.

Hadoop 3.x - also includes features to overcome the SPOF, so whenever the NameNode fails, recovery is automatic and requires no human intervention.

2.17 HDFS Federation

Hadoop 2.x - in Hadoop 1.0, a single NameNode manages the entire namespace, but in Hadoop 2.0, multiple NameNodes manage multiple namespaces.

Hadoop 3.x - Hadoop 3.x also has multiple NameNodes for multiple namespaces.
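
One way to picture several NameNodes each owning part of the namespace is a client-side table that routes a path to the NameNode responsible for it. The sketch below is a toy model of that idea only. The host names, the mount table, and the resolve_namenode function are all hypothetical, and this is not how the Hadoop client libraries are actually implemented.

```python
# Hypothetical mapping from top-level directory to the NameNode that owns it.
MOUNT_TABLE = {
    "/user": "hdfs://nn1.example.com:8020",
    "/data": "hdfs://nn2.example.com:8020",
    "/tmp": "hdfs://nn3.example.com:8020",
}


def resolve_namenode(path, mount_table=MOUNT_TABLE):
    """Return the NameNode URI whose mount point contains the given path."""
    for prefix, namenode in mount_table.items():
        # Match the mount point itself or anything beneath it.
        if path == prefix or path.startswith(prefix + "/"):
            return namenode
    raise KeyError(f"no NameNode is mounted for {path}")


for example_path in ("/user/alice/report.csv", "/data/raw/events", "/tmp"):
    print(example_path, "->", resolve_namenode(example_path))
```

Because each NameNode holds only its own slice of the namespace, adding a NameNode adds metadata capacity without changing the paths that clients already use.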

2.18 Scalability

Hadoop 2.x - a cluster can scale up to 10,000 nodes.

Hadoop 3.x - better scalability: a cluster can scale beyond 10,000 nodes.

2.19 Faster access to data

Hadoop 2.x - DataNode caching provides fast access to data.

Hadoop 3.x - DataNode caching likewise provides fast access to data.

2.20 HDFS snapshots

Hadoop 2.x - Hadoop 2 adds support for snapshots, which provide disaster recovery and protection against user errors.

Hadoop 3.x - Hadoop 3 also supports snapshots.

2.21 Platform

Hadoop 2.x - can serve as a platform for many kinds of data analytics and can run event processing, streaming, and real-time operations.

Hadoop 3.x - can also run event processing, streaming, and real-time operations on top of YARN.

2.22 Cluster resource management

Hadoop 2.x - uses YARN for cluster resource management, which improves scalability, high availability, and multi-tenancy.

Hadoop 3.x - also uses YARN for cluster resource management, with all of the same features.

Improvements of Hadoop 3.x over Hadoop 2.x

Main improvements in Common:

Shell script rewrite

Removal of obsolete APIs

HDFS improvements:

Support for erasure coding

Support for more than two NameNodes

Intra-DataNode data balancing

Changes to the default ports of multiple services

YARN improvements:

YARN Timeline Service v.2

Support for Opportunistic Containers and Distributed Scheduling

MapReduce improvements:

MapReduce task-level native optimization

Reworked daemon and task heap management

Other new features:

Shared client jars
