2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Overview of Hadoop
Hadoop is a distributed system infrastructure developed by the Apache Software Foundation.
Users can develop distributed programs without knowing the low-level details of distribution, and can use the full power of a cluster for high-speed computation and storage.
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications with very large data sets.
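To make the fault-tolerance idea concrete, here is a minimal Python sketch of how a file might be cut into fixed-size blocks, with each block copied to several nodes. It is a toy model, not HDFS's actual placement policy: the function names, the round-robin placement, and the tiny sizes in the usage note are assumptions made for illustration.

```python
def split_into_blocks(file_size, block_size):
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    Hypothetical placement rule: real HDFS uses its own placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a live node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )
```

For example, `split_into_blocks(300, 128)` yields three blocks, and with `place_replicas(3, ["n1", "n2", "n3", "n4"], 3)` the file stays readable if any single node fails, because every block has replicas on other nodes.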
The core of Hadoop's design is HDFS and MapReduce: HDFS provides storage for massive data sets, while MapReduce provides computation over them.
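The MapReduce half of that design can be sketched in a few lines of plain Python. This is only an in-memory illustration of the map, shuffle, and reduce steps using a word count. It does not use Hadoop's API, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

On a real cluster each chunk would be mapped on a different machine. Here `word_count(["big data", "Big cluster data"])` simply processes the chunks one after another and merges the counts.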
Distributed storage
In a distributed storage system, data scattered across different nodes may belong to the same file. To organize a large number of files, files can be placed in folders, and folders can be nested level by level. This form of organization is called a namespace. The namespace manages all files in the entire server cluster.
Distributed computing
Distributed computing divides a problem that requires a great deal of computing power into many small parts, assigns those parts to many computers for processing, and then combines the partial results into the final result.
Hadoop-related projects
Ambari™: a web-based tool for provisioning, managing, and monitoring Hadoop clusters.
Avro™: a data serialization system.
HBase™: a scalable, distributed database that supports structured data storage for large tables.
Hive™: a data warehouse infrastructure that supports data summarization and ad hoc querying.
Mahout™: a scalable machine learning and data mining library.
Pig™: a high-level data-flow language and execution framework for parallel computation.
Spark™: a fast, general-purpose compute engine for Hadoop data.
Tez™: a general data-flow programming framework.
ZooKeeper™: a high-performance coordination service for distributed applications.
Hadoop version
Hadoop distributions fall roughly into the following groups:
Apache
The official release.
Cloudera (CDH)
The most widely downloaded distribution; stable and commercially supported, with some patches applied on top of the Apache release. Recommended.
HortonWorks (HDP)
An integrated distribution based on the Apache release.
MapR
Hadoop module composition
Hadoop2 includes four modules.
Hadoop Common
The common utilities that support the other Hadoop modules.
Hadoop Distributed File System (HDFS™)
A distributed file system that provides high-throughput access to application data.
Hadoop YARN
A framework for job scheduling and cluster resource management.
Hadoop MapReduce
A YARN-based system for parallel processing of large data sets.
Introduction to Hadoop1 and Hadoop2
Hadoop1
HDFS: Hadoop Distributed File System
MapReduce: distributed computing model
Hadoop2
HDFS2: Hadoop Distributed File System
YARN: a resource management platform on which distributed computations run. Typical computing models include MapReduce, Storm, and Spark.
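The idea of one resource platform shared by several computing models can be illustrated with a toy allocator in Python. This is a sketch under heavy assumptions: `ToyResourceManager` and its methods are hypothetical names, and real YARN (with its ResourceManager, NodeManagers, per-application ApplicationMasters, and pluggable schedulers) is far more involved.

```python
class ToyResourceManager:
    """Hands out fixed-size 'containers' from one shared pool (toy model)."""

    def __init__(self, total_containers):
        self.free = total_containers
        self.allocated = {}  # application name -> containers currently held

    def request(self, app_name, wanted):
        """Grant up to `wanted` containers, limited by what is still free."""
        granted = min(wanted, self.free)
        self.free -= granted
        self.allocated[app_name] = self.allocated.get(app_name, 0) + granted
        return granted

    def release(self, app_name):
        """Return all containers held by an application to the pool."""
        returned = self.allocated.pop(app_name, 0)
        self.free += returned
        return returned
```

In this model a MapReduce job and a Spark job draw from the same pool: if the first takes 6 of 10 containers, the second can get at most 4 until the first releases its share.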
For details, please refer to http://hadoop.apache.org.