
A Brief Introduction to Apache Hadoop and the Big Data Ecosystem

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

Overview of Hadoop

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation.

Users can develop distributed programs without knowing the underlying details of distribution, and can make full use of the cluster for high-speed computation and storage.

Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which makes it suitable for applications with very large data sets.
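As a rough illustration of why HDFS tolerates the failure of cheap hardware, the sketch below splits a file into fixed-size blocks and keeps several copies of each block on different nodes. The block size (128 MB) and replication factor (3) match the Hadoop 2 defaults for `dfs.blocksize` and `dfs.replication`; the function names, node names, and the round-robin placement are assumptions made for this example only. Real HDFS placement is rack-aware and considerably more involved.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the Hadoop 2 default block size
REPLICATION = 3                 # default number of copies per block


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a list of (block_index, block_length, replica_nodes) tuples."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    n_blocks = math.ceil(file_size / block_size)
    plan = []
    for i in range(n_blocks):
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - i * block_size)
        # Simplified round-robin placement (an assumption, not the HDFS policy).
        start = i % len(nodes)
        replicas = [nodes[(start + k) % len(nodes)] for k in range(replication)]
        plan.append((i, length, replicas))
    return plan


def survives_failure(plan, failed_node):
    """True if every block still has a replica on some other node."""
    return all(
        any(node != failed_node for node in replicas)
        for _, _, replicas in plan
    )


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
plan = plan_blocks(300 * 1024 * 1024, nodes)  # a 300 MB file
for index, length, replicas in plan:
    print(index, length // (1024 * 1024), "MB", replicas)
```

Because every block lives on three distinct nodes, losing any single node in this toy cluster leaves each block readable from another copy, which is what `survives_failure` checks.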

The core of the Hadoop framework is HDFS and MapReduce. HDFS provides storage for massive data, while MapReduce provides computation over massive data.
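To make the MapReduce half of that split concrete, here is a minimal single-process sketch of the programming model, using the classic word count. It only imitates the map, shuffle, and reduce steps in plain Python; it does not use the Hadoop API, and the function names and sample input are invented for illustration. On a real cluster the map and reduce calls would run in parallel on many machines over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Map every line, then group by word, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


lines = [
    "Hadoop stores data in HDFS",
    "MapReduce processes data in parallel",
]
counts = word_count(lines)
print(counts["data"])  # 2
```

The point of the model is that `map_phase` and `reduce_phase` have no shared state, so the framework is free to run them on whichever nodes hold the relevant data.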

Distributed storage

In a distributed storage system, data scattered across different nodes may belong to the same file. To organize a large number of files, files can be placed in folders, and folders can be nested level by level. This form of organization is called a namespace. The namespace manages all files in the entire server cluster.

Distributed computing

Distributed computing divides a problem that requires a lot of computing power into many small parts, allocates these parts to many computers for processing, and finally combines the partial results to get the final result.

Hadoop-related projects

Ambari™: a web-based operations tool for provisioning, managing, and monitoring Hadoop clusters.

Avro™: a data serialization system.

HBase™: a scalable, distributed database that supports structured data storage for large tables.

Hive™: a data warehouse infrastructure that supports data summarization and ad hoc queries.

Mahout™: a scalable machine learning and data mining library.

Pig™: a high-level data flow language and execution framework for parallel computation.

Spark™: a fast, general-purpose compute engine for Hadoop data.

Tez™: a general data flow programming framework.

ZooKeeper™: a high-performance coordination service for distributed applications.

Hadoop versions

Hadoop is available in roughly the following distributions:

Apache

The official release.

Cloudera (CDH)

The most widely downloaded distribution. It is stable, commercially supported, and adds some patches on top of the Apache release. Recommended.

HortonWorks (HDP)

An integrated distribution built on the Apache release.

MapR

Hadoop module composition

Hadoop 2 includes four modules:

Hadoop Common

The common utilities that support the other Hadoop modules.

Hadoop Distributed File System (HDFS™)

A distributed file system that provides high-throughput access to application data.

Hadoop YARN

A framework for job scheduling and cluster resource management.

Hadoop MapReduce

A YARN-based system for parallel processing of large data sets.

Introduction to Hadoop 1 and Hadoop 2

Hadoop 1

HDFS: the Hadoop Distributed File System.

MapReduce: the distributed computing model.

Hadoop 2

HDFS2: the Hadoop Distributed File System.

YARN: a resource management platform on which distributed computing runs. Typical computing models that run on it include MapReduce, Storm, and Spark.
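The idea of one resource manager serving several computing models can be sketched with a toy allocator. Everything in the block below is hypothetical: `ToyResourceManager`, the node names, the memory figures, and the first-fit policy are stand-ins chosen for illustration and are not the YARN API or its scheduling algorithm.

```python
class ToyResourceManager:
    """Hypothetical stand-in for a cluster resource manager; not the YARN API."""

    def __init__(self, node_memory_mb):
        # Free memory per node, e.g. {"node1": 4096}.
        self.free = dict(node_memory_mb)
        self.allocations = []  # (application, node, memory_mb)

    def request(self, application, memory_mb):
        """Grant a container on the first node with enough free memory."""
        for node, free_mb in self.free.items():
            if free_mb >= memory_mb:
                self.free[node] = free_mb - memory_mb
                self.allocations.append((application, node, memory_mb))
                return node
        # No capacity left; a real scheduler would queue the request instead.
        return None


rm = ToyResourceManager({"node1": 4096, "node2": 2048})
placed = [
    rm.request("mapreduce-job", 3072),
    rm.request("spark-job", 2048),
    rm.request("storm-topology", 2048),
]
print(placed)
```

The applications never choose a node themselves. They only state what they need, and the manager decides where it fits, which is what lets different frameworks share one cluster.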

For details, please refer to http://hadoop.apache.org.
