2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
1. Overview of the Hadoop ecosystem
Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. Users can develop distributed programs without knowing the low-level details of the distribution, making full use of the cluster for high-speed computation and storage. It is reliable, efficient, and scalable.
The core components of Hadoop are HDFS, MapReduce, and YARN.
2. HDFS
HDFS derives from Google's GFS paper, published in October 2003; it is essentially a clone of GFS. HDFS is the foundation of data storage management in Hadoop. It is a highly fault-tolerant system that can detect and respond to hardware failures.
HDFS simplifies the file consistency model and provides high-throughput access to application data through streaming reads, which suits applications with large datasets. It follows a write-once, read-many model: files are stored as fixed-size blocks distributed across different physical machines in the cluster.
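The block mechanism can be illustrated locally with ordinary shell tools. This is an analogy only, not HDFS itself: HDFS stores a file as fixed-size blocks (128 MB by default in Hadoop 2.x) replicated across DataNodes, while here we just split a small local file into 1 KB chunks.

```shell
# Analogy only: mimic HDFS-style block splitting with coreutils.
workdir=$(mktemp -d)
cd "$workdir"
# Create a ~4 KB sample file.
head -c 4096 /dev/zero > sample.dat
# Split into 1 KB "blocks" named blk_00, blk_01, ...
split -b 1024 -d -a 2 sample.dat blk_
ls blk_*    # four block files: blk_00 .. blk_03
```

In real HDFS, the NameNode tracks which DataNodes hold each block and re-replicates blocks when a node fails.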
3. MapReduce
MapReduce derives from Google's MapReduce paper and is used for computation over large volumes of data. It hides the details of the distributed computing framework and abstracts computation into two phases: map and reduce.
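The map/shuffle/reduce flow can be sketched with ordinary shell tools. This is a local analogy, not Hadoop itself: the map step emits key-value pairs, the shuffle step groups them by key, and the reduce step aggregates each group.

```shell
# Word count as a map -> shuffle -> reduce pipeline (local analogy):
result=$(printf 'hello world\nhello hadoop\n' |
  tr -s ' ' '\n' |   # map: emit one word ("key") per line
  sort |             # shuffle: bring identical keys together
  uniq -c)           # reduce: count each group of identical keys
echo "$result"       # one count per word: hadoop 1, hello 2, world 1
```

On a real cluster the map and reduce functions run in parallel across many machines, and the framework performs the shuffle over the network.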
4. HBase (distributed column-store database)
HBase derives from Google's Bigtable paper. It is a scalable, highly reliable, high-performance, column-oriented, dynamic-schema distributed database built on top of HDFS.
5. ZooKeeper
ZooKeeper solves data-management problems in a distributed environment, such as unified naming, state synchronization, cluster management, and configuration synchronization.
6. Hive
Open-sourced by Facebook, Hive defines an SQL-like query language that is translated into MapReduce jobs and executed on Hadoop.
7. Flume
A log collection tool.
8. YARN (distributed resource manager)
YARN is the next-generation MapReduce resource-management layer. It addresses the poor scalability of the original Hadoop (MRv1) and its lack of support for multiple computing frameworks.
9. Spark
Spark provides a faster and more general data-processing platform. Unlike MapReduce, which writes intermediate results to disk between stages, Spark can keep working data in memory, which greatly speeds up iterative workloads.
10. Kafka
A distributed message queue, mainly used to process active streaming data.
11. Hadoop pseudo-distributed deployment
At present there are three main free Hadoop distributions, all from foreign vendors:
1. The original Apache version
2. The CDH distribution, which the vast majority of domestic users choose
3. The HDP distribution
Here we choose the CDH distribution, hadoop-2.6.0-cdh5.8.2.tar.gz. The environment is CentOS 7.1, and JDK 1.7.0_55 or later is required.
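On the real host, unpacking the CDH tarball is a single `tar` command. So that this sketch runs anywhere, it builds a dummy tarball with the same layout and extracts it with the same flags; the directory names mirror the article's chosen release.

```shell
# On the real host you would typically run:
#   tar -zxvf hadoop-2.6.0-cdh5.8.2.tar.gz -C /usr/local/
# Runnable stand-in using a dummy tarball (no download needed):
demo=$(mktemp -d)
mkdir -p "$demo/hadoop-2.6.0-cdh5.8.2/bin"
touch "$demo/hadoop-2.6.0-cdh5.8.2/bin/hadoop"
tar -czf "$demo/hadoop.tar.gz" -C "$demo" hadoop-2.6.0-cdh5.8.2
mkdir -p "$demo/extracted"
tar -zxf "$demo/hadoop.tar.gz" -C "$demo/extracted"
ls "$demo/extracted"    # hadoop-2.6.0-cdh5.8.2
```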
[root@hadoop1 ~]# useradd hadoop
First check the default Java environment that ships with the system (for example with `java -version`).
Add the following environment variables
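A typical set of variables for a CDH tarball install looks like the following. The paths are assumptions for this example (they assume the archive was unpacked to /usr/local/hadoop-2.6.0-cdh5.8.2); adjust JAVA_HOME and HADOOP_HOME to your own layout.

```shell
# Typical Hadoop environment variables for ~/.bash_profile or /etc/profile.
# Paths below are assumptions; adjust them to your installation.
export JAVA_HOME=/usr/java/jdk1.7.0_55              # your JDK install path
export HADOOP_HOME=/usr/local/hadoop-2.6.0-cdh5.8.2 # unpacked tarball location
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop      # where *-site.xml files live
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After editing the profile, run `source ~/.bash_profile` (or log in again) so the variables take effect.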
Grant the hadoop user ownership of the Hadoop installation directory.
All Hadoop services here are started and managed by the hadoop user.
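The exact commands were not preserved in the article; a typical authorization step is shown below. So that the sketch runs without root, it uses a scratch directory under /tmp; on the real host you would apply the `chown` shown in the comment to /usr/local.

```shell
# On the real host (as root) you would typically run:
#   chown -R hadoop:hadoop /usr/local/hadoop-2.6.0-cdh5.8.2
# Runnable stand-in using a scratch directory (no root required):
prefix=$(mktemp -d)/hadoop-2.6.0-cdh5.8.2
mkdir -p "$prefix"
chmod -R 755 "$prefix"   # owner rwx; group and others rx
ls -ld "$prefix"
```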
Verify that the services have started.
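The original output listing is missing. On a working pseudo-distributed install, `jps` should show NameNode, DataNode, and SecondaryNameNode after `start-dfs.sh` (plus ResourceManager and NodeManager once YARN is started). A defensive sketch that also works on hosts where `jps` is not on the PATH:

```shell
# List running JVM processes; after start-dfs.sh you should see
# NameNode, DataNode, and SecondaryNameNode in this output.
if command -v jps >/dev/null 2>&1; then
  daemons=$(jps)
else
  # Fallback when no JDK is on PATH: count java processes directly.
  daemons=$(ps -e -o comm= | grep -c java || true)
fi
echo "$daemons"
```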