Big data study road map

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

I recently started studying big data, and before beginning I laid out a learning route for myself.

Guide to the Big Data Technical Learning Route

I. Getting started with Hadoop: understanding what Hadoop is

1. Background of Hadoop's creation

2. Hadoop's position in, and relationship to, big data and cloud computing

3. Introduction to Hadoop application cases at home and abroad

4. Analysis of the domestic Hadoop job market and introduction to the course syllabus

5. Overview of distributed systems

6. Brief introduction to the Hadoop ecosystem and its components

7. Hadoop core MapReduce example

II. The distributed file system HDFS, a basic course for database administrators

1. Brief introduction of distributed file system HDFS

2. Introduction to the system composition of HDFS

3. Detailed explanation of the components of HDFS

4. Replica placement policy and routing rules

5. NameNode Federation

6. Command line interface

7. Java interface

8. The data flow between the client and HDFS

9. High availability (HA) of HDFS
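
As a rough illustration of item 4 above, the sketch below mimics the commonly described default HDFS placement for three replicas: the first on the writer's node, the second on a node in a different rack, and the third on another node in that same remote rack. The function name place_replicas, the topology dictionary, and the node names are all hypothetical, and the real policy picks nodes randomly and considers load, while this toy version simply takes the first candidates in sorted order. It also assumes the writer is a DataNode and that some other rack exists with at least two nodes.

```python
def place_replicas(writer, topology):
    """Choose three nodes for one block.

    topology maps a rack name to the list of node names on that rack.
    This is a toy model, not the HDFS implementation.
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    local_rack = rack_of[writer]

    # First replica: the node the client is writing from.
    first = writer

    # Second replica: a node on a different rack, so a rack failure cannot lose the block.
    remote_rack = next(r for r in sorted(topology) if r != local_rack and topology[r])
    second = sorted(topology[remote_rack])[0]

    # Third replica: a different node on the same rack as the second replica.
    third = next(n for n in sorted(topology[remote_rack]) if n != second)

    return [first, second, third]


if __name__ == "__main__":
    racks = {"rack1": ["n1", "n2"], "rack2": ["n3", "n4"]}
    print(place_replicas("n1", racks))
```

Keeping two of the three replicas on one remote rack limits cross-rack write traffic while still surviving the loss of either rack.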

III. Beginner MapReduce, a basic course for Hadoop developers

1. How to understand the computing model of map and reduce

2. Analyze the execution process of a MapReduce job in a pseudo-distributed environment

3. YARN model

4. Serialization

5. MapReduce types and formats

6. Build the MapReduce development environment

7. MapReduce application development

8. More examples, to become familiar with the principles of MapReduce algorithms
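
The map and reduce computing model from item 1 can be tried out without a cluster. The following is a minimal single-process sketch of the classic word count, not Hadoop code: the function names map_phase, shuffle, reduce_phase, and word_count are made up for illustration, and the shuffle step stands in for the grouping that the framework performs between the two phases.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    # Keys are visited in sorted order, mirroring the sorted input a reducer receives.
    return dict(reduce_phase(key, grouped[key]) for key in sorted(grouped))


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

In a real job the map calls run in parallel on separate input splits and the reduce calls run on separate nodes; only the per-key contract shown here stays the same.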

IV. Advanced MapReduce, a key course for senior Hadoop developers

1. Use compression and splitting to reduce input size

2. Using Combiner to reduce intermediate data

3. Write Partitioner to optimize load balancing.

4. How to customize the sort order

5. How to customize grouping rules

6. MapReduce optimization

7. Programming practice
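
Items 2 and 3 above can be illustrated together. The sketch below is a single-process toy, not the Hadoop API: map_with_combiner pre-aggregates each line's word counts before anything is sent on (the combiner idea), and partition routes each key to a reducer by hash modulo the reducer count. The names are hypothetical, and zlib.crc32 is only a stand-in for a deterministic hash, chosen because Python's built-in hash() for strings varies between runs.

```python
import zlib
from collections import Counter, defaultdict


def map_with_combiner(line):
    """Map one input line and combine its output locally into per-word partial counts."""
    return Counter(line.lower().split())


def partition(key, num_reducers):
    """Pick the reducer for a key; the same key always lands on the same reducer."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_job(lines, num_reducers=2):
    """Return the final counts and the number of records sent from mappers to reducers."""
    reducers = [defaultdict(int) for _ in range(num_reducers)]
    records_sent = 0
    for line in lines:
        for word, partial in map_with_combiner(line).items():
            reducers[partition(word, num_reducers)][word] += partial
            records_sent += 1
    totals = {}
    for reducer in reducers:
        totals.update(reducer)
    return totals, records_sent


if __name__ == "__main__":
    print(run_job(["a a a b", "a b b"]))
```

For the two sample lines, seven words are mapped but only four records cross to the reducers, which is the saving a combiner is meant to provide. A custom partitioner becomes worthwhile when a plain hash sends a disproportionate share of keys to one reducer.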

V. Hadoop cluster and management, an advanced course for database administrators

1. Building a Hadoop cluster

2. Monitoring a Hadoop cluster

3. Managing a Hadoop cluster

4. Running MapReduce programs on the cluster

VI. ZooKeeper basics, a foundational framework for building distributed systems

1. ZooKeeper architecture

2. Installation of ZooKeeper cluster

3. Operate ZooKeeper

VII. HBase basics, a column-oriented real-time distributed database

1. HBase definition

2. Comparison between HBase and RDBMS.

3. Data model

4. System architecture

5. MapReduce on HBase

6. Table design
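
The data model in item 3 is often summarized as a sorted, multi-versioned map: row key, then column (family:qualifier), then timestamp, then value. The sketch below models that shape in memory so it can be explored without a cluster. ToyTable and its methods are hypothetical and are not the HBase client API; the sketch assumes column families are fixed when the table is created and that a read returns the newest version of a cell.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one table: row -> 'family:qualifier' -> {timestamp: value}."""

    def __init__(self, families):
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        """Store one cell version; the column family must already exist."""
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][column][ts] = value

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if the cell is absent."""
        versions = self.rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self):
        """Return row keys in sorted order, since rows are kept ordered by key."""
        return sorted(self.rows)


if __name__ == "__main__":
    table = ToyTable(["info"])
    table.put("user2", "info:name", "old name", ts=1)
    table.put("user2", "info:name", "new name", ts=2)
    table.put("user1", "info:age", "30", ts=1)
    print(table.get("user2", "info:name"), table.scan())
```

Because rows are ordered by key, the choice of row key largely determines which scans are cheap, which is why table design follows directly from the data model.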

VIII. HBase Cluster and its Management

1. Explain the process of building a cluster.

2. Cluster monitoring

3. Cluster management

IX. HBase client

1. HBase Shell and demo

2. Java client and code demo

X. Pig basics, another framework for Hadoop computing

1. Overview of Pig

2. Install Pig

3. Use Pig to complete mobile phone traffic statistics.

XI. Hive, a Hadoop framework for computing using SQL

1. Basic knowledge of data warehouse

2. Hive definition

3. Brief introduction of Hive architecture

4. Hive cluster

5. Brief introduction of client

6. HiveQL definition

7. Comparison between HiveQL and SQL

8. Data types

9. The concepts of tables and table partitions

10. Table operations and CLI client demonstration

11. Data import and CLI client demonstration

12. Querying data and CLI client demonstration

13. Joins and CLI client demonstration

14. Development and demonstration of user-defined functions (UDFs)

XII. Sqoop, a framework for data transfer between Hadoop and RDBMS

1. Configure Sqoop

2. Import data from MySQL to HDFS using Sqoop

3. Use Sqoop to export data from HDFS to MySQL

XIII. Storm

1. Storm basics: basic concepts, application scenarios, architecture and fundamentals, and a comparison between Storm and Hadoop

2. Storm cluster setup: a detailed description of installing a Storm cluster and common problems during installation

3. Introduction of Storm components: spout, bolt, stream groupings, etc.

4. Reliability of Storm messages: retransmission of failed messages

5. The integration of Hadoop 2.0 and Storm: Storm on YARN

6. Storm programming practice
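
The retransmission idea in item 4 can be sketched in a few lines. In Storm the spout is told whether each tuple it emitted was fully processed (ack) or not (fail), and it is the spout's job to emit a failed message again. The toy below imitates that contract in a single process. ReplayingSpout and run are hypothetical names and do not use the Storm API, and the sketch assumes a failed message is simply put back at the end of the queue.

```python
from collections import deque


class ReplayingSpout:
    """Toy source that remembers emitted messages until they are acknowledged."""

    def __init__(self, messages):
        self.queue = deque(enumerate(messages))  # (message id, payload)
        self.pending = {}  # message id -> payload, emitted but not yet acked

    def next_tuple(self):
        """Emit the next message, or None when nothing is left to send."""
        if not self.queue:
            return None
        msg_id, payload = self.queue.popleft()
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        """Processing succeeded: forget the message."""
        self.pending.pop(msg_id, None)

    def fail(self, msg_id):
        """Processing failed: queue the message for another attempt."""
        self.queue.append((msg_id, self.pending.pop(msg_id)))


def run(spout, process):
    """Drive the spout; process(msg_id, payload) returns True on success."""
    delivered = []
    while True:
        item = spout.next_tuple()
        if item is None:
            return delivered
        msg_id, payload = item
        if process(msg_id, payload):
            spout.ack(msg_id)
            delivered.append(payload)
        else:
            spout.fail(msg_id)
```

A message that fails once is delivered later than its neighbors, so this style of replay gives at-least-once processing without preserving order.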
