2025-04-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Big Data Solutions and Business Application Case Analysis Based on the Greenplum and Hadoop Distributed Platforms
Baidu Netdisk download link: http://pan.baidu.com/s/1eQJFXZ0 password: kdx9
Baidu Netdisk download link: http://pan.baidu.com/s/1hq4cO2w password: cnsq
Course introduction:
This course is divided into two parts:
The first part gives a comprehensive, in-depth introduction to the Greenplum database, covering its architecture, deployment, management, development, and tuning. It combines theory with hands-on practice so that students gain a thorough command of this powerful big data tool.
The second part explains in depth the architecture principles of Hadoop and its overall technical stack, including practical applications of HBase, Hive, Pig, ZooKeeper, Chukwa, and other components. It also introduces the basics of cloud computing and the application of Hadoop in that field, and analyzes how major Internet companies use Hadoop in their business environments.
[part one] Fundamentals of the Greenplum distributed database (41 class hours)
1 Greenplum architecture
What is Greenplum?
Greenplum architecture
Greenplum high-availability architecture
2 install Greenplum
Configure the environment
Install and initialize the GPDB system
Start and stop database
Configure the GP system
3 distributed database storage
How data is stored
Distribution strategy
4 GPDB query processing
Execution of query command
SQL query processing mechanism
Parallel query plan
5 role permissions and client authentication management
Client authentication
Manage users and groups
6 client interfaces and programs
PgAdmin III
PSQL
7 define database objects
Create and manage databases
Create and manage tablespaces
Create and manage schemas
Create and manage tables
Partitioned tables
Data distribution and partitioning
Compressed storage and row- and column-oriented storage
Sequences, indexes, and views
8 manage data
Insert, update, delete records
Transaction management
Space reclamation and statistics collection
9 query data
Define query
Use functions and operators
Query and analysis
10 workload and resource management
Overview of GP workload Management
Configure workload management
Create a resource queue
Allocate resource queue
Check resource queue status
11 load and unload data
Overview of GP load commands
Load data to GP
Unload data from GP
Format data file
12 backup recovery
Serial backup and recovery
Parallel backup and recovery
13 performance tuning
How to tune
Common performance problems
14 GP system configuration parameters
Master parameters and local parameters in GP
Set configuration parameters
Configuration parameter category
15 enable high availability
Overview of GP High availability
Enable mirroring in GP
Detect a failed Segment
Restore failed Segment
Restore failed Master
16 GP MapReduce
MapReduce Foundation
GP MapReduce programming
MapReduce job execution and troubleshooting
[part two] Hadoop distributed platform (55 class hours)
1 the origin and system of Hadoop
The origin of Hadoop's ideas: Google
Hadoop subproject family
Architecture of Hadoop
2 installation and configuration of Hadoop
Prepare and configure the environment
Three operation modes
Fully distributed mode installation
3 HDFS: big data storage
The concept and Architecture of HDFS
Reliability of HDFS
HDFS file operation
HDFS API
4 about MapReduce
MapReduce programming model
Cluster behavior of MapReduce
Optimization of MapReduce tasks
MapReduce working mechanism
Error handling and job scheduling mechanism
5 MapReduce application development
Hadoop Eclipse plug-in development
Development of a data filtering program
Development of inverted index program
6 Hadoop monitoring and management
Page monitoring
Hadoop backup
7 HBase database
HBase architecture
HBase shell
An example of HBase API application
HBase application scenarios
HBase schema design
8 Hive data Warehouse
Hive components and Architecture
Hive installation configuration
Service interface of Hive
Common operations of HiveQL
Optimization of Hive
Hive UDF programming
Hive comprehensive hands-on practice
9 Pig data analysis platform
Pig framework
Pig installation configuration
The use of Pig
Data Model of Pig
Common Pig Latin operations
Pig UDF programming
Pig data Analysis practice
10 ZooKeeper distributed service framework
How ZooKeeper works
ZooKeeper design goal
Data structure and composition of ZooKeeper
Installation and configuration of ZooKeeper
ZooKeeper command line tool
ZooKeeper API
ZooKeeper practice: Hadoop task scheduling
11 Chukwa Cluster Monitoring system
The composition of Chukwa
Chukwa architecture and design
Chukwa installation and configuration
Common Chukwa commands
Implement custom data processing
12 Hadoop business application cases
Cloud computing concepts and characteristics
Cloud computing service model and form
Application of Hadoop in Cloud Computing
JD.com Mall
Baidu
Alibaba
Tencent
13 Greenplum Hadoop cluster
Characteristics of the integrated architecture
Advantages of integrated architecture
Configure the gphdfs protocol usage environment
Use HDFS external tables