0003-how to use LZO compression in CDH

2025-02-24 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/03 Report--

1. Problem description

CDH does not support LZO compression by default. An additional Parcel must be downloaded and activated before Hadoop components such as HDFS, Hive, and Spark can read and write LZO-encoded data.

For details, please refer to:

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_gpl_extras.html

https://www.cloudera.com/documentation/enterprise/latest/topics/cm_ig_install_gpl_extras.html#xd_583c10bfdbd326ba-3ca24a24-13d80143249--7ec6

First, I try to generate and read an LZO file without any additional configuration. We create two tables in Hive, test_table and test_table2: test_table is a plain text file table, and test_table2 is the LZO-compressed table. As follows:

create external table test_table (s1 string, s2 string) row format delimited fields terminated by '#' location '/lilei/test_table';
insert into test_table values ('1', ...), ('2', ...);
create external table test_table2 (s1 string, s2 string) row format delimited fields terminated by '#' location '/lilei/test_table2';

Access Hive through beeline and execute the above commands:

Query the data in test_table:

Insert the data from test_table into test_table2, with the output files set to LZO compression:

set mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzoCodec;
set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress=true;
set mapreduce.output.fileoutputformat.compress.type=BLOCK;
insert overwrite table test_table2 select * from test_table;
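
The session settings above can also be assembled programmatically, which helps when the same insert has to be repeated for several tables. The following is a minimal Python sketch and not part of the original walkthrough. The helper names lzo_output_settings and build_insert_script are hypothetical; the property names and the codec class are the ones used in the statement above.

```python
DEFAULT_CODEC = "com.hadoop.compression.lzo.LzoCodec"


def lzo_output_settings(codec=DEFAULT_CODEC):
    """Return the Hive 'set' statements that enable compressed output."""
    settings = {
        "mapreduce.output.fileoutputformat.compress.codec": codec,
        "hive.exec.compress.output": "true",
        "mapreduce.output.fileoutputformat.compress": "true",
        "mapreduce.output.fileoutputformat.compress.type": "BLOCK",
    }
    # Dict insertion order is preserved, so the statements keep this order.
    return [f"set {key}={value};" for key, value in settings.items()]


def build_insert_script(target, source, codec=DEFAULT_CODEC):
    """Return the settings followed by an insert-overwrite statement."""
    statements = lzo_output_settings(codec)
    statements.append(f"insert overwrite table {target} select * from {source};")
    return "\n".join(statements)


if __name__ == "__main__":
    # Prints a script that can be pasted into a beeline session.
    print(build_insert_script("test_table2", "test_table"))
```

The generated text only reproduces the statements; it still has to be run in beeline, and it fails in the same way as the manual version while the codec class is unavailable on the cluster.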

Hive reports the following error:

Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

The YARN web UI on port 8088 shows that the job failed because the LZO compression codec could not be found:

Compression codec com.hadoop.compression.lzo.LzoCodec was not found.

2. Solution

Configure the LZO Parcel repository URL on the Parcels page of Cloudera Manager:

Note: if the cluster cannot access the public network, download the Parcel in advance and publish it through an httpd server.

Download -> Distribute -> Activate

Add the LZO codecs to the HDFS compression codec configuration:

com.hadoop.compression.lzo.LzoCodec
com.hadoop.compression.lzo.LzopCodec
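
As an illustration of what this change amounts to, the sketch below appends the two LZO classes to an existing codec list without creating duplicates. It assumes the Cloudera Manager field corresponds to Hadoop's io.compression.codecs property, which holds a comma-separated list of class names. The function name add_lzo_codecs and the sample starting value are hypothetical and do not come from the original article.

```python
LZO_CODECS = [
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
]


def add_lzo_codecs(current):
    """Return the comma-separated codec list with the LZO classes appended."""
    # Split the existing value and drop empty entries and stray whitespace.
    codecs = [c.strip() for c in current.split(",") if c.strip()]
    for codec in LZO_CODECS:
        if codec not in codecs:
            codecs.append(codec)
    return ",".join(codecs)


if __name__ == "__main__":
    # Assumed example of a value that might already be configured.
    existing = (
        "org.apache.hadoop.io.compress.DefaultCodec,"
        "org.apache.hadoop.io.compress.GzipCodec"
    )
    print(add_lzo_codecs(existing))
```

Running the function on its own output returns the same string, so it is safe to apply more than once.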

Save changes, deploy client configuration, and restart the entire cluster.

Wait for the restart to succeed:

Insert the data into test_table2 again, with the output set to LZO compression:

set mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzoCodec;
set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress=true;
set mapreduce.output.fileoutputformat.compress.type=BLOCK;
insert overwrite table test_table2 select * from test_table;

The insert succeeds:

2.1 Hive verification

First make sure that the file under test_table2 is in LZO format:
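
One quick check is the file extension. The sketch below is a small Python helper for classifying file names taken from an HDFS listing. It assumes the hadoop-lzo defaults, where LzoCodec writes files ending in .lzo_deflate and LzopCodec writes files ending in .lzo. The name codec_for_path is hypothetical, and the check looks only at the name, not at the file contents.

```python
from typing import Optional

# Default extensions assumed for the two hadoop-lzo codec classes.
EXTENSION_TO_CODEC = {
    ".lzo_deflate": "com.hadoop.compression.lzo.LzoCodec",
    ".lzo": "com.hadoop.compression.lzo.LzopCodec",
}


def codec_for_path(path: str) -> Optional[str]:
    """Return the codec class implied by the file extension, or None."""
    # Only the last path component matters for the extension check.
    name = path.rsplit("/", 1)[-1]
    for extension, codec in EXTENSION_TO_CODEC.items():
        if name.endswith(extension):
            return codec
    return None


if __name__ == "__main__":
    print(codec_for_path("/lilei/test_table2/000000_0.lzo_deflate"))
```

A name without either extension returns None, which for this table would suggest that the compression settings were not in effect when the file was written.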

Test in Hive's beeline:

Hive reads the LZO-compressed file correctly.

2.2 Spark SQL verification

var textFile = sc.textFile("hdfs://ip-172-31-8-141:8020/lilei/test_table2/000000_0.lzo_deflate")
textFile.count()
sqlContext.sql("select * from test_table2")

Spark SQL reads the LZO-compressed file correctly.
