0010 - Example of Hive Multi-Delimiter Support

2025-03-30 Update From: SLTechnology News&Howtos

1. Problem description

How do you load a data file whose fields are separated by a multi-character delimiter into a Hive table? The sample data is as follows.

The field delimiter is "@#$":

test1@#$test1name@#$test2value
test2@#$test2name@#$test2value
test3@#$test3name@#$test4value

The goal is to load the above data into a Hive table (multi_delimiter_test) with the following structure:

Field name | Field type
s1         | String
s2         | String
s3         | String

2. Hive multi-delimiter support

Hive 0.14 and later support multi-character field delimiters via MultiDelimitSerDe; see https://cwiki.apache.org/confluence/display/Hive/MultiDelimitSerDe
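Conceptually, MultiDelimitSerDe splits each row on the full delimiter string. Hive's default text SerDe treats field.delim as a single character, which is why a multi-character delimiter like "@#$" needs this SerDe. A short illustrative Python sketch (not Hive's actual implementation) shows the difference:

```python
row = "test1@#$test1name@#$test2value"

# Splitting on the full multi-character delimiter yields the three
# intended fields -- this is what MultiDelimitSerDe does conceptually.
multi = row.split("@#$")
print(multi)   # ['test1', 'test1name', 'test2value']

# Splitting on only one character of the delimiter does not: the
# remaining delimiter characters leak into the field values.
single = row.split("@")
print(single)  # ['test1', '#$test1name', '#$test2value']
```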

3. Implementation

Test environment description

The test environment is CDH 5.11.1 with Hive 1.1.0, running on RedHat 6.5.

Operation steps

1. Prepare the multi-delimiter data file and load it into the corresponding HDFS directory

[ec2-user@ip-172-31-8-141]$ cat multi_delimiter_test.dat
test1@#$test1name@#$test2value
test2@#$test2name@#$test2value
test3@#$test3name@#$test4value
[ec2-user@ip-172-31-8-141]$ hadoop dfs -put multi_delimiter_test.dat /fayson/multi_delimiter_test
[ec2-user@ip-172-31-8-141]$ hadoop dfs -ls /fayson/multi_delimiter_test
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Found 1 items
-rw-r--r--   3 user_r supergroup         93 2017-08-23 03:24 /fayson/multi_delimiter_test/multi_delimiter_test.dat
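The test file above can also be generated programmatically. A minimal sketch (the file name and delimiter come from this article; nothing Hive-specific happens here):

```python
# Rows matching the sample data from the problem description.
rows = [
    ("test1", "test1name", "test2value"),
    ("test2", "test2name", "test2value"),
    ("test3", "test3name", "test4value"),
]

# Join each row's fields with the multi-character delimiter "@#$",
# one record per line, exactly as MultiDelimitSerDe expects to read it.
with open("multi_delimiter_test.dat", "w") as f:
    for r in rows:
        f.write("@#$".join(r) + "\n")
```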

2. Build a table based on the prepared multi-delimiter file

create external table multi_delimiter_test (
  s1 string,
  s2 string,
  s3 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="@#$")
STORED AS TEXTFILE
LOCATION '/fayson/multi_delimiter_test';

3. Test

2: jdbc:hive2://localhost:10000/default> select * from multi_delimiter_test;
+--------------------------+--------------------------+--------------------------+
| multi_delimiter_test.s1  | multi_delimiter_test.s2  | multi_delimiter_test.s3  |
+--------------------------+--------------------------+--------------------------+
| test1                    | test1name                | test2value               |
| test2                    | test2name                | test2value               |
| test3                    | test3name                | test4value               |
+--------------------------+--------------------------+--------------------------+

2: jdbc:hive2://localhost:10000/default> select count(*) from multi_delimiter_test;
INFO: Ended Job = job_1503469952834_0006
INFO: MapReduce Jobs Launched:
INFO: Stage-Stage-1: Map: 1  Reduce: 1  Cumulative CPU: 3.25 sec  HDFS Read: 6755  HDFS Write: 2  SUCCESS
INFO: Total MapReduce CPU Time Spent: 3 seconds 250 msec
INFO: Completed executing command (queryId=hive_20170823041818_ce58aae2-e6db-4eed-b6af-652235a6e66a); Time taken: 33.286 seconds
INFO: OK
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (33.679 seconds)

4. Common problems

1. The count query fails with an error

Exception log

Executing the count query via beeline reports the following error:

2: jdbc:hive2://localhost:10000/default> select count(*) from multi_delimiter_test;
INFO: Compiling command (queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97): select count(*) from multi_delimiter_test
INFO: Semantic Analysis Completed
INFO: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO: Completed compiling command (queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97); Time taken: 0.291 seconds
INFO: Executing command (queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97): select count(*) from multi_delimiter_test
INFO: Query ID = hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97
INFO: Total jobs = 1
INFO: Launching Job 1 out of 1
INFO: Starting task [Stage-1:MAPRED] in serial mode
INFO: Number of reduce tasks determined at compile time: 1
INFO: In order to change the average load for a reducer (in bytes):
INFO:   set hive.exec.reducers.bytes.per.reducer=
INFO: In order to limit the maximum number of reducers:
INFO:   set hive.exec.reducers.max=
INFO: In order to set a constant number of reducers:
INFO:   set mapreduce.job.reduces=
INFO: number of splits:1
INFO: Submitting tokens for job: job_1503469952834_0002
INFO: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (token for hive: HDFS_DELEGATION_TOKEN owner=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM, renewer=yarn, realUser=, issueDate=1503475160778, maxDate=1504079960778, sequenceNumber=27, masterKeyId=9)
INFO: The url to track the job: http://ip-172-31-9-186.ap-southeast-1.compute.internal:8088/proxy/application_1503469952834_0002/
INFO: Starting Job = job_1503469952834_0002, Tracking URL = http://ip-172-31-9-186.ap-southeast-1.compute.internal:8088/proxy/application_1503469952834_0002/
INFO: Kill Command = /opt/cloudera/parcels/CDH-5.10.2-1.cdh6.10.2.p0.5/lib/hadoop/bin/hadoop job -kill job_1503469952834_0002
INFO: Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO: 2017-08-23 03 Stage-1 map = 0%, reduce = 0%
INFO: 2017-08-23 04 Stage-1 map = 100%, reduce = 100%
ERROR: Ended Job = job_1503469952834_0002 with errors
ERROR: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
INFO: MapReduce Jobs Launched:
INFO: Stage-Stage-1: Map: 1  Reduce: 1  HDFS Read: 0  HDFS Write: 0  FAIL
INFO: Total MapReduce CPU Time Spent: 0 msec
INFO: Completed executing command (queryId=hive_20170823035959_f1b11a9b-757d-4d9b-b8a7-6d4ab1c00a97); Time taken: 48.737 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

Running the same query in the Hive shell produces the following error:

Error: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:323)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2105)
    at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:140)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:297)
    ... 24 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1  HDFS Read: 0  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 0 msec

Cause analysis

The org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe class is in the hive-contrib.jar package.

Non-aggregate queries execute normally, but queries using an aggregate function fail, which indicates the jar dependency is missing when the MapReduce task runs. The MapReduce job runs under YARN, so it is the YARN runtime environment that lacks the hive-contrib.jar dependency.

Solution

On every node of the CDH cluster, copy the hive-contrib-1.1.0-cdh6.10.2.jar package into YARN's lib directory:

sudo scp -r /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib-1.1.0-cdh6.10.2.jar /opt/cloudera/parcels/CDH/lib/hadoop-yarn/lib/

Rerun the count statement; it now executes successfully.
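To verify the copy landed on a node before rerunning the query, a small check like the following can help. This is a hypothetical helper, not part of Hive or CDH; the directory path is an assumption based on the parcel layout described above.

```python
from pathlib import Path

def find_jars(lib_dir, name_fragment):
    """Return the names of jar files under lib_dir whose file name
    contains name_fragment (e.g. 'hive-contrib')."""
    return sorted(p.name for p in Path(lib_dir).glob("*.jar")
                  if name_fragment in p.name)
```

Run on each node with, for example, `find_jars("/opt/cloudera/parcels/CDH/lib/hadoop-yarn/lib", "hive-contrib")`; an empty result means the jar is still missing on that node.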
