What to do when a Hive external partitioned table loads Flume output on HDFS and cannot read the .tmp files

Source: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31. Updated 2025-04-12.

This article explains what to do when a Hive external partitioned table loads the files that Flume writes to HDFS and fails because a .tmp file can no longer be read.

When Flume writes to HDFS, it rolls files by size. Until a file reaches the configured size, Flume keeps it on HDFS with a .tmp suffix, and the Hive external table picks up these in-progress files along with the finished ones. When Flume completes a file, it renames it and the .tmp name disappears, so a Hive query that has already listed the .tmp file reports that the file cannot be found. The solution is to write a PathFilter class for Hive that filters out .tmp files when Hive loads the data, so they are never read.
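The failure can be reproduced in miniature on a local filesystem. The sketch below is only an analogy, not Hive or Flume code: it uses `java.nio.file` instead of HDFS, and the class name `RollRaceDemo` and the file name `events.1.log.tmp` are hypothetical. It lists a directory, renames the in-progress file the way a roll would, and then tries to open the path from the stale listing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class RollRaceDemo {

    /** Returns true if opening a path listed before the rename fails with NoSuchFileException. */
    static boolean staleListingFails() throws IOException {
        Path dir = Files.createTempDirectory("flume-roll-demo");
        Path inProgress = dir.resolve("events.1.log.tmp"); // hypothetical in-progress file
        Path finished = dir.resolve("events.1.log");
        Files.write(inProgress, "line\n".getBytes());

        // Step 1: the reader lists the directory and remembers what it saw.
        List<Path> listed = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(listed::add);
        }

        // Step 2: the writer completes the file and renames it, dropping ".tmp".
        Files.move(inProgress, finished);

        // Step 3: the reader opens the path it listed earlier, which no longer exists.
        boolean failed;
        try {
            Files.readAllBytes(listed.get(0));
            failed = false;
        } catch (NoSuchFileException e) {
            failed = true;
        }

        // Clean up the temporary directory.
        Files.delete(finished);
        Files.delete(dir);
        return failed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("stale listing fails: " + staleListingFails());
    }
}
```

Skipping names that end in .tmp at listing time removes step 3 entirely, which is what the filter approach relies on.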

The error message is as follows:

The custom PathFilter class is as follows:

package cn.utils.hive;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @Title: FileFilterExcludeTmpFiles.java
 * @Description: When Hive loads a partitioned table it also loads .tmp files,
 *   which disappear after Flume rolls the data. If Hive cannot find such a
 *   file, it reports an error. This class filters out .tmp files so they are
 *   not loaded into Hive's partitioned table.
 * @version V0.1.0
 */
public class FileFilterExcludeTmpFiles implements PathFilter {
    private static final Logger logger = LoggerFactory.getLogger(FileFilterExcludeTmpFiles.class);

    @Override
    public boolean accept(Path path) {
        // Test the last path component only, not the full path.
        String name = path.getName();
        // Skip hidden/system files ("_" or "." prefix) and in-progress .tmp files.
        return !name.startsWith("_") && !name.startsWith(".") && !name.endsWith(".tmp");
    }
}
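To see which files the filter's name test lets through without needing Hadoop on the classpath, the same predicate can be applied to plain strings. This is a minimal sketch: the class `TmpNameCheck`, its methods, and the sample file names are hypothetical and only mirror the test in `accept`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TmpNameCheck {

    /** Same name test as the PathFilter above, applied to a bare file name. */
    static boolean accept(String name) {
        return !name.startsWith("_") && !name.startsWith(".") && !name.endsWith(".tmp");
    }

    /** Keeps only the names that the filter would let Hive read. */
    static List<String> visible(List<String> names) {
        return names.stream().filter(TmpNameCheck::accept).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical directory listing: one finished file, one in-progress
        // file, and two hidden/system files.
        List<String> names = Arrays.asList(
                "FlumeData.1402000000000",
                "FlumeData.1402000000001.tmp",
                "_SUCCESS",
                ".hidden");
        for (String name : names) {
            System.out.println(name + " -> " + accept(name));
        }
        System.out.println("visible to Hive: " + visible(names));
    }
}
```

Only the finished file passes; the in-progress .tmp file and the two hidden names are rejected.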

After writing the class, package it as a jar, upload the jar to the server, and then modify the hive-site.xml file as follows:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/lib/mylib/FilterTmpPath.jar</value>
  <description>The location of the plugin jars that contain implementations of user defined functions and serdes.</description>
</property>
<property>
  <name>mapred.input.pathFilter.class</name>
  <value>cn.utils.hive.FileFilterExcludeTmpFiles</value>
</property>
