2025-02-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article walks through the file storage formats available in Hive and compares their storage cost, query efficiency, and loading speed.
Test comparison of Hive file storage formats
1. textfile
Textfile is the default format.
Storage method: row storage.
Disk overhead is high, and the data is expensive to parse.
Hive cannot merge or split compressed text files.
2. sequencefile
A binary file format; records are serialized into the file as key-value pairs.
Storage method: row storage.
Splittable and compressible.
Block compression is the usual choice.
The advantage is that the files are compatible with MapFile in the Hadoop API.
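To make "binary, serialized as key-value pairs" concrete, here is a toy sketch in Python. It is not the actual sequencefile on-disk layout (which also carries a header, sync markers, and optional compression); the length-prefixed record layout and the function names are assumptions made purely for illustration.

```python
import io
import struct


def write_records(stream, records):
    """Append each (key, value) pair as: key length, value length, key bytes, value bytes."""
    for key, value in records:
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        # Two big-endian unsigned 32-bit lengths, followed by the raw bytes.
        stream.write(struct.pack(">II", len(k), len(v)))
        stream.write(k)
        stream.write(v)


def read_records(stream):
    """Yield (key, value) pairs until the stream is exhausted."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        key_len, val_len = struct.unpack(">II", header)
        key = stream.read(key_len).decode("utf-8")
        value = stream.read(val_len).decode("utf-8")
        yield key, value


# Round-trip two illustrative records through an in-memory buffer.
buf = io.BytesIO()
write_records(buf, [("1", "alice\t30"), ("2", "bob\t25")])
buf.seek(0)
pairs = list(read_records(buf))
```

Because every record carries its own lengths, a reader can step from record to record without parsing delimiters, which is one reason a binary row format can be cheaper to parse than text.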
3. rcfile
Storage method: data is first partitioned by rows into blocks (row groups), and within each block it is stored by column.
Good compression and fast column access.
Reading a record touches as few blocks as possible.
To read only the required columns, only the header definition of each row group needs to be read.
When reading the full data set, performance may show no obvious advantage over sequencefile.
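The "rows into blocks, then columns inside each block" layout can be sketched with plain Python lists. This is only a conceptual model, not the rcfile binary format; the function names and the sample rows are hypothetical.

```python
def to_row_groups(rows, group_size):
    """Split rows into groups; inside each group, store values column by column."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # zip(*chunk) transposes the rows of this group into columns.
        columns = [list(col) for col in zip(*chunk)]
        groups.append(columns)
    return groups


def read_columns(groups, wanted):
    """Rebuild rows that contain only the wanted column indexes."""
    out = []
    for columns in groups:
        # Only the selected columns of each group are touched.
        selected = [columns[i] for i in wanted]
        out.extend(zip(*selected))
    return out


rows = [(1, "alice", 30), (2, "bob", 25), (3, "carol", 41)]
groups = to_row_groups(rows, group_size=2)
projected = read_columns(groups, wanted=[0, 2])
```

All values of one record stay inside a single row group, so a record lookup touches one block, while a query that needs two of the three columns skips the third column's data in every group.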
4. orc
Storage method: data is first partitioned by rows into blocks, and within each block it is stored by column.
Good compression and fast column access.
It is an improved version of rcfile and is more efficient.
5. Custom format
Users can define their own input and output formats by implementing InputFormat and OutputFormat.
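Once such classes exist, a table can point at them with the STORED AS INPUTFORMAT ... OUTPUTFORMAT ... clause. The small Python helper below only assembles that statement as a string; the helper itself, the table name, and the com.example class names are hypothetical placeholders, not real classes.

```python
def custom_format_ddl(table, columns, input_format, output_format):
    """Build a CREATE TABLE statement that names explicit format classes."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE {table} ({cols})\n"
        f"STORED AS INPUTFORMAT '{input_format}'\n"
        f"OUTPUTFORMAT '{output_format}'"
    )


# Hypothetical class names; replace them with your own implementations.
ddl = custom_format_ddl(
    "logs_custom",
    [("id", "INT"), ("msg", "STRING")],
    "com.example.MyInputFormat",
    "com.example.MyOutputFormat",
)
print(ddl)
```

The jar containing the classes has to be available to Hive before the table is queried.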
Summary:
Textfile consumes a lot of storage space, and compressed text files cannot be split or merged, so query efficiency is the lowest. Data can be stored directly without conversion, so loading speed is the highest.
Sequencefile consumes the most storage space. Compressed files can be split and merged, and query efficiency is high. Data must be loaded by converting from a text file.
Rcfile uses the least storage space and has the highest query efficiency. Data must be loaded by converting from a text file, and loading speed is the lowest.
Personal advice: avoid textfile and sequencefile whenever you can. ORC is the best choice.
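The "load through text file conversion" step mentioned in the summary usually means loading raw data into a textfile staging table and then rewriting it into the target format. Here is a minimal sketch that generates the HiveQL statements for that workflow; the helper, table names, columns, and the /tmp/events.txt path are assumptions chosen for illustration.

```python
def conversion_statements(columns, source_path, target_format="ORC",
                          staging="events_text", target="events"):
    """Return the HiveQL statements that load text data and convert it."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return [
        # 1. Staging table in the default text format.
        f"CREATE TABLE {staging} ({cols}) STORED AS TEXTFILE",
        # 2. Raw files are moved in as-is, which is why text loads fastest.
        f"LOAD DATA LOCAL INPATH '{source_path}' INTO TABLE {staging}",
        # 3. Target table in the chosen format (ORC, RCFILE, or SEQUENCEFILE).
        f"CREATE TABLE {target} ({cols}) STORED AS {target_format}",
        # 4. The rewrite that actually converts the data.
        f"INSERT OVERWRITE TABLE {target} SELECT * FROM {staging}",
    ]


stmts = conversion_statements(
    [("id", "INT"), ("name", "STRING")],
    "/tmp/events.txt",  # hypothetical local path
)
for stmt in stmts:
    print(stmt + ";")
```

The extra INSERT ... SELECT pass is the conversion cost that makes the non-text formats slower to load than textfile.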
© 2024 shulou.com SLNews company. All rights reserved.