2025-04-03 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article presents an example analysis of Map JOIN in Hive. Many readers may not be familiar with the topic, so it is shared here for reference. Let's go through it together.
Map-side JOIN
A map-side join is suitable when one of the tables is small enough to fit in memory: the small table is loaded into memory and the join is performed in the map phase. Hive has supported automatic conversion to a map-side join since version 0.7. The configuration is as follows:
SET hive.auto.convert.join=true; -- default is true since Hive 0.11.0
SET hive.mapjoin.smalltable.filesize=600000000; -- in bytes; the default is 25000000 (about 25 MB)
SET hive.auto.convert.join.noconditionaltask=true; -- default is true, so you do not need to specify a MAPJOIN hint
SET hive.auto.convert.join.noconditionaltask.size=10000000; -- size limit in bytes for the tables loaded into memory
Once automatic map-side join conversion is enabled, Hive checks the size of the small table against hive.mapjoin.smalltable.filesize. If the table is larger than that value, Hive runs a normal join; if it is smaller, Hive converts the join to a map-side join.
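The size check described above can be sketched in a few lines of Python. This is only a simplified model of the rule as stated in this article, not Hive's actual implementation; the function name choose_join_strategy and its parameters are hypothetical, and treating a table exactly at the threshold as a normal join is an assumption of the sketch.

```python
# Simplified model of the size check described in the article.
# Not Hive's real code: names and the boundary behaviour are assumptions.

DEFAULT_SMALLTABLE_FILESIZE = 25_000_000  # bytes, default of hive.mapjoin.smalltable.filesize


def choose_join_strategy(small_table_bytes,
                         threshold_bytes=DEFAULT_SMALLTABLE_FILESIZE,
                         auto_convert=True):
    """Return which join the sketch would pick for a given small-table size."""
    # Conversion only happens when hive.auto.convert.join is enabled
    # and the small table is below the configured threshold.
    if auto_convert and small_table_bytes < threshold_bytes:
        return "map-side join"
    return "normal join"


# A 10 MB table against the default threshold.
small_case = choose_join_strategy(10_000_000)
# A 700 MB table against the 600000000-byte threshold set in the example above.
large_case = choose_join_strategy(700_000_000, threshold_bytes=600_000_000)
```

Here small_case is "map-side join" and large_case is "normal join", matching the rule in the paragraph above.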
The following figure shows how a map-side join works:
First, Task A (a task that the client runs locally) reads the small table a, converts it into a hash table, writes the hash table to a local file, and then loads that file into the distributed cache.
Then Task B launches the map tasks that read the large table b. In the map phase, each record of b is joined against the hash table of table a held in the distributed cache, and the result is output.
Note: a map-side join has no reduce tasks, so the map tasks output the results directly; the job produces as many result files as there are map tasks.
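To make the two steps concrete, here is a small Python simulation of the idea, assuming an inner join on a single key. It is a sketch only and does not reflect Hive's internals; the function names build_hash_table and map_task, the sample rows, and the column names are all hypothetical.

```python
from collections import defaultdict


def build_hash_table(small_rows, key):
    """Task A: index the rows of the small table by the join key."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key]].append(row)
    return table


def map_task(big_split, hash_table, key):
    """Task B: one map task probes the hash table for each record of its split."""
    output = []
    for row in big_split:
        # Rows of the large table without a match are dropped (inner join).
        for match in hash_table.get(row[key], []):
            output.append({**match, **row})
    return output


# Hypothetical data: small table a, and large table b divided into two splits.
small_a = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
big_b_splits = [
    [{"id": 1, "amount": 10}, {"id": 3, "amount": 30}],
    [{"id": 2, "amount": 20}, {"id": 1, "amount": 40}],
]

hash_table = build_hash_table(small_a, "id")
# No reduce step: each map task yields its own result "file".
result_files = [map_task(split, hash_table, "id") for split in big_b_splits]
```

With two splits there are two map tasks, so result_files holds two separate outputs, which mirrors the note that the number of result files equals the number of map tasks.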