2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains what Hive's built-in functions are used for. Many people are not very familiar with them, so it is shared here for reference.
Hive's functions fall into three kinds: 1. UDF, a user-defined function that processes data; 2. UDTF, which covers the case where one input row must produce multiple output rows (one-to-many mapping); 3. UDAF, a user-defined aggregate function that operates on multiple rows and produces a single row.
Hive built-in functions:
Definition:
UDF (User-Defined Function): a user-defined function that processes data, taking one row in and returning one value.
UDTF (User-Defined Table-Generating Function): used when one input row must produce multiple output rows (one-to-many mapping).
UDAF (User-Defined Aggregate Function): a user-defined aggregate function that operates on multiple rows and produces a single row.
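The three shapes can be illustrated with a small plain-Java analogy. This is only a sketch of the input/output relationship; the class FunctionShapes and its methods are hypothetical and use no Hive API. They are loosely similar in spirit to the Hive built-ins upper, explode and sum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy for the three function shapes; none of this is Hive API.
final class FunctionShapes {

    // UDF shape: one input row produces one output value.
    static String upper(String value) {
        return value == null ? null : value.toUpperCase();
    }

    // UDTF shape: one input row produces several output rows.
    static List<String> splitToRows(String csv) {
        List<String> rows = new ArrayList<>();
        if (csv == null || csv.isEmpty()) {
            return rows; // no input, no output rows
        }
        rows.addAll(Arrays.asList(csv.split(",")));
        return rows;
    }

    // UDAF shape: many input rows collapse into one output value.
    static long sum(List<Long> values) {
        long total = 0L;
        for (Long v : values) {
            if (v != null) {
                total += v; // nulls are skipped, as SQL aggregates conventionally do
            }
        }
        return total;
    }
}
```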
Usage:
1. A UDF can be used directly in a select statement to format the query results before they are output.
2. When writing a UDF, pay attention to the following points:
A) A custom UDF must extend org.apache.hadoop.hive.ql.exec.UDF.
B) It must implement the evaluate function.
C) The evaluate function can be overloaded.
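The points above can be sketched roughly as follows. The class name TrimLowerUdf and its behavior are made up for illustration. The extends clause is commented out so that the sketch compiles without the hive-exec jar on the classpath; in a real project the class would be public and would extend the Hive UDF base class.

```java
import java.util.Locale;

// Sketch only: a real Hive UDF would be declared public, live in its own file,
// and extend org.apache.hadoop.hive.ql.exec.UDF from the hive-exec jar.
// The extends clause is omitted here so the example compiles without Hive.
final class TrimLowerUdf /* extends UDF */ {

    // Hive locates evaluate() by reflection and picks the overload whose
    // parameter types match the arguments in the query.
    public String evaluate(String input) {
        if (input == null) {
            return null; // propagate SQL NULL
        }
        return input.trim().toLowerCase(Locale.ROOT);
    }

    // Overload: same function, plus an optional maximum output length.
    public String evaluate(String input, Integer maxLength) {
        String cleaned = evaluate(input);
        if (cleaned == null || maxLength == null || maxLength < 0
                || cleaned.length() <= maxLength) {
            return cleaned;
        }
        return cleaned.substring(0, maxLength);
    }
}
```

Once packaged into a jar, a class like this is typically made available with ADD JAR, registered with CREATE TEMPORARY FUNCTION ... AS '<fully qualified class name>', and then called in a select statement like any other function.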
Local mode of Hive:
Most Hadoop jobs need the full scalability that Hadoop provides in order to process large data sets. Sometimes, however, the input to a Hive query is very small. In that case, the time spent launching tasks for the query can be much longer than the time spent actually executing the job. In most such cases, Hive can run all of the tasks on a single machine in local mode, which significantly reduces execution time for small data sets.
In other words, operations on small amounts of data can run locally, which is much faster than submitting tasks to the cluster.
Configure the following parameters to enable the local mode of Hive:
hive> set hive.exec.mode.local.auto=true;  (the default is false)
Local mode is used only when a job meets all of the following conditions:
1. The job's input data size must be less than the parameter hive.exec.mode.local.auto.inputbytes.max (default 128 MB).
2. The job's number of map tasks must be less than the parameter hive.exec.mode.local.auto.tasks.max (default 4).
3. The job's number of reduce tasks must be 0 or 1.
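The three conditions can be summarized as a single check. The following is a minimal sketch of that logic, not Hive's actual implementation; the class LocalModeCheck, its method canRunLocally, and the constant names are hypothetical, and the defaults are the ones quoted in the list above.

```java
// Illustrative only: mirrors the three conditions listed above.
// This is not Hive source code; all names here are hypothetical.
final class LocalModeCheck {

    // Default for hive.exec.mode.local.auto.inputbytes.max (128 MB).
    static final long DEFAULT_MAX_INPUT_BYTES = 128L * 1024 * 1024;

    // Default for hive.exec.mode.local.auto.tasks.max.
    static final int DEFAULT_MAX_TASKS = 4;

    // Returns true only if all three conditions hold.
    static boolean canRunLocally(long inputBytes, int mapTasks, int reduceTasks,
                                 long maxInputBytes, int maxTasks) {
        boolean smallInput = inputBytes < maxInputBytes;             // condition 1
        boolean fewMaps = mapTasks < maxTasks;                       // condition 2
        boolean fewReduces = reduceTasks == 0 || reduceTasks == 1;   // condition 3
        return smallInput && fewMaps && fewReduces;
    }

    // Convenience overload that applies the default thresholds.
    static boolean canRunLocally(long inputBytes, int mapTasks, int reduceTasks) {
        return canRunLocally(inputBytes, mapTasks, reduceTasks,
                DEFAULT_MAX_INPUT_BYTES, DEFAULT_MAX_TASKS);
    }
}
```

For instance, under these defaults a job with 10 MB of input, 2 map tasks and 1 reduce task qualifies, while the same job with 4 map tasks does not, because the count must be strictly less than the limit.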
That is all for this article on what Hive's built-in functions are used for. Thank you for reading! Hopefully it has given you a working understanding of the topic.