2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Official documentation on Hive UDFs: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
There are three related kinds of user-defined function:
UDF: one-to-one row mapping, such as upper or substr [one row in, one row out]
UDAF: aggregation, a many-to-one row mapping, such as sum or min [many rows in, one row out]
UDTF: table-generating, a one-to-many row mapping, such as lateral view explode() [one row in, many rows out]
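The three row mappings can be illustrated with plain Java methods. This is only an analogy, not Hive code: the class name RowMappingSketch and its methods are hypothetical and do not use any Hive API. They just mirror the shape of upper, sum and explode.

```java
import java.util.Arrays;
import java.util.List;

public class RowMappingSketch {

    // UDF-like: one value in, one value out (compare upper()).
    public static String upper(String value) {
        return value == null ? null : value.toUpperCase();
    }

    // UDAF-like: many values in, one value out (compare sum()).
    public static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // UDTF-like: one value in, many rows out (compare explode(split(col, ','))).
    public static List<String> explode(String csv) {
        return Arrays.asList(csv.split(","));
    }

    public static void main(String[] args) {
        System.out.println(upper("abc"));                    // ABC
        System.out.println(sum(Arrays.asList(1L, 2L, 3L)));  // 6
        System.out.println(explode("a,b,c"));                // [a, b, c]
    }
}
```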
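Writing and testing a UDF:
Add the Hive dependency to pom.xml:
    <properties>
        <hive.version>1.1.0-cdh6.7.0</hive.version>
    </properties>

    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>${hive.version}</version>
    </dependency>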
HelloUDF.java:
    package com.ruozedata.hadoop.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;

    public class HelloUDF extends UDF {

        public String evaluate(String input) {
            // TODO: put the real business logic here
            return "Hello:" + input;
        }

        // test code
        public static void main(String[] args) {
            HelloUDF udf = new HelloUDF();
            String output = udf.evaluate("test data");
            System.out.println(output);
        }
    }
Note: every UDF of this kind is implemented the same way. Step one is to extend the UDF class; step two is to define an evaluate method.
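The method has to be called evaluate because the UDF base class does not declare it; Hive looks it up by name through reflection and picks the overload that matches the argument types. The sketch below imitates that lookup with plain Java reflection. It is a simplified illustration, not Hive's actual resolver: EvaluateLookupSketch, HelloLogic and call are hypothetical names, and HelloLogic deliberately does not extend UDF so that the example runs without hive-exec on the classpath.

```java
import java.lang.reflect.Method;

public class EvaluateLookupSketch {

    // Stand-in for HelloUDF (hypothetical; does not extend Hive's UDF class).
    public static class HelloLogic {
        public String evaluate(String input) {
            return "Hello:" + input;
        }

        // A second overload, selected when two string arguments are passed.
        public String evaluate(String input, String suffix) {
            return "Hello:" + input + suffix;
        }
    }

    // Find the public method named "evaluate" whose parameter types match
    // the runtime types of the arguments, then invoke it.
    public static Object call(Object udf, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no matching evaluate method", e);
        }
    }

    public static void main(String[] args) {
        HelloLogic udf = new HelloLogic();
        System.out.println(call(udf, "abc"));       // Hello:abc
        System.out.println(call(udf, "abc", "!"));  // Hello:abc!
    }
}
```

A practical consequence is that one UDF class can offer several evaluate overloads, for example one per argument type or argument count.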
After packaging the project with Maven in IDEA, upload the jar to the Hive server. The jar is named g6-hadoop-udf.jar.
Hive offers several ways to create the function:
Method 1: create a temporary function (Temporary Functions)
Official reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateFunction
Drawback: a temporary function is only valid for the current session (window)
Example: run the following in the Hive shell
    ADD JAR /home/hadoop/lib/g6-hadoop-udf.jar;
    CREATE TEMPORARY FUNCTION sayHello AS 'com.ruozedata.hadoop.udf.HelloUDF';
    SHOW FUNCTIONS;                      -- sayhello now appears in the function list
    SELECT sayhello('abc') FROM dual;    -- output: Hello:abc
Note: another drawback of this approach is that the jar has to be added manually in every session before Hive can resolve the class name
Method 2: avoid adding the jar manually
Create an auxlib directory under Hive's home directory and put the jar in it.
Once the jar is in auxlib, you no longer need to run ADD JAR, whether you create a temporary function or a permanent one
Method 3: create a permanent function (Permanent Functions) from a jar on HDFS; this is the recommended approach
Starting with Hive 0.13, functions can be registered in the metastore. They are stored in the FUNCS table, which is empty by default
Put the jar in the /lib directory on HDFS
Example: run the following command in the Hive shell
    CREATE FUNCTION sayhello2 AS 'com.ruozedata.hadoop.udf.HelloUDF' USING JAR 'hdfs://ruozeclusterg6/lib/g6-hadoop-udf.jar';
Note: sayhello2 can now be used from any session (it does not show up in show functions, but it is visible in the FUNCS table of the metastore)
Querying the FUNCS table of the hive database in MySQL confirms that sayhello2 has been registered