Detailed explanation of Hive miscellaneous functions, UDTF, UDF, and UDAF

2025-04-11 Update From: SLTechnology News&Howtos

Miscellaneous functions can call Java methods directly through java_method(class, method[, arg1[, arg2...]]) or its synonym reflect.
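To make the idea concrete, here is a minimal plain-Java sketch of what such a call amounts to: look up a static method by class name and method name, then invoke it with the given arguments. In a query this would look something like java_method('java.lang.Math', 'max', 2, 3). The names ReflectSketch and call are hypothetical, and the type resolution shown is a simplification, not Hive's actual implementation.

```java
import java.lang.reflect.Method;

// Plain-Java sketch of a reflect/java_method-style call.
// ReflectSketch and call(...) are hypothetical names, not Hive classes.
final class ReflectSketch {

    // Map boxed argument classes to primitive types so that
    // signatures such as Math.max(int, int) can be resolved.
    private static Class<?> unbox(Class<?> c) {
        if (c == Integer.class) return int.class;
        if (c == Long.class) return long.class;
        if (c == Double.class) return double.class;
        if (c == Boolean.class) return boolean.class;
        return c;
    }

    // Resolve className.methodName by argument types and invoke it.
    static Object call(String className, String methodName, Object... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = unbox(args[i].getClass());
            }
            Method m = Class.forName(className).getMethod(methodName, types);
            // A null receiver means this only works for static methods.
            return m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "cannot call " + className + "." + methodName, e);
        }
    }
}
```

For example, ReflectSketch.call("java.lang.Math", "max", 2, 3) resolves Math.max(int, int) and returns the larger argument.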

Hive version 1.2.1

UDTF (user-defined table-generating function): turns one input row into multiple output rows; it is normally used together with lateral view.

For Hive's lateral view, see:

http://blog.sina.com.cn/s/blog_7e04e0d00101csic.html
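The "one row becomes many rows" behaviour, and the way a lateral view joins the generated rows back to their source row, can be modelled in plain Java. This is only a sketch of the semantics: LateralViewSketch, splitToRows and lateralView are hypothetical names, the comma-delimited input is an assumption, and no Hive classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a table function combined with a lateral view.
// All names here are hypothetical; this is not the Hive UDTF API.
final class LateralViewSketch {

    // Stand-in for a table function: one delimited string becomes many values.
    static List<String> splitToRows(String packed) {
        List<String> out = new ArrayList<>();
        if (packed == null || packed.isEmpty()) {
            return out; // no output rows for empty input
        }
        for (String part : packed.split(",")) {
            out.add(part.trim());
        }
        return out;
    }

    // Stand-in for the lateral view: each generated value is paired
    // with the id of the source row it came from.
    static List<String[]> lateralView(List<String[]> rows) {
        List<String[]> joined = new ArrayList<>();
        for (String[] row : rows) { // row = {id, packed}
            for (String value : splitToRows(row[1])) {
                joined.add(new String[] {row[0], value});
            }
        }
        return joined;
    }
}
```

In this model a source row that generates no values disappears from the result, which matches how a lateral view behaves in Hive unless the OUTER keyword is used.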

UDF: extend the UDF class and implement an evaluate method. The function is evaluated row by row on the map side.

// Package taken from the registered class name com.peixun.udf.udftest
package com.peixun.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class udftest extends UDF {

    // Returns true when t1, parsed as a number, is greater than t2.
    public boolean evaluate(Text t1, Text t2) {
        if (t1 == null || t2 == null) {
            return false;
        }
        double d1 = Double.parseDouble(t1.toString());
        double d2 = Double.parseDouble(t2.toString());
        return d1 > d2;
    }
}
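The comparison logic can be tried out without Hadoop on the classpath. The sketch below is a stand-alone stand-in that uses String in place of Text; BigThanSketch and bigThan are hypothetical names. Returning false for non-numeric input is an assumption added here, whereas the UDF above would throw a NumberFormatException in that case.

```java
// Stand-alone stand-in for the evaluate logic, using String instead of
// org.apache.hadoop.io.Text so it runs without any Hadoop dependency.
// BigThanSketch and bigThan are hypothetical names.
final class BigThanSketch {

    static boolean bigThan(String s1, String s2) {
        if (s1 == null || s2 == null) {
            return false;
        }
        try {
            return Double.parseDouble(s1.trim()) > Double.parseDouble(s2.trim());
        } catch (NumberFormatException e) {
            // Assumption: treat non-numeric input like null
            // instead of failing the whole query.
            return false;
        }
    }
}
```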

Package the function into function.jar.

Hive Command Line

add jar /home/jar/function.jar;    -- the jar is added to the distributed cache
create temporary function bigthan as 'com.peixun.udf.udftest';    -- creates the temporary function bigthan

UDAF (user-defined aggregate function)

The custom UDAF below counts the records whose b field is greater than 30; it is called as countbigthan(b, 30). Implementation code:

import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.parse.SemanticException;
import org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
import org.apache.hadoop.io.LongWritable;

// Resolver: extends the type-checking class
public class udaftest extends AbstractGenericUDAFResolver {

    // Checks the number of arguments
    @Override
    public GenericUDAFEvaluator getEvaluator(TypeInfo[] parameters) throws SemanticException {
        if (parameters.length != 2) {
            throw new UDFArgumentTypeException(parameters.length - 1,
                    "Exactly two arguments are expected");
        }
        return new GenericUDAFCountBigThanEvaluator(); // returns the processing-logic class
    }

    // Processing-logic class
    public static class GenericUDAFCountBigThanEvaluator extends GenericUDAFEvaluator {

        private LongWritable result;
        private PrimitiveObjectInspector inputOI1;
        private PrimitiveObjectInspector inputOI2;

        // init is executed in both the map and the reduce phase.
        // In the map phase, parameters.length equals the number of UDAF arguments.
        // In the reduce phase, parameters.length is 1.
        @Override
        public ObjectInspector init(Mode m, ObjectInspector[] parameters) throws HiveException {
            super.init(m, parameters); // lets the base class record the mode
            result = new LongWritable(0);
            inputOI1 = (PrimitiveObjectInspector) parameters[0];
            if (parameters.length > 1) {
                inputOI2 = (PrimitiveObjectInspector) parameters[1];
            }
            return PrimitiveObjectInspectorFactory.writableLongObjectInspector; // final result type
        }

        @Override
        public AggregationBuffer getNewAggregationBuffer() throws HiveException {
            CountAgg agg = new CountAgg(); // stores the partial aggregate value
            reset(agg);
            return agg;
        }

        // Initializes the buffer object
        @Override
        public void reset(AggregationBuffer agg) throws HiveException {
            CountAgg countagg = (CountAgg) agg;
            countagg.count = 0;
        }

        // The actual logic; iterate runs only on the map side
        @Override
        public void iterate(AggregationBuffer agg, Object[] parameters) throws HiveException {
            if (parameters == null || parameters[0] == null || parameters[1] == null) {
                return;
            }
            assert (parameters.length == 2);
            double base = PrimitiveObjectInspectorUtils.getDouble(parameters[0], inputOI1);
            double tmp = PrimitiveObjectInspectorUtils.getDouble(parameters[1], inputOI2);
            if (base > tmp) {
                ((CountAgg) agg).count++;
            }
        }

        // Returns the partial result of the map phase
        @Override
        public Object terminatePartial(AggregationBuffer agg) throws HiveException {
            result.set(((CountAgg) agg).count);
            return result;
        }

        // Merges partial results; runs on the map side (including the Combiner) and on the
        // reduce side. partial is a result produced by terminatePartial.
        @Override
        public void merge(AggregationBuffer agg, Object partial) throws HiveException {
            if (partial != null) {
                long p = PrimitiveObjectInspectorUtils.getLong(partial, inputOI1);
                ((CountAgg) agg).count += p;
            }
        }

        // Returns the final result
        @Override
        public Object terminate(AggregationBuffer agg) throws HiveException {
            result.set(((CountAgg) agg).count);
            return result;
        }

        // Aggregation buffer; declared static so it does not hold a reference to the evaluator
        public static class CountAgg implements AggregationBuffer {
            long count;
        }
    }
}
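The order in which the evaluator's methods are called is easier to follow in a stripped-down model. The sketch below replays the sequence in plain Java: each map task calls iterate for its rows and then terminatePartial, and the reducer calls merge for every partial count and then terminate. CountBigThanSketch, mapSide and reduceSide are hypothetical names, and the model leaves out ObjectInspectors, modes and the Combiner entirely.

```java
import java.util.List;

// Plain-Java walk-through of the UDAF call sequence; no Hive classes involved.
// The method names mirror the evaluator, but every name here is hypothetical.
final class CountBigThanSketch {

    // Stand-in for the aggregation buffer.
    static final class CountAgg {
        long count;
    }

    // Map side: called once per input row.
    static void iterate(CountAgg agg, Double value, Double threshold) {
        if (value == null || threshold == null) {
            return;
        }
        if (value > threshold) {
            agg.count++;
        }
    }

    // Map side: emits the partial count for one task.
    static long terminatePartial(CountAgg agg) {
        return agg.count;
    }

    // Reduce side: folds one partial count into the buffer.
    static void merge(CountAgg agg, Long partial) {
        if (partial != null) {
            agg.count += partial;
        }
    }

    // Reduce side: produces the final value.
    static long terminate(CountAgg agg) {
        return agg.count;
    }

    // One "map task": iterate over its rows, then emit a partial count.
    static long mapSide(List<Double> rows, double threshold) {
        CountAgg agg = new CountAgg();
        for (Double v : rows) {
            iterate(agg, v, threshold);
        }
        return terminatePartial(agg);
    }

    // The "reducer": merge all partial counts, then return the total.
    static long reduceSide(List<Long> partials) {
        CountAgg agg = new CountAgg();
        for (Long p : partials) {
            merge(agg, p);
        }
        return terminate(agg);
    }
}
```

Calling mapSide once per input split and passing the resulting partial counts to reduceSide gives the same total as counting over all rows at once, which is the property the merge step relies on.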

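Three ways to register permanent functions in Hive

One of them relies on the $HOME/.hiverc file, which is executed by default every time the Hive shell starts, so the add jar and create temporary function statements can be placed there.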
