Commonly used tips in Hive

Updated: 2025-04-20. Source: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31

This article walks through several commonly used Hive tips. They are practical in day-to-day work, so they are shared here for reference.

1. parse_url returns NULL if the requested part or key cannot be found.

parse_url extracts a component from a URL. Extracting HOST and QUERY are the most common uses.

string parse_url(string urlString, string partToExtract [, string keyToExtract])

Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. E.g. parse_url('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com'. The value of a particular key in QUERY can also be extracted by providing the key as the third argument, e.g. parse_url('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') returns 'v1'.
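The NULL behaviour described above can be illustrated outside Hive. The following is only a rough Python model of parse_url for the HOST and QUERY parts, built on the standard urllib.parse module; it is not Hive's implementation, and edge cases (empty values, malformed URLs) may behave differently in Hive. None stands in for NULL.

```python
from urllib.parse import urlparse, parse_qs

def parse_url(url, part, key=None):
    """Rough model of Hive's parse_url for HOST and QUERY only.

    Returns None where Hive would return NULL.
    """
    parsed = urlparse(url)
    if part == "HOST":
        return parsed.hostname
    if part == "QUERY":
        if not parsed.query:
            return None            # the URL has no query string at all
        if key is None:
            return parsed.query    # whole query string
        values = parse_qs(parsed.query).get(key)
        return values[0] if values else None  # missing key -> None
    raise NotImplementedError("only HOST and QUERY are modelled in this sketch")

url = "http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1"
print(parse_url(url, "HOST"))          # facebook.com
print(parse_url(url, "QUERY", "k1"))   # v1
# No query string, so the 'page' key cannot be found:
print(parse_url("http://www.meilishuo.com/guang/hot", "QUERY", "page"))  # None
```

The last call mirrors the meilishuo.com URL used in this article: because the URL carries no page parameter, the result is NULL, which is why filters on parse_url usually need an explicit IS NULL or IS NOT NULL check.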

select parse_url('http://www.meilishuo.com/guang/hot', 'QUERY', 'page') from class_method_map where parse_url('http://www.meilishuo.com/guang/hot', 'QUERY', 'page')

> select sessidmodex(sessid, 10), count(*), count(distinct sessid), count(distinct visitip) from visitlogs where ((dt='2012-11-10' and vhour >= 13) or (dt='2012-11-11' and vhour AND ((class_name='goods' AND method_name='goods_poster' and uri like '%page=0%') OR (class_name='goods' AND method_name='hot' and parse_url(concat('http://www.meilishuo.com', uri), 'QUERY', 'page') is NULL)) AND not is_spam(dt, sessid, 'SESSID') group by sessidmodex(sessid, 10)

FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.pcr.PcrExprProcFactory.opAnd(PcrExprProcFactory.java:128)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PcrExprProcFactory$GenericFuncExprProcessor.process(PcrExprProcFactory.java:267)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PcrExprProcFactory.walkExprTree(PcrExprProcFactory.java:450)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PcrOpProcFactory$FilterPCR.process(PcrOpProcFactory.java:149)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
    at org.apache.hadoop.hive.ql.optimizer.pcr.PartitionConditionRemover.transform(PartitionConditionRemover.java:78)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7306)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

3. Common methods of Array and Map:

A[n] (A is an Array and n is an int): Returns the nth element in the array A. The first element has index 0, e.g. if A is an array comprising ['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar'.

M[key] (M is a Map and key has type K): Returns the value corresponding to the key in the map, e.g. if M is a map comprising {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar'.

array map_keys(Map): Returns an unordered array containing the keys of the input map.

array map_values(Map): Returns an unordered array containing the values of the input map.

boolean array_contains(Array, value): Returns TRUE if the array contains value.

Simple demo: query the records of all users who arrived through the qzone channel

select * from user_session_stat where dt='2012-10-15' and array_contains(map_keys(market_from), 'tx_qzone') limit 10

visitips format: {"172.0.0.1": 100, "172.0.0.1": 20, "172.0.0.2": 5} # IP: traffic, sorted by traffic

map_keys(visitips)[0]: get the most visited IP. This relies on the map being stored in traffic order; map_keys itself is documented as returning an unordered array.
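To make the semantics of these functions concrete, here is a small Python sketch that mimics map_keys and array_contains on in-memory data. The rows, the channel names other than tx_qzone, and the visitips values are hypothetical stand-ins for the user_session_stat table; this illustrates the logic only, not how Hive executes the query.

```python
# Hypothetical rows standing in for user_session_stat; market_from is a map column.
rows = [
    {"sessid": "s1", "market_from": {"tx_qzone": 3, "weibo": 1}},
    {"sessid": "s2", "market_from": {"weibo": 2}},
    {"sessid": "s3", "market_from": {"tx_qzone": 1}},
]

def map_keys(m):
    """Model of map_keys: the keys of the map as an array (order not guaranteed in Hive)."""
    return list(m.keys())

def array_contains(arr, value):
    """Model of array_contains: True if value is an element of arr."""
    return value in arr

# Equivalent of: where array_contains(map_keys(market_from), 'tx_qzone')
qzone_sessions = [
    r["sessid"] for r in rows
    if array_contains(map_keys(r["market_from"]), "tx_qzone")
]
print(qzone_sessions)  # ['s1', 's3']

# Hypothetical visitips map (IP -> traffic). Selecting by value does not
# depend on the order in which the keys happen to be returned.
visitips = {"172.0.0.1": 100, "172.0.0.3": 20, "172.0.0.2": 5}
top_ip = max(visitips, key=visitips.get)
print(top_ip)  # 172.0.0.1
```

Picking the maximum by value, as in the last step, is a more defensive alternative to taking the first key when the ordering of the map cannot be guaranteed.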

4. Use explode to expand arrays and maps into rows

select explode(map_keys(market_from)) as cc from user_session_stat where dt='2012-11-28' and size(map_keys(market_from)) > 1 limit 2

select tmp.cc, count(*) from (select explode(map_keys(market_from)) as cc from user_session_stat where dt='2012-11-28' and size(map_keys(market_from)) > 1 limit 10) tmp group by cc

Details: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
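The explode-then-group pattern in the queries above can be sketched in Python as follows. The sample rows and channel names are hypothetical, and the inner limit from the Hive query is left out; the point is only to show how each map key becomes its own row before counting.

```python
from collections import Counter

# Hypothetical rows standing in for user_session_stat.
rows = [
    {"sessid": "s1", "market_from": {"tx_qzone": 3, "weibo": 1}},
    {"sessid": "s2", "market_from": {"weibo": 2}},
    {"sessid": "s3", "market_from": {"tx_qzone": 1, "baidu": 4}},
]

def explode_keys(rows):
    """Yield one output row per map key, like explode(map_keys(market_from))."""
    for row in rows:
        # Equivalent of: size(map_keys(market_from)) > 1
        if len(row["market_from"]) > 1:
            for key in row["market_from"]:
                yield key

# Equivalent of: select cc, count(*) ... group by cc
channel_counts = Counter(explode_keys(rows))
print(dict(channel_counts))  # {'tx_qzone': 2, 'weibo': 1, 'baidu': 1}
```

Session s2 has only one channel, so the size filter drops it before its key is exploded.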

First interval: 9.8 to 9.18

Second interval: 10.2 to 10.12

Third interval: 10.21 to 11.01

Fourth interval: 11.07 to 11.17

select tmp2.cc, count(*) from (

select explode(map_keys(uss.market_from)) as cc from

(select sessid from user_session_stat where dt='2012-09-18' and is_spam=0 and sub_channel='norefer') tmp join user_session_stat uss on tmp.sessid=uss.sessid where uss.dt >= '2012-09-08' and uss.dt

hive> SELECT get_json_object(src_json.json, '$.owner') FROM src_json;

amy

hive> SELECT get_json_object(src_json.json, '$.store.fruit\[0]') FROM src_json;

{"weight":8,"type":"apple"}

hive> SELECT get_json_object(src_json.json, '$.non_exist_key') FROM src_json;

NULL

This concludes the article on commonly used Hive tips. I hope the content above is of some help to you. If you found the article useful, please share it so that more people can see it.
