Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use the built-in function of hive

2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

Editor to share with you how to use hive built-in functions, I believe most people do not know much about it, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!

Cli command

Show functions

Desc function concat

Desc function extended concat; looks at an example of how a function is used

Nvl function

Coalesce (v1 and v2.) Returns the first non-null value in the parameter, and returns null if all values are null

Set.cli.print.header=true

Winfunc

Employee wage identification

Id money type

The priority of relational operators is not and or.

And or priority

Select id, money from winfunc where id='1001' or id='1002'

And money = '100'

Result

1001 100

1001 150

1001 200

1001 150

1002 100

The correct sql should be

Select id, money from winfunc where (id='1001' or id='1002') and money = '100'

Result

1001 100

1002 100

If (con,v1,v2)

Select if (2 > 1) from dual

V1

Case when

Select case when id='1001' then 'v1' when id='1002' then' v2' else 'v3' end from winfunc

Get_json_object

Select get_json_object ('{"name": "jack", "age": "20"}','$.name') from dual

Jack

Parse_url

Select parse_url ('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1',' HOST') from

Lxw_dual

Facebook.com

Select parse_url ('http://facebook.com/path2/p.php?k1=v1&k2=v2#Ref1',' QUERY', 'K1')

From lxw_dual

V1

Concat_ws has one more separator between concatenated strings than concat.

Concat_ws (string SEP,array) handles values in an array

Collect_set (id) to return to the array

Select collect_set (id) from winfunc

["1001", "1002", "1003", "1004"]

Collect_list (id) does not return to the array

Select collect_list (id) from winfunc

The partition by keyword is part of the analytical function in oracle. It differs from the aggregate function in that it can return multiple records in a group, while the aggregate function generally has only one record that reflects the statistical value.

Sum () over (PARTITION BY...) Is an analytical function. The effect of his execution is the same as that of an ordinary sum. Group by. Unlike, it calculates the cumulative sum of expressions in a group, rather than a simple sum.

Group By and Having, Where, Order by keywords are executed in the following order: Where, Group By, Having, Order by.

Of these four keywords, the column name of the final view can be used only in the Order By statement, such as:

SELECT FruitName, ProductPlace, Price, ID AS IDE, Discount

FROM T_TEST_FRUITINFO

WHERE (ProductPlace = Nissan')

ORDER BY IDE

IDE can only be used in ORDER BY statements, and ID can only be used in other conditional statements if column names need to be referenced, not IDE.

Columns in the ORDER BY clause must be included in the aggregate function or GROUP BY clause.

When GROUP BY and ORDER BY are used together, ORDER BY comes after GROUP BY.

1. Window function

First_value (find the first value of the group)

Select id,money

First_value (money) over (partition by id order by money

Rows between 1 preceding and 1 following)

From winfunc

The data window corresponding to each row is from the first row to the last row.

Rows between unbounded preceding and unbounded following

Lead (money,2) takes the record value whose trailing distance is 2. If not, take null.

Select id,money,lead (money,2) over (order by money) from winfunc

Lag (money,2) is the opposite of lead

Rank () sorting function and row_number ()

Select id,money, rank () over (partition by id order by money) from winfunc

Result

1001 1001

1001 150 2

1001 150 2

1001 200 4

Dense_rank ()

Select id,money, dense_rank () over (partition by id order by money) from winfunc

Result

1001 1001

1001 150 2

1001 150 2

1001 200 3

Cume_dist ()

Calculation formula: the number of rows whose CUME_DIST is less than or equal to the current value / the total number of rows in the group-for example, the proportion of the total number of people less than or equal to the current salary

Select id,money, cume_dist () over (partition by id order by money) from winfunc

Result

1001 100 0.25

1001 150 0.75

1001 150 0.75

1001 200 1

Percent_rank (), the first one always starts from scratch

PERCENT_RANK () = (RANK ()-1) / (Total Rows-1)

Calculation formula: (same value minimum line number-1) / (total number of lines-1)

Result

1001 100 0

1001 150 0.33

1001 150 0.33

1001 200 1

Ntile (2) fragmentation

When asc, nulls last is the default

When desc, nulls first is the default

Select id,money, ntile (2) over (order by money desc nulls last) from winfunc

Mixed functions (using methods in java)

Java_method and reflect are the same.

Select java_method ("java.lang.Math", "sqrt", cast (id as double)) from winfunc

UDTF table function explode () with lateral view keyword

Select id, adid from winfunc lateral view explode (split (type,'B')) tt as adid

1001 ABC

Column wrapping

1001 A

1001 C

Regular expression function

The like character "_" represents any single character, while the character "%" represents any number of characters

Rlike followed by regular expression

Select 1 from dual where 'footbar' rlike' ^ f.roomringing'

Regular expression replacement function

Regexp_replace (string A ~ (1) string B ~ (1) string C)

Replace the part of string A that conforms to java regular expression B with C

Select regexp_replace ('foobar','oo | ar','') from dual

Return to fb

Regexp_extract (string subject,string pattern,int index)

Select regexp_extract ('foothebar','foo (. *?) (bar)', 1) from dual

Returns the, () the regular expression represents the group, and 1 represents the index of the first group

1. Greedy match (. *), all the way to the last one |

Select regexp_extract ('979 | 7.10.80 | 8684 from dual. *\ | (. *)', 1)

Return 8684

two。 Non-greedy matching (. *?) Add a question mark to tell the regular engine to repeat the last character as little as possible

Select regexp_extract ('979 | 7.10.80 | 8684' (. *?)\\ | (. *)', 1) from dual

The above is all the contents of the article "how to use hive built-in functions". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 223

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report