Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use Select From in SQL-Hive

2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces how to use Select From in SQL-Hive, has a certain reference value, interested friends can refer to, I hope you can learn a lot after reading this article, the following let the editor take you to understand it.

01-query the contents of the table

Query a specified column or columns with the following command:

SELECT column 1, column 2,... FROM table name

When querying all fields in a table, you can use * to represent all fields. The asterisk (*) is a shortcut to select all columns. The command is as follows:

SELECT * FROM table name

If we want to query all the user_id and use_cnt in table t_od_use_cnt, the specific command is as follows:

SELECT user_id, use_cnt FROM app.t_od_use_cnt

Note: app is the name of the database. If the current query table is consistent with the current database, you can omit it.

02-Hive strict mode

Although it is grammatically correct (without partitioning) to write like this in Hive, it is likely to report an error in practice.

Because tables in Hive generally have a large amount of data, in order to prevent users from misoperating to scan the whole table, partitioning restrictions must be added when querying partitioned tables. For example, our partition field here is the date field date_8, and the table at work will require us to limit the partition data for which days to query. Here, we can enter the setting parameters for simulation. The command is as follows:

Hive (app) > set hive.mapred.mode; hive.mapred.mode=nonstrict hive (app) > set hive.mapred.mode=strict; hive (app) > set hive.mapred.mode; hive.mapred.mode=strict

Then we re-execute the above query statement with the following error:

Hive (app) > SELECT user_id >, use_cnt > FROM app.t_od_use_cnt; FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "t_od_use_cnt" Table "t_od_use_cnt" hive (app) >

This problem can be solved by using the where statement to limit the partition as follows:

SELECT user_id, use_cnt FROM app.t_od_use_cnt WHERE date_8 = '20210420 limit the number of rows returned by the query

The above query statement returns all the data queried, but sometimes we just need to confirm the contents of the data in the table, or to specify rows of data, for example, only 100 rows, we just need to add (limit number) after the query statement.

Query the first five rows of data in the table t_od_use_cnt with the following command:

SELECT user_id, use_cnt FROM app.t_od_use_cnt WHERE date_8 = 20190101 Limit 5

The running results are as follows:

Hive (app) > SELECT user_id >, use_cnt > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5; OK user_id use_cnt 10000 6 10001 49 10002 23 10003 1 10004 29 Time taken: 0.829 seconds, Fetched: 5 row (s) hive (app) > 04-column operation

Four operations such as addition, subtraction, multiplication, division and remainder can be performed on fields of numerical type when querying.

Let's multiply the use_cnt column and the is_active column in the table t_od_use_cnt to get a new column, and so on.

Hive (app) > SELECT user_id >, use_cnt >, is_active >, use_cnt * is_active > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5 OK user_id use_cnt is_active _ c3 10000 6 1 6 10001 49 1 49 10002 23 1 10003 10004 29 1 29 Time taken: 0.124 seconds, Fetched: 5 row (s) hive (app) > 05-column aliases

You can see that in the above example, we artificially create a new column by multiplying two columns, and the system names its column _ c3 by default. It is often necessary to give these newly generated lists an alias. The column name of an existing column can also be changed by aliasing if the meaning is not clear. However, aliases only take effect in this SQL statement and do not affect the field names in the original table.

By the way, the field naming rules are introduced here:

1. Cannot repeat an existing field

two。 Can only include lowercase letters (amurz), numbers (0-9), underscores (_)

3. Start with a letter

4. Underscore _ segmentation between words

Here we will alias as active_use_cnt and add AS active_use_cnt after the column. In addition, AS can be omitted, and aliases can be separated by spaces.

Hive (app) > SELECT user_id >, use_cnt >, is_active >, use_cnt * is_active AS active_use_cnt > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5 OK user_id use_cnt is_active active_use_cnt 10000 6 1 6 10001 49 1 49 10002 23 10003 10004 29 1 29 Time taken: 0.239 seconds, Fetched: 5 row (s) hive (app) > Thank you for reading this article carefully. I hope the article "how to use Select From in SQL-Hive" shared by the editor will be helpful to you. At the same time, I hope you will support and pay attention to the industry information channel. More related knowledge is waiting for you to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report