Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use Select Fromsql in SQL Hive

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

Most people don't understand the knowledge points of this article "How to use Select Fromsql in SQL Hive", so Xiaobian summarizes the following contents for everyone. The contents are detailed, the steps are clear, and they have certain reference value. I hope everyone can gain something after reading this article. Let's take a look at this article "How to use Select Fromsql in SQL Hive".

Select Fromsql inside the most basic query statement, select*from followed by the table name, so that the query is all the data in this table, followed by where conditions can also be added to select, can also add Orderby sorting, can also add groupby grouping.

01-Query the contents of a table

To query a specified column or columns, the command is as follows:

SELECT column name 1, column name 2,…… FROM table name;

When querying all fields in a table, you can use represent all fields. The asterisk () is a shortcut to select all columns. The order reads as follows:

SELECT * FROM table name;

If we want to query table t_od_use_cnt for all user_id and use_cnt, the specific command is as follows:

SELECT user_id ,use_cntFROM app.t_od_use_cnt;

Note: app is the database name. If the current query table is consistent with the current database, it can be omitted.

02-Hive Strict Mode

Although this is syntactically correct in Hive (without partitions), it is likely to be wrong in practice.

Because the table in Hive generally has a large amount of data, in order to prevent the user from misoperating the full table scan, you can set the partition limit to be added when querying the partitioned table. For example, here our partition field is date_8, the date field. The table at work will require us to limit the number of days we have to query partition data. Here we can enter the setting parameters for simulation, the command is as follows:

hive (app)> set hive.mapred.mode;hive.mapred.mode=nonstricthive (app)> set hive.mapred.mode=strict;hive (app)> set hive.mapred.mode;hive.mapred.mode=strict

Then we re-execute the query above and report the following error:

hive (app)> SELECT user_id > ,use_cnt > FROM app.t_od_use_cnt;FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "t_od_use_cnt" Table "t_od_use_cnt"hive (app)>

This problem can be solved by using the where statement to qualify partitions as follows:

SELECT user_id ,use_cntFROM app. t_od_use_cntWHERE date_8 ='20210420 ';03-limit Limit the number of rows returned by a query

The above query statement will return all the data queried, but sometimes we just want to confirm the data content in the table, or to specify row data, such as long as 100 rows, then only need to add (limit number) after the query statement.

Query the first 5 rows of data in table t_od_use_cnt, and the command is as follows:

SELECT user_id ,use_cntFROM app.t_od_use_cntWHERE date_8 = 20190101 Limit 5;

The results are as follows:

hive (app)> SELECT user_id > ,use_cnt > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5;OKuser_id use_cnt10000 610001 4910002 2310003 110004 29Time taken: 0.829 seconds, Fetched: 5 row(s)hive (app)>04-Column four operations

Four operations such as addition, subtraction, multiplication, division and remainder can be performed on fields of numerical type during query.

Let's multiply the use_cnt column and the is_active column in table t_od_use_cnt to get a new column, and so on.

hive (app)> SELECT user_id > ,use_cnt > ,is_active > ,use_cnt * is_active > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5;OKuser_id use_cnt is_active_c310000 6 1 610001 49 1 4910002 23 1 2310003 1 0 010004 29 1 29Time taken: 0.124 seconds, Fetched: 5 row(s)hive (app)>05-Column Name

As you can see in the example above, we artificially create a new column by multiplying two columns, and the system defaults to_c3 for its column name. It is often necessary to give these new ones an alias. The column name of an existing column can also be changed by starting an alias if its meaning is unclear. However, aliases only take effect in this SQL statement and do not affect the field names in the original table.

Here's an introduction to field naming rules:

1. Cannot duplicate existing fields

2. Can only include lowercase letters (a-z), numbers (0-9), underscores (_)

3. start with a letter

4. Underline_divide between words

Here we start the alias as active_use_cnt and add AS active_use_cnt to the end of the column. Another AS can be omitted, only space separation alias can also take effect

hive (app)> SELECT user_id > ,use_cnt > ,is_active > ,use_cnt * is_active AS active_use_cnt > FROM app.t_od_use_cnt > WHERE date_8 = 20210420 Limit 5;OKuser_id use_cnt is_active active_use_cnt10000 6 1 610001 49 1 4910002 23 1 2310003 1 0 010004 29 1 29Time taken: 0.239 seconds, Fetched: 5 row(s)hive (app)> The above is about the content of this article "How to use Select Fromsql in SQL Hive". I believe everyone has a certain understanding. I hope that the content shared by Xiaobian will be helpful to everyone. If you want to know more related knowledge, please pay attention to the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report