In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly shows you "how to get started with SQL", the content is easy to understand, clear, hope to help you solve doubts, the following let the editor lead you to study and learn "how to get started with SQL" this article.
Two sample tables
In order to facilitate the practice of SQL syntax, two hive tables, ide_test.flights and ide_test.airports, are specially prepared. The table structure is like the following table building statement. Each of the other two tables has several rows of test data that can be used directly when practicing SQL.
Flights table
CREATE TABLE `ide_ test.flights` (`year`string COMMENT 'year', `month`string COMMENT' month', `day`string COMMENT 'day', `dep_ time`departure', `crs_dep_ time`string COMMENT 'planned departure time', `arr_ time`string COMMENT 'arrival time', `crs_arr_ time`string COMMENT', `carrier_ code`string COMMENT 'airline code', `flight_ num`flight number' `crs_elapsed_ time` string COMMENT 'actual string COMMENT', `air_ time` string COMMENT 'flight time', `arr_ delay`string COMMENT 'arrival delay', `dep_ delay`string COMMENT 'take-off delay', `destination `string COMMENT 'destination airport', `distance`string COMMENT 'distance) COMMENT' test data-flight information'
Airports table
CREATE TABLE `ide_ test.airports` (`name`string COMMENT 'name', `roomy`airports` country', `area_ code`string COMMENT' area_code', `code` string COMMENT 'code') COMMENT' Test data-Airport Information 'SQL example simple select query
Use the select statement to view the data in the table
Select usage
Select name from ide_test.airports limit 5-query the name column in the ide_test.airports table and return 5 rows of select * from ide_test.airports limit 5-use'*'to represent all the columns in the table
Special note: it is a good habit to limit the number of rows returned by SQL. For example, limit 5 above indicates that a maximum of 5 rows of data are returned.
Materials: in-depth understanding of Hive Select syntax
Use where statement to filter by condition
Using the where statement, you can swipe data according to specified conditions, such as viewing flight number 335
Where usage
Select *-returns all columns from ide_test.flightswhere flight_num= '335' in the table-return record limit 5 for flight_num column '335'
Tips: where is followed by a logical expression, and multiple logical expressions are connected by AND and OR
Use the distinct statement to arrange weights
A column (or more columns) often has duplicate values, and sometimes you need to check them later, which can be done using the distinct statement. For example, to see that the ide_ test. Flights table contains the data for those years, you can use the following SQL statement to do this:
Distinct usage
Select distinct year-- distinct will de-duplicate the value of the year column before returning from ide_test.flightslimit 10
Tips**:** has a better way to see which values are contained in a column. Here is just a demonstration of the use of the distinct statement.
Use order by statements to sort query results
Find out the five flights with the latest departure time
Order by usage
Select origin, dest, flight_num, dep_timefrom ide_test.flightsorder by dep_time desc-sort in reverse order by departure time, desc means sort from big to small, asc (or omit not write) means sort from smallest to largest, limit 5 uses sum/count/avg to calculate sum / count / average, etc.
Use the count () function to count. For example, to see how many flights are in the ide_test.flights table, you can write:
Count usage
Select count (flight_num) as cnt-- count function represents count from ide_test.flightslimit 10
Sum () means to sum according to the specified column, for example, to calculate how many kilometers Flight 335 has flown.
Sum usage
Select sum (cast (distance as int)) as total_distance-- sum function representation summation from ide_test.flightswhere flight_num = '335'limit 10
Tips: count/sum this kind of function is called aggregate function, and there are other aggregate functions, such as avg/max/min, the user calculates the average, maximum and minimum values, respectively.
Data: in-depth understanding of Hive function
Calculate by group using group by
The group by statement is of great use. It can first group the data and then calculate it. Just look at an example. For example, to calculate the mileage of each flight and take the top 10 according to the total number from large to small, write as follows:
Group by usage
Select flight_num, sum (distance) as total_distance-- except for aggregate functions such as sum/count/avg/min/max, columns explicitly written after select must also appear after group by-- grouped by flight number, summing the mileage of each group by order by total_distance desclimit 10
In another case, the having statement comes in handy when you query the flight number for which the total mileage is greater than 3000.
Having syntax
Select flight_num-the column written after select must also appear after group by from ide_test.flightsgroup by flight_numhaving sum (distance) > 3000-select flight limit 5 with a total mileage greater than 3000
Think about it: why use having as a filter in this case instead of where?
Materials: in-depth understanding of Hive Group by syntax
Subquery
To calculate the number of flights with total mileage greater than 3000, SQL can write
Subquery usage
Select count (1) as cnt from (select flight_num from ide_test.flights group by flight_num having sum (distance) > 3000) t
Tips: the use of subqueries is the same as that of ordinary hive tables. For clarity, subqueries are included with parentheses'()'.
Materials: in-depth understanding of Hive subquery
Use join to query data from multiple tables
Suppose we plan to query the mileage of each flight and the name of the starting airport, because the two information is stored in two different hive tables, so we need to query two hive tables at the same time to get the results, as follows
Join usage
Select p.flight_num,-- take flight number p.distance from flights table,-- take out flight mileage q.name from flights table-- name of airport departing from airports table from ide_test.flights pleft join ide_test.airports Q on p.origin = q.codelimit 10
Tips**:**
The left table of left join**:** is associated with the right table. When there is a value in the coordinates and the right table does not exist, the corresponding information of the right table is replaced by null.
Inner join**:** appears in the result only when there are corresponding values in both the left and right tables.
Right join**:** is the opposite of left join
Full outer join**:** use null instead of when there is no corresponding value in the left or right table
Note:
If there are m records in An and n records in B, when A join B, there will be at most m\ n* records. Think about why?
Data
In-depth understanding of Hive Join syntax
Using union all to splice multiple pieces of data
The UNION statement character is used to merge the result set of two or more SELECT statements.
Select flights with mileage greater than or less than 1000, and SQL is written as follows:
Note: this is just to demonstrate the use of union, obviously there should be a better way to write it, think for yourself:)
Union usage
Select flight_numfrom ide_test.flightswhere distance > 1000 union select flight_numfrom ide_test.flightswhere distance
< 100 Tips: 请注意,UNION 内部的 SELECT 语句必须拥有相同数量的列。列也必须拥有相似的数据类型。同时,每条 SELECT 语句中的列的顺序必须相同 资料:深入理解Hive Union语法 使用case when控制列输出 查询航班的飞行距离,按照距离分成"长、中、短"三个级别,可以借助case when语句来实现,示例写法: case when 用法 select flight_num,distance,case when distance >3000 then 'long-haul flight' when distance > 1000 then 'medium-distance flight' else 'short-range flight' end as flight_levelfrom ide_test.flightslimit 10 or above is all the content of this article "how to get started with SQL quickly". Thank you for reading! I believe you all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.