In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article introduces the knowledge of "how to use the aggregate function in MySQL". Many people will encounter such a dilemma in the operation of actual cases, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!
MySQL aggregate function and paging query
We learned about the SQL one-line function earlier. In fact, there is another class of SQL functions, called aggregation (or aggregation, grouping) functions, which are functions that summarize a set of data, input a set of data and output a single value.
1. Introduction to aggregate function
What is an aggregate function
The aggregate function acts on a set of data and returns a value for a set of data.
Aggregate function type
AVG ()
SUM ()
MAX ()
MIN ()
COUNT ()
Aggregate function syntax
Aggregate functions cannot be nested calls
For example, a call like "AVG (SUM)" cannot occur.
1.1 AVG and SUM functions
You can use the AVG and SUM functions for numeric data.
SELECT AVG (salary), MAX (salary), MIN (salary), SUM (salary) FROM employeesWHERE job_id LIKE'% REP%';1.2 MIN and MAX functions
You can use the MIN and MAX functions for data of any data type.
SELECT MIN (hire_date), MAX (hire_date) FROM employees;1.3 COUNT function
COUNT (*) returns the total number of records in the table, which applies to any data type.
SELECT COUNT (*) FROM employeesWHERE department_id = 50
COUNT (expr) returns the total number of records whose expr is not empty.
SELECT COUNT (commission_pct) FROM employeesWHERE department_id = 50; / / Null value ignored
How many records are there in the calculation table
Mode 1:count (*)
Mode 2:count (1)
Method 3:count (a specific field), but because the null value is ignored, it is not necessarily true to the
Question: who should use count (*), count (1) or count (list)?
In fact, there is no difference between the tables of the MyISAM engine. There is a counter inside the engine that maintains the number of rows, but COUNT (*) is slightly more efficient
The table of the Innodb engine uses count (*), count (1) to read lines directly, and the complexity is O (n), because innodb really has to count it. But better than the specific count (column name).
Question: can I replace count (*) with count (column name)?
Do not use count (column name) instead of count (*). Count (*) is the standard syntax for counting rows defined by SQL92, independent of database, NULL and non-NULL.
Note: count (*) will count rows with a value of NULL, while count (column name) will not count rows with NULL values for this column.
Note:
All the above grouping functions ignore null values.
It can be matched with distinct to realize the de-duplication operation.
A separate introduction to the count function, generally using count (*) to count the number of rows
The fields queried with the grouping function are required to be fields after group by
2. GROUP BY
2.1 basic use
You can use the GROUP BY clause to divide the data in a table into groups, with the following syntax:
SELECT column, group_function (column) FROM table [WHERE condition] [GROUP BY group_by_expression] [ORDER BY column]
Be clear: WHERE must be placed after FROM
1. All columns in the SELECT list that are not included in the group function should be included in the GROUP BY clause
SELECT department_id, AVG (salary) FROM employeesGROUP BY department_id
2. Columns contained in the GROUP BY clause do not have to be included in the SELECT list
SELECT AVG (salary) FROM employeesGROUP BY department_id
2.2 grouping with multiple columns
SELECT department_id AS dept_id, job_id, SUM (salary) FROM employeesGROUP BY department_id, job_id
2.3 use of WITH ROLLUP in GROUP BY
After using the WITH ROLLUP keyword, a record is added after all the queried grouped records, which calculates the sum of all the queried records, that is, the number of records.
SELECT department_id,AVG (salary) FROM employeesWHERE department_id > 80GROUP BY department_id WITH ROLLUP
Note: when using ROLLUP, you cannot use the ORDER BY clause to sort the results at the same time, that is, ROLLUP and ORDER BY are mutually exclusive. Of course, this only exists at 5.7.
3. HAVING (filtering data)
3.1 basic use
Filter grouping: HAVING clause
Rows have been grouped.
Aggregate functions are used.
Groups that meet the conditions in the HAVING clause are displayed.
HAVING cannot be used alone and must be used with GROUP BY.
SELECT department_id, MAX (salary) FROM employeesGROUP BY department_idHAVING MAX (salary) > 10000
Illegal use of aggregate functions: aggregate functions cannot be used instead of filter conditions in the WHERE clause. As follows:
SELECT department_id, AVG (salary) FROM employeesWHERE AVG (salary) > 8000GROUP BY department_id
Exercise: query department information with an id of 10, 20, 30, 40, which has a maximum wage higher than 10000 in these four departments
# Mode 1: recommended, the execution efficiency is higher than that of 2.SELECT department_id,MAX (salary) FROM employeesWHERE department_id IN (10dag20) GROUP BY department_idHAVING MAX (salary) > 10000 Bittern # Mode 2:SELECT department_id,MAX (salary) FROM employeesGROUP BY department_idHAVING MAX (salary) > 10000 AND department_id IN (10min20) 30pc40)
Conclusion:
When there is an aggregate function in the filter condition, the filter condition must be declared in HAVING.
When there is no aggregate function in the filter condition, the filter condition declaration can be in either WHERE or HAVING. However, it is recommended that you declare it in WHERE
3.2 comparison between WHERE and HAVING
1. In terms of scope of application, the scope of application of HAVING is wider. two。 If there is no aggregate function in the filter condition: in this case, WHERE is more efficient than HAVING
Distinguishing 1:WHERE can directly use the fields in the table as filtering criteria, but cannot use the calculated functions in the grouping as filtering criteria; HAVING must be used in conjunction with GROUP BY, and you can use grouped calculated functions and grouped fields as filtering criteria.
This determines that when grouping data is needed, HAVING can accomplish tasks that WHERE cannot. This is because, in the query syntax structure, WHERE comes before GROUP BY, so the grouping results cannot be filtered. After GROUP BY, HAVING can filter the grouped result set using grouping fields and calculation functions in the grouping, a function that WHERE cannot do. In addition, records excluded by WHERE are no longer included in the grouping.
Difference 2: if you need to get the required data from the associated table through a join, WHERE first filters and then joins, while HAVING first joins and then filters. This determines that WHERE is more efficient than HAVING in associative queries. Because WHERE can filter first and join with a filtered smaller dataset and associated table, it takes up less resources and is more efficient. On the other hand, HAVING needs to prepare the result set first, that is, to associate it with an unfiltered dataset, and then filter this large dataset, which takes up more resources and is less efficient.
The summary is as follows:
Advantages and disadvantages: WHERE (filter before grouping) filter data first and then correlate it with high execution efficiency. HAVING (filter after grouping) can be filtered in the final result set using calculation functions in grouping, and the execution efficiency is low.
Choices in development:
WHERE and HAVING are not mutually exclusive, we can use both WHERE and HAVING in a query. The condition containing the grouping statistical function is HAVING, and the general condition is WHERE. In this way, we not only take advantage of the efficiency and speed of WHERE conditions, but also take advantage of the advantages that HAVING can use query conditions including grouped statistical functions. When the amount of data is very large, the operating efficiency will be very different. Generally speaking, if you can use pre-grouping screening, try to use pre-grouping screening to improve efficiency.
4. Review: paging query ★
Application scenario: when the data to be displayed is incomplete, the sql request needs to be submitted in pages.
Syntax:
Select query list from Table [join type join Table 2 on join conditions where filter criteria group by grouping Fields filtered order by sorted Fields] limit [offset,] size; offset to display the starting index of the entry (starting index starts at 0) the number of entries to be displayed by size
Features:
The limit statement is placed at the end of the query statement
Formula
Select query list from table limit (page-1) * size,size
Suppose size=10, that is, 10 records per page, and page starts at 1, that is, the first page
Page=1, the starting index of the entry is 0, and the page displays 0-10 entries
Page=2, the starting index of the entry is 10, and the page displays 11-20 entries
Page=3, the starting index of the entry is 20, and the page displays 21-30 entries
Case 1: query the first five items of employee information
SELECT * FROM employees LIMIT 0pc5ntSelect * FROM employees LIMIT 5
Case 2: inquiry article 11-article 25
SELECT * FROM employees LIMIT 10 and 15
Case 3: information about employees with bonuses, and the top 10 with higher salaries are displayed.
SELECT * FROM employees WHERE commission_pct IS NOT NULL ORDER BY salary DESCLIMIT 10
5. The implementation process of SELECT
5.1 complete structure of the SELECT statement
# Mode 1:sql92 syntax: SELECT,....,... FROM...,... WHERE multi-table join condition AND does not contain group function filter condition GROUP BY...,... HAVING contains group function filter condition ORDER BY. ASC/DESCLIMIT...,... # way 2:sql99 syntax SELECT...,... FROM... JOIN... The join condition of ON multi-table JOIN... ON... WHERE does not contain the filter condition of group function AND/OR does not contain the filter condition of group function GROUP BY,... HAVING contains the filter condition of group function ORDER BY. ASC/DESCLIMIT..., # where: # (1) from: which tables to filter # (2) on: remove Cartesian product # (3) where: criteria for filtering from tables # (4) group by: grouping by # (5) having: filter again in statistical results # (6) order by: sort # (7) limit: paging
5.2 SELECT execution order
You need to remember the two order of SELECT queries:
1. The order of keywords cannot be reversed:
SELECT... FROM... WHERE... GROUP BY... HAVING... ORDER BY... LIMIT...
The order in which the 2.SELECT statements are executed (in MySQL and Oracle, the SELECT execution order is basically the same):
FROM-> WHERE-> GROUP BY-> HAVING-> SELECT Fields-> DISTINCT-> ORDER BY-> LIMIT
For example, if you write a SQL statement, its keyword order and execution order are as follows:
SELECT DISTINCT player_id, player_name, count (*) as num # order 5FROM player JOIN team ON player.team_id = team.team_id # order 1WHERE height > 1.80 # order 2GROUP BY player.team_id # order 3HAVING num > 2 # order 4ORDER BY num DESC # order 6LIMIT 2 # order 7
When the SELECT statement performs these steps, each step produces a virtual table, which is then passed into the next step as input. It is important to note that these steps are implicit in the execution of the SQL and are invisible to us.
We can also see from the execution order here that because where is filtered first, group by statements are grouped in advance, so there is less data to participate in the grouping, so the execution efficiency is higher.
5.3 implementation principle of SQL
SELECT performs the FROM step first. At this stage, if you check multiple tables, you will also go through the following steps:
First, the Cartesian product is calculated by CROSS JOIN, which is equivalent to getting the virtual table vt (virtual table) 1-1.
Filter through ON, filter on the basis of virtual table vt1-1, get virtual table vt1-2.
Add an external line. If we use left join, right link or full join, we will involve external rows, that is, add external rows to the virtual table vt1-2 to get the virtual table vt1-3.
Of course, if we are working on more than two tables, we will repeat the above steps until all the tables have been processed. This process is our raw data.
When we get the original data of the query data table, that is, the final virtual table vt1, we can proceed to the WHERE phase on this basis. At this stage, the virtual table vt2 is obtained by filtering based on the results of the vt1 table.
Then move on to steps 3 and 4, namely the GROUP and HAVING phases. In this stage, the virtual tables vt3 and vt4 are actually obtained by grouping and packet filtering on the basis of the virtual table vt2.
When we have completed the conditional filtering section, we can filter the fields extracted from the table, that is, enter the SELECT and DISTINCT phases.
First, the desired fields are extracted in the SELECT phase, and then the duplicate rows are filtered out in the DISTINCT phase to get the intermediate virtual tables vt5-1 and vt5-2, respectively.
When we have extracted the desired field data, we can sort by the specified field, that is, the ORDER BY stage, to get the virtual table vt6.
Finally, on the basis of vt6, take out the record of the specified row, that is, the LIMIT stage, and get the final result, corresponding to the virtual table vt7.
Of course, when we write SELECT statements, there may not be all the keywords, and the corresponding stages will be omitted.
At the same time, because SQL is a structured query language similar to English, we should also pay attention to the corresponding keyword order when writing SELECT statements. * the so-called principle of underlying operation is the execution order we just mentioned. * * refer to the architecture of the following advanced articles for more detailed content.
6. After-class exercises
Comprehensive exercise 1
Can 1.where clauses be filtered using group functions? No
two。 Query the maximum, minimum, average, and sum of the company's employees' wages
SELECT MAX (salary), MIN (salary), AVG (salary), SUM (salary) FROM employees
3. Query the maximum, minimum, average, and sum of employees' wages in each job_id
SELECT job_id, MAX (salary), MIN (salary), AVG (salary), SUM (salary) FROM employeesGROUP BY job_id
4. Select the number of employees with each job_id
SELECT job_id, COUNT (*) FROM employeesGROUP BY job_id
5. Query the gap between the maximum wage and the minimum wage (DIFFERENCE)
SELECT MAX (salary), MIN (salary), MAX (salary)-MIN (salary) DIFFERENCEFROM employees
6. Inquire about the minimum wage of the employees under each manager, of which the minimum wage cannot be less than 6000. Employees without managers are not counted.
SELECT manager_id, MIN (salary) FROM employeesWHERE manager_id IS NOT NULLGROUP BY manager_idHAVING MIN (salary) > 6000
7. Check the name of all departments, location_id, number of employees and average salary, and descend by average wage
SELECT department_name, location_id, COUNT (employee_id), AVG (salary) avg_salFROM employees e RIGHT JOIN departments dON e.`department _ id` = d.`department _ id`group BY department_name, location_idORDER BY avg_sal DESC
8. Query the department name, job name and minimum wage of each job and each department
SELECT department_name,job_id,MIN (salary) FROM departments d LEFT JOIN employees eON e.`department _ id` = d.`department _ id`group BY department_name,job_id
Comprehensive exercise 2
1. Simple grouping
Case 1: query the average salary of employees for each type of job
SELECT AVG (salary), job_idFROM employeesGROUP BY job_id
Case 2: query the number of departments in each location
SELECT COUNT (*), location_idFROM departmentsGROUP BY location_id
two。 The filtering before grouping can be realized.
Case 1: query the maximum wage for each department whose mailbox contains the a character
SELECT MAX (salary), department_idFROM employeesWHERE email LIKE'% a%'GROUP BY department_id
Case 2: inquire about the average salary of each leader with bonus
SELECT AVG (salary), manager_idFROM employeesWHERE commission_pct IS NOT NULLGROUP BY manager_id
3. Filter after grouping
Case 1: query the number of employees in which department > 5
# ① queries the number of employees in each department SELECT COUNT (*), department_idFROM employeesGROUP BY department_id;# ② filters the ① results just now, SELECT COUNT (*), department_idFROM employeesGROUP BY department_idHAVING COUNT (*) > 5
Case 2: maximum wage for employees with bonus for each job > 12000 of the job number and maximum wage
SELECT job_id,MAX (salary) FROM employeesWHERE commission_pct IS NOT NULLGROUP BY job_idHAVING MAX (salary) > 12000
Case 3: each leader whose leadership number > 102has a minimum wage greater than 5000.
SELECT manager_id,MIN (salary) FROM employeesGROUP BY manager_idWhere manager_id > 102HAVING MIN (salary) > 5000
4. Add sort
Example: the maximum wage of employees with bonus for each type of work > 6000 and the maximum wage, in ascending order.
SELECT job_id,MAX (salary) mFROM employeesWHERE commission_pct IS NOT NULLGROUP BY job_idHAVING m > 6000ORDER BY m
5. Group by multiple field
Case study: query the minimum wage for each job and department, and descend according to the minimum wage.
SELECT MIN (salary), job_id,department_idFROM employeesGROUP BY department_id,job_idORDER BY MIN (salary) DESC; "how to use aggregate functions in MySQL" is introduced here. Thank you for reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.