In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
Today, I would like to share with you how to achieve multi-table query in the MySQL database query related knowledge points, detailed content, clear logic, I believe that most people still know too much about this knowledge, so share this article for your reference, I hope you can get something after reading this article, let's take a look at it.
1. Multi-table query
A multi-table query, also known as an associative query, means that two or more tables complete the query operation together.
Prerequisite: these tables queried together are related (one-to-one, one-to-many), and there must be a join field between them, which may or may not have a foreign key. For example, the employee table and the department table, which depend on the department number for association.
1. Lead out
If we want to check the names of employees and departments now,
These two fields are in different tables. If there are no associated conditions, what will the query result be like? let's take a look.
SELECT last_name, department_nameFROM employees, departments +-+-- + | last_name | department_name | +-+-+ | King | Administration | | King | Marketing | | King | Purchasing | | King | Human Resources | | King | Shipping | | King | IT | | King | Public Relations | | King | Sales | | King | Executive | | King | Finance | | King | Accounting | | King | Treasury |. | Gietz | IT Support | Gietz | NOC | | | Gietz | IT Helpdesk | | Gietz | Government Sales | | Gietz | Retail Sales | | Gietz | Recruiting | | Gietz | Payroll | +-+-- + 2889 rows in set (0.01sec) SELECT COUNT (employee_id) FROM employees # output 107 lines SELECT COUNT (department_id) FROM departments;# output 27 lines SELECT 107 lines 27 FROM dual;107*27=2889
Obviously, the above operation is wrong.
The above action will cause a record in the employee table to match every record in the department table, as if an employee had worked in all departments. From a practical point of view, it is obvious that this will not happen.
This phenomenon is the Cartesian product.
two。 Cartesian product
Cartesian product is a concept in relational algebra, which represents the result of any combination of each row of data in two tables. For example, if there are two tables, the left table has m data records and x fields, and the right table has n data records and y fields, then the cross-join will return m data records and x fields. The Cartesian product intention is shown in the figure.
In SQL92, Cartesian product is also called cross join, which is called CROSS JOIN in English. CROSS JOIN is also used in SQL99 to represent cross connections. Its function is to join any table, even if the two tables are not related. In MySQL, Cartesian product occurs in the following cases:
Query employee name and department name
SELECT last_name,department_name FROM employees,departments;SELECT last_name,department_name FROM employees CROSS JOIN departments;SELECT last_name,department_name FROM employees INNER JOIN departments;SELECT last_name,department_name FROM employees JOIN departments;3. The solution of Cartesian product
Errors in Cartesian product will occur under the following conditions:
Errors in Cartesian product will occur under the following conditions:
Omit join conditions (or association conditions) for multiple tables
Invalid connection condition (or association condition)
All rows in all tables are connected to each other
To avoid Cartesian product, you can add valid join conditions to WHERE.
SELECT table1.column, table2.columnFROM table1, table2WHERE table1.column1 = table2.column2; # connection condition # case: query employee's name and department name SELECT last_name, department_nameFROM employees, departmentsWHERE employees.department_id = departments.department_id
Note: if there are the same fields in different tables, we need to declare which table we are looking for. Field name this and Java, class name. Attributes are similar and easy to understand.
SELECT employees.last_name, departments.department_name,employees.department_idFROM employees, departmentsWHERE employees.department_id = departments.department_id; II, multi-table query classification 1. Equivalent connection and non-equivalent connection
The equivalent connection is actually easy to understand, which means who is equal to whom, using =.
Non-equivalent connection, such as querying the record of a certain field > a certain value, etc.
SELECT employees.employee_id, employees.last_name, employees.department_id, departments.department_id, departments.location_idFROM employees, departmentsWHERE employees.department_id = departments.department_id
Expand:
Use aliases to simplify queries. -some field names are too long to use table name prefixes before column names to improve query efficiency. SELECT e.employee_id, e.last_name, e.departmentaccounid.departmentprincipid, d.location_idFROM employees e, departments dWHERE e.department_id = d.department_id
It should be noted that if we use the alias of the table, we can only use the alias instead of the original table name in the query field and filter conditions, otherwise an error will be reported.
two。 Self-connecting and non-self-connecting
Self-connection, it literally means to connect with yourself.
For example, now there is a table, and we want to find employee information and corresponding superior information.
We know that there is only one table that cannot be linked, and if we want to associate them, there must be conditions for them to be related, so there should be two tables. At this time, we can extract a table, which is essentially the same as the original table, and then we alias the table. Table1 and table2 are essentially the same table. Just use aliases to virtual into two tables to represent different meanings. Then the two tables are queried by inner join, outer join and so on.
For example, now that we want to find the names of employees and corresponding bosses, we can use self-connection
SELECT CONCAT (worker.last_name, 'works for', manager.last_name) FROM employees worker, employees managerWHERE worker.manager_id = manager.employee_id
Exercise: find out the manager information of an employee whose last_name is' Chen'.
Subquery returns more than 1 rowselect * from emp where deptno = 20 and job = any (select job from emp where deptno = 10) [2] query the number, name, and salary of an employee whose salary is higher than all "SALESMAN". -- query the employee's number, name and salary select empno,ename Sal from emp+-+ | empno | ename | sal | 7369 | SMITH | 800.00 | 7499 | ALLEN | 1600.00 | 7521 | WARD | 1250.00 | 7566 | JONES | 2975.00 | | 7654 | MARTIN | 1250.00 | 7698 | BLAKE | 2850.00 | 7782 | CLARK | 2450. 00 | | 7788 | SCOTT | 3000.00 | | 7839 | KING | 5000.00 | | 7844 | TURNER | 1500.00 | 7876 | ADAMS | 1100.00 | 7900 | JAMES | 950.00 | | 7902 | FORD | 3000.00 | 7934 | MILLER | 1300.00 | +-+ 14 rows in set (0.00 sec)-- salary for "SALESMAN": select sal from emp where job = 'SALESMAN' +-+ | sal | +-+ | 1600.00 | | 1250.00 | | 1250.00 | | 1500.00 | +-+ 4 rows in set (1250.00 sec)-query the number, name and salary of employees whose wages are higher than all "SALESMAN". -- multiple subqueries: select empno,ename,sal from emp where sal > all (select sal from emp where job = 'SALESMAN') +-+ | empno | ename | sal | 7566 | JONES | 2975.00 | 7698 | BLAKE | 2850.00 | 7782 | CLARK | 2450.00 | 7788 | SCOTT | 3000.00 | 7839 | KING | 5000.00 | 7902 | FORD | 3000.00 | + -+-+ 6 rows in set (0.00 sec) 2. Related subquery
[1] unrelated subqueries introduce:
Unrelated subqueries: subqueries can be run independently, running subqueries first and then external queries.
Related subqueries: subqueries cannot be run independently, and external queries are run first, and then subqueries are run
[2] advantages and disadvantages of unrelated subqueries:
Benefits: simple and powerful (some unrelated subqueries cannot be implemented or tedious subqueries can be implemented using related subqueries)
Disadvantages: a little difficult to understand
[3] sql display:
-- [1] query employees of the highest wage (irrelevant subquery) select * from emp where sal = (select max (sal) from emp)-- [2] query employees of the department's highest wage (related subquery)-- method 1: through unrelated subquery: select * from emp where deptno = 10 and sal = (select max (sal) from emp where deptno = 10) unionselect * from emp where deptno = 20 and sal = (select max (sal) from emp where deptno = 20) unionselect * from emp where deptno = 30 and sal = (select max (sal) from emp where deptno = 30)-- disadvantage: more statements Exactly how many parts are unknown-method 2: related subquery select * from emp e where sal = (select max (sal) from emp where deptno = e.deptno) order by deptno-- [3] query those employees whose salary is higher than the average salary of their position (related subquery)-- irrelevant subquery: select * from emp where job = 'CLERK' and sal > = (select avg (sal) from emp where job = 'CLERK') union.-- related subquery: select * from emp e where sal > = (select avg (sal) from emp e2 where e2.job = e.job) 4. Aggregate function 1. Introduction to aggregate function
The aggregate function acts on a set of data and returns a value for a set of data.
Aggregate function type
AVG ()
SUM ()
MAX ()
MIN ()
COUNT ()
Grammar
Note: aggregate functions are not allowed to be nested
1.1 AVG and SUM functions
You can use the AVG and SUM functions for numeric data.
When they calculate the null value, they will include the non-null value, and then automatically ignore the null value.
AVG=SUM/COUNT
Mysql > select * from emp +-+ | EMPNO | ENAME | JOB | MGR | HIREDATE | SAL | COMM | DEPTNO | +- -+-+ | 7369 | SMITH | CLERK | 7902 | 1980-12-17 | 800.00 | NULL | 20 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 1600.00 | 300.00 | 30 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 1250.00 | 500.00 | 30 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975.00 | NULL | 20 | 7654 | MARTIN | SALESMAN | 7698 | 1981-09-28 | 1250.00 | 1400.00 | 30 | 7698 | BLAKE | MANAGER | 7839 | 05-01 | 2850.00 | NULL | 30 | 7782 | CLARK | MANAGER | 2450.00 | NULL | | 10 | | 7788 | SCOTT | ANALYST | 7566 | 1987-04-19 | 3000.00 | NULL | 20 | 7839 | KING | PRESIDENT | NULL | 1981-11-17 | 5000.00 | NULL | 10 | 7844 | TURNER | 7698 | 1981-09-08 | 1500.00 | 1500.00 | 30 | 7876 | ADAMS | CLERK | 7788 | 1987-05-23 | 1100.00 | NULL | 20 | 7900 | JAMES | CLERK | 7698 | 1981-12-03 | 950.00 | NULL | 30 | 7902 | FORD | ANALYST | 7566 | 1981-12-03 | 3000.00 | NULL | 20 | 7934 | MILLER | CLERK | 7782 | 1982-01-23 | 1300.00 | NULL | 10 | +-+ -+ 14 rows in set (0.00 sec)
1.2 MIN and MAX functions
You can use the MIN and MAX functions for data of any data type.
1.3The COUNT function
COUNT (*) returns the total number of records in the table, which applies to any data type.
Mysql > select count (*) from emp;+-+ | count (*) | +-+ | 14 | +-+ 1 row in set (0.01sec)
Calculate the number that appears in the query result of the specified field
Mysql > select count (comm) from emp;+-+ | count (comm) | +-+ | 4 | +-+ 1 row in set (0.00 sec)
COUNT (expr) returns the total number of records whose expr is not empty.
Question: who should use count (*), count (1) or count (listed)?
In fact, there is no difference between the tables of the MyISAM engine. There is a counter inside the engine that maintains the number of rows.
The table of the Innodb engine uses count (*), count (1) to read lines directly, and the complexity is O (n), because innodb really has to count it. But better than the specific count (column name).
Question: can I replace count (*) with count (column name)?
Do not use count (column name) instead of count (*). Count (*) is the standard syntax for counting rows defined by SQL92, independent of database, NULL and non-NULL.
Note: count (*) will count rows with a value of NULL, while count (column name) will not count rows with NULL values for this column.
If you talk like this, you may still be confused. Next, let me demonstrate it.
2.group by
Group by can be used for grouping. We used to use avg to calculate the average salary of all employees, but if we want to ask for the average salary of employees in each department, we have to divide the departments into groups, divide them by departments, and then work out their average wages.
Note: a field cannot be used with a multiline function because the number of records does not match, which will cause the query data not to be fully displayed. However, it is OK if this field belongs to a grouping.
Mysql > select deptno,avg (sal) from emp group by deptno +-+-+ | deptno | avg (sal) | +-+-+ | 20 | 2175.000000 | 30 | 1566.666667 | 10 | 2916.666667 | +-+-+ 3 rows in set (0.00 sec) Statistics the average salary of each position mysql > select job Avg (sal) from emp group by job +-+-+ | job | avg (sal) | +-+-+ | CLERK | 1037.500000 | SALESMAN | 1400.000000 | MANAGER | 2758.333333 | ANALYST | 3000.000000 | | PRESIDENT | 5000.000000 | +-- -+ 5 rows in set (0.00 sec) 3. Filtering after grouping using having
Conditions for using having:
Line 1 has been grouped.
two。 Aggregate functions are used.
3. Groups that meet the conditions in the HAVING clause are displayed.
4. HAVING cannot be used alone, it must be used with GROUP BY.
Statistics of the average wages of various departments, only show that the average salary of more than 2000-group after the second screening having
Mysql > select deptno,avg (sal) from emp-> group by deptno-> having avg (sal) > 2000 +-+-+ | deptno | avg (sal) | +-+-+ | 20 | 2175.000000 | | 10 | 2916.666667 | +-+-+ 2 rows in set (0.01sec) 5. Comparison between where and having
Distinguishing 1:WHERE can directly use the fields in the table as filtering criteria, but cannot use the calculated functions in the grouping as filtering criteria; HAVING must be used in conjunction with GROUP BY, and you can use grouped calculated functions and grouped fields as filtering criteria.
This determines that when grouping data is needed, HAVING can accomplish tasks that WHERE cannot. This is because, in the query syntax structure, WHERE comes before GROUP BY, so the grouping results cannot be filtered. After GROUP BY, HAVING can filter the grouped result set using grouping fields and calculation functions in the grouping, a function that WHERE cannot do. In addition, records excluded by WHERE are no longer included in the grouping.
Difference 2: if you need to get the required data from the associated table through a join, WHERE first filters and then joins, while HAVING first joins and then filters. This determines that WHERE is more efficient than HAVING in associative queries. Because WHERE can filter first and join with a filtered smaller dataset and associated table, it takes up less resources and is more efficient. On the other hand, HAVING needs to prepare the result set first, that is, to associate it with an unfiltered dataset, and then filter this large dataset, which takes up more resources and is less efficient.
The summary is as follows:
Choices in development:
WHERE and HAVING are not mutually exclusive, we can use both WHERE and HAVING in a query. The condition containing the grouping statistical function is HAVING, and the general condition is WHERE. In this way, we not only take advantage of the efficiency and speed of WHERE conditions, but also take advantage of the advantages that HAVING can use query conditions including grouped statistical functions. When the amount of data is very large, the operating efficiency will be very different.
6. The implementation process of select 1. Keyword order
SELECT... FROM... WHERE... GROUP BY... HAVING... ORDER BY... LIMIT...
The order in which 2.SELECT statements are executed
FROM-> WHERE-> GROUP BY-> HAVING-> SELECT Fields-> DISTINCT-> ORDER BY-> LIMIT
For example, if you write a SQL statement, its keyword order and execution order are as follows:
SELECT DISTINCT player_id, player_name, count (*) as num order 5FROM player JOIN team ON player.team_id = team.team_id order 1WHERE height > 1.80Sequential 2GROUP BY player.team_id order 3HAVING num > 2 Sequential 4ORDER BY num DESC Sequential 6LIMIT 2 Sequential 73.SQL implementation principle (understand first)
SELECT performs the FROM step first. At this stage, if you check multiple tables, you will also go through the following steps:
First, the Cartesian product is calculated by CROSS JOIN, which is equivalent to getting the virtual table vt (virtual table) 1-1.
Filter through ON, filter on the basis of virtual table vt1-1, get virtual table vt1-2.
Add an external line. If we use left join, right link or full join, we will involve external rows, that is, add external rows to the virtual table vt1-2 to get the virtual table vt1-3.
Of course, if we are working on more than two tables, we will repeat the above steps until all the tables have been processed. This process is our raw data.
When we get the original data of the query data table, that is, the final virtual table vt1, we can proceed to the WHERE phase on this basis. At this stage, the virtual table vt2 is obtained by filtering based on the results of the vt1 table.
Then move on to steps 3 and 4, namely the GROUP and HAVING phases. In this stage, the virtual tables vt3 and vt4 are actually obtained by grouping and packet filtering on the basis of the virtual table vt2.
When we have completed the conditional filtering section, we can filter the fields extracted from the table, that is, enter the SELECT and DISTINCT phases.
First, the desired fields are extracted in the SELECT phase, and then the duplicate rows are filtered out in the DISTINCT phase to get the intermediate virtual tables vt5-1 and vt5-2, respectively.
When we have extracted the desired field data, we can sort by the specified field, that is, the ORDER BY stage, to get the virtual table vt6.
Finally, on the basis of vt6, take out the record of the specified row, that is, the LIMIT stage, and get the final result, corresponding to the virtual table vt7.
Of course, when we write SELECT statements, there may not be all the keywords, and the corresponding stages will be omitted.
At the same time, because SQL is a structured query language similar to English, we should also pay attention to the corresponding keyword order when writing SELECT statements. The so-called underlying operation principle is the execution order we just talked about.
These are all the contents of the article "how to implement multi-table query in MySQL database query". Thank you for reading! I believe you will gain a lot after reading this article. The editor will update different knowledge for you every day. If you want to learn more knowledge, please pay attention to the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.