How to improve efficiency and accuracy through well-formed SQL 04/19 Update SLTechnology News&Howtos

How to improve efficiency and accuracy through well-formed SQL

2025-04-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly shows you "how to improve efficiency and accuracy through well-formed SQL". The content is simple and easy to understand, and the organization is clear. I hope it can help you solve your doubts. Let Xiaobian lead you to study and learn "how to improve efficiency and accuracy through well-formed SQL".

background

Well-formed SQL doesn't work any better than messy SQL. The database doesn't really care whether you put commas before or after field names in SQL statements. For your own clarity and to be an effective SQL writer, I recommend following these formatting rules. In this article I'll share how to improve productivity with well-formed SQL statements. I define efficiency as being able to output accurate results from SQL and having code that is clear and easy to understand, modify, and debug. I list only SELECT statements because 99% of the SQL statements I write are query statements. Formatting SQL code is a very personal matter, and I know that it varies from person to person, and developers think their formatting rules are the most reasonable.

sample questions

The following is a typical SQL application scenario. The data of the business report comes from three tables: customer table, sales table and geographical table. Based on January 2015 data, this report needs to show the total number of customers and sales in each administrative region. This requirement can be realized by a simple SQL statement, which needs to query three tables in association.

Possible problems with data

While SQL is simple, making sure your results are correct is still really key, because there are a few reasons why errors can occur:

The data may come from different sources. This means that you cannot guarantee the integrity of these tables. For example, you cannot assume that all zip codes in the customer table are valid zip codes and must exist in the region table.

Applications that enter customer table data may capture unvalidated location data, which may include incorrect zip codes.

The zip code list may not be complete. Newly published zip codes may not be imported into the table in time for publication.

*** Principles

For me, getting the right results from SQL is more important than writing SQL that is legible. The *** thing I'm going to do is write the following SQL statement to get the total number of customers. I'll adjust it after I write the whole sentence.

The *** I wrote this sentence:

SELECTCOUNT(DISTINCT cust_id) as count_customersFROMcustomers Result: count_customers "10"

This query is important because it revolves around the *** principle. Since there are no SQL management queries, there are no dependencies, and I know this is the correct result for customer numbers. I write this down because I always need this number to measure the SQL that follows, and I will mention it many times later in this article.

The next step is to add the necessary fields and tables to complete the query. I highlight the word "add" because according to my rules, I comment out queries that get the same result when applying the *** principle. Here is my final formatted query.

Format SQL

Here is the recommended formatted SQL based on my formatting ideas.

SELECT 0 ,c.cust_post_code ,p.location ,COUNT(DISTINCT c.cust_id) number_customers ,SUM(s.total_amount) as total_sales FROM customers c JOIN post_codes p ON c.cust_post_code = p.post_code JOIN sales s ON c.cust_id = s.cust_id WHERE 1=1 AND s.sales_date BETWEEN ‘2015-01-01’ AND ‘2015-01-31’ -AND s.order_id = 5 GROUP BY c.cust_post_code ,p.location

Always use table aliases

Time will prove necessary. If you don't use aliases for every field used in your SQL statement, you may add other fields with the same name to the query statement at some point in the future. At that point your query and even your report will generate errors (duplicate field names).

Comma before field

When debugging or testing my query statements, this makes it easy to comment out a field without modifying other lines, and all commas are missing or redundant. Otherwise, you might have to adjust commas to make sure the statement is correct. If you debug statements frequently, this can be extremely convenient and efficient. The same applies to the SELECT section and the GROUP BY clause section.

I used SELECT 0 as the start of the statement when I was developing, and it was easy to remove when migrating to a formal environment. This way we can write everything in front of all the subsequent fields. Without this "0," if I wanted to comment out *** fields (in this case,"c. dust_post_code"), I would have to deal with the comma problem that follows. I have to comment it out temporarily and add it back later. The same is true in the "GROUP BY" statement. This "0" is extra.

Put "JOIN" on a separate line

Putting the "JOIN" statement on a separate line has the following benefits:

Doing so makes it easy to see all the tables involved in this query statement, just scroll through the "JOIN" statement.

JOIN allows you to put all logical relationships in one place, rather than listing all table and expression relationships in a WHERE clause. We can't always put "JOIN" statements on one line, but they should at least be put together.

It is also relatively easy to comment out the "JOIN" statement when doing so. This is useful when debugging, and you may want to know if "JOIN" is causing the data problem.

Column mode editing

Column mode editing is convenient when dealing with a large number of fields. Below are *** animated GIF displays I've done, where you can comment out all non-aggregated fields. I used column mode editing instead of just commenting out fields:

Create all indexes

When using a UNION statement with more fields:

Comment out the list of fields in the GROUP BY clause

Test query results

I have to list all customers using the outer link "OUTER" because not all customers have zip codes in the geography table. I can make sure that the results of my query are the same as those of the initial query (the statement that queried the customer separately) by including and excluding different fields and tables repeatedly, which is actually obeying the *** principle.

SELECT0,c.cust_post_code-,p.location,COUNT(DISTINCT c.cust_id) number_customers,SUM(s.total_amount) as total_salesFROMcustomers c-LEFT OUTER JOIN post_codes p ON c.cust_post_code = p.post_codeJOIN sales s ON c.cust_id = s.cust_idWHERE1=1AND s.sales_date BETWEEN ‘2015-01-01’ AND ‘2015-01-31’-AND c.cust_post_code = 2000-AND p.post_code = 200GROUP BYc.cust_post_code-,p.location

SQL like this means to me that I have to write separate tests to check the data. By commenting out those few lines I can verify the accuracy of my query data using the *** principle. Doing so improves my efficiency and accuracy of reporting.

That's all for "How to improve efficiency and accuracy with well-formed SQL". Thank you for reading! I believe that everyone has a certain understanding, hope to share the content to help everyone, if you still want to learn more knowledge, welcome to pay attention to the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.