What to pay attention to before writing your next SQL query

2025-02-23 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31

This article covers what to pay attention to before writing your next SQL query. The approach described here is simple, fast, and practical.

When I was at Airbnb, I had the opportunity to work on a new team that reported to Brian Chesky. It was exciting: we were building a new product line, so we had to make game-changing decisions every day. But as a data scientist on the team, I was always responsible for sourcing the data to guide our product direction, which meant a lot of analytical work.

The first week was a tough test of my ability to switch contexts: not only did I have to find obscure tables and write a lot of queries, I even had to regex my way through BeautifulSoup scrapes and Veartrics API requests. By the third week I was getting tired, so I needed a system to keep up my pace. That made me realize that when using data, there are only two ways to screw it up:

Using the wrong data.

Using the data incorrectly.

Both can be solved by having a better context around the data.

So I made my own checklist to guard against these two mistakes and make sure I didn't lead the product into oblivion. I will share mine here, but yours may differ depending on the specifics of your company. Use this as an example of how to give yourself enough context to be dangerous with a table, and I encourage you to take it and make it your own.

So what context do I need and how do I get it?

Well, you need any and all information that reduces the chance of using the data incorrectly or using the wrong data. In my experience, it takes only three checks to get reasonable coverage:

Check the basic table metadata, e.g. column names, partition information, and how the table is generated.

Check your assumptions. What is in this column? Is this column nullable? What are the distinct values? Has anything changed since the last time I ran this query?

Get in touch with others. What are other people doing with this table? Who do you ask if you have questions?

1. Check basic table metadata

The first step is to find a table and figure out how to query it.

You must woo your table before it will reveal its secrets (Reposted with permission from Olya Tanner)

For the most basic information, such as column names, index information, partition information, and view definitions, you can usually query system tables. Keep a list of these tables on hand so that you can easily query them. For example, for ANSI SQL-compliant databases (most of them), the following tables are usually helpful:

information_schema.columns: column names, partition information, column types, nullability.

information_schema.tables and information_schema.views: list all tables and views. For views, you can usually get the DDL statement.
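
As a minimal sketch of this kind of metadata lookup, the Python snippet below uses the standard-library sqlite3 module. SQLite does not expose information_schema, so the sketch swaps in its closest equivalents, PRAGMA table_info and the sqlite_master table. The `events` table and `recent_events` view are hypothetical names made up for illustration; on a warehouse you would query the information_schema tables listed above instead.

```python
import sqlite3

# In-memory database with a hypothetical table and view, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        event_id INTEGER NOT NULL,
        user_id  INTEGER,
        platform TEXT
    );
    CREATE VIEW recent_events AS
        SELECT event_id, user_id FROM events WHERE event_id > 100;
""")


def describe_columns(conn, table):
    """Return (name, declared type, nullable) for each column of `table`.

    Stand-in for querying information_schema.columns, which SQLite lacks.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
    return [(r[1], r[2], not r[3]) for r in rows]


def list_tables_and_views(conn):
    """Return (name, type, DDL) for every table and view.

    Stand-in for information_schema.tables / information_schema.views.
    """
    return conn.execute(
        "SELECT name, type, sql FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()


columns = describe_columns(conn, "events")
objects = list_tables_and_views(conn)

for name, col_type, nullable in columns:
    print(name, col_type, "nullable" if nullable else "not null")
for name, obj_type, ddl in objects:
    print(obj_type, name)
```

Keeping small helpers like these in one place is one way to keep the list of system tables "on hand", as suggested above.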

You can usually also get the query history written by others, which can help you figure out how the table is used. You can even filter by statement type (for example, create, insert, select) to determine how the table was created:

information_schema.jobs_by_project (BigQuery), table(information_schema.query_history()) (Snowflake)

2. Check your assumptions

Make a note of your assumptions and run the query to check them.

A nice illustration of a person making a checklist, in case you haven't seen one before.

At this point you want to see whether the data is what you think it is. My typical approach is to run a few select * and select distinct statements, but this is second best. A better way is to first work out:

What questions do I need to answer, and what assumptions am I making?

Write these down, and then write the queries that answer these questions or verify these assumptions. It may sound simple, but if you make the wrong assumptions, you have to start over. We all make assumptions when using data, and if you aren't aware of them, that is a recipe for disaster.

Some examples from recent projects:

Is there only one row for each event?

What are the possible values of this field?

Is this column nullable?

If so, is there any systematic pattern to the null values?
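
The questions above translate fairly directly into short queries. The following is a minimal sketch using Python's standard-library sqlite3 module; the `events` table, its columns, and the sample rows are hypothetical and exist only to make the checks runnable. The same SQL should carry over to most warehouses with little change.

```python
import sqlite3

# Hypothetical sample data, including a deliberate duplicate and some nulls.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER, platform TEXT, country TEXT);
    INSERT INTO events VALUES
        (1, 'ios',     'US'),
        (2, 'android', NULL),
        (2, 'android', NULL),
        (3, 'web',     'FR'),
        (4, 'android', NULL);
""")

checks = {
    # Is there only one row per event? Any row returned is a violation.
    "duplicate_events": conn.execute(
        "SELECT event_id, COUNT(*) FROM events "
        "GROUP BY event_id HAVING COUNT(*) > 1"
    ).fetchall(),
    # What are the possible values of this field?
    "platform_values": [
        row[0] for row in conn.execute(
            "SELECT DISTINCT platform FROM events ORDER BY platform"
        )
    ],
    # Is this column ever null? COUNT(column) skips nulls, COUNT(*) does not.
    "country_nulls": conn.execute(
        "SELECT COUNT(*) - COUNT(country) FROM events"
    ).fetchone()[0],
    # Is there a systematic pattern to the nulls? Break them down by platform.
    "country_nulls_by_platform": conn.execute(
        "SELECT platform, SUM(country IS NULL), COUNT(*) FROM events "
        "GROUP BY platform ORDER BY platform"
    ).fetchall(),
}

for name, result in checks.items():
    print(name, result)
```

In this made-up data, every null `country` comes from the `android` platform, which is the kind of systematic pattern worth raising with the table's owner.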

I personally use whale (a CLI tool, for when I get impatient) or Dataframe (or even plan) to run these quick checks, but no matter what you use, just make sure the checks are persisted.
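
One way to keep such checks persistent, sketched below under stated assumptions, is to store each assumption next to the query that tests it and write the results to a file. This is not how the tools mentioned above work; it is a plain standard-library illustration, and the `events` table, the assumption list, and the `events_checks.json` file name are all hypothetical.

```python
import json
import sqlite3
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical assumptions, each paired with a query and the value expected
# if the assumption holds.
ASSUMPTIONS = [
    {
        "assumption": "event_id is unique",
        "query": "SELECT COUNT(*) - COUNT(DISTINCT event_id) FROM events",
        "expected": 0,
    },
    {
        "assumption": "platform is never null",
        "query": "SELECT COUNT(*) FROM events WHERE platform IS NULL",
        "expected": 0,
    },
]


def run_checks(conn, assumptions):
    """Run each query and record whether its assumption held."""
    results = []
    for item in assumptions:
        actual = conn.execute(item["query"]).fetchone()[0]
        results.append({
            **item,
            "actual": actual,
            "holds": actual == item["expected"],
            "checked_on": date.today().isoformat(),
        })
    return results


# Hypothetical sample data with one duplicated event_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER, platform TEXT);
    INSERT INTO events VALUES (1, 'ios'), (2, 'android'), (2, 'web');
""")

results = run_checks(conn, ASSUMPTIONS)

# Persist the results so the checks can be reviewed or rerun later.
log_path = Path(tempfile.mkdtemp()) / "events_checks.json"
log_path.write_text(json.dumps(results, indent=2))

for r in results:
    print("OK  " if r["holds"] else "FAIL", r["assumption"])
```

Checking the file into version control alongside the analysis would also answer the earlier question of whether anything has changed since the last time the query ran.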

Finally, yes, it's fine to keep running select *. Sometimes you just need to look at a slice of the data.

3. Get in touch with others

Now that you have a sense of the shape of the data, it is tempting to dive straight in and build what you need to build. Don't. You need to gather as much social context and tribal knowledge as possible, especially in large organizations.

Now is the time to gather tribal knowledge.

I know these people don't have faces, but after getting some social context, doesn't the guy on the right look happier?

Unfortunately, there is only so much you can get from dredging through data alone. You need to talk to real people (or find some up-to-date documentation).

Whether through the query logs (see above), through the GitHub history (if your queries are version-controlled), or by checking who owns the table (you can usually do this in a data catalog / discovery tool, such as Dataframe), just find someone to message on Slack.

Generally speaking, I ask the following questions:

Is this table maintained?

Is this the best data for {{your scenario}}?

Be open about what you have done so far. You may be on the wrong track, but people appreciate having something to discuss.
