What are the misunderstandings in database design?

Updated 2025-03-29. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article covers common misunderstandings in database design. Many people run into them in day-to-day work, so the editor has consulted various materials and summarized them in a simple, practical form. Hopefully it helps answer the question "what are the misunderstandings in database design?" Let's take a look.

Misunderstanding 1: spare fields

Phenomenon description:

The data table contains not only the fields currently needed but also several extra fields set aside as spares.

For example, I design a personnel table (Person) with all the necessary fields, including name (Name), gender (Sex), date of birth (birthday), and so on. Once it is finished, it occurs to me that the system will probably need a lot of other person-related information in the future, such as the school a person graduated from or their employer. None of it needs to be filled in now, but it might be used later. So, on impulse, I add five varchar2 fields called Text1, Text2, ..., Text5. Then I decide there should be some spare date fields too, so I add three date fields named date1, date2, date3, and so on.
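The table described above might look like the following minimal sketch. SQLite is used only because it ships with Python; the Id column and all column types are assumptions, not part of the original design.

```python
import sqlite3

# Hypothetical Person table with "just in case" spare fields.
# Column types are illustrative (SQLite has no varchar2 type).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        Id       INTEGER PRIMARY KEY,  -- assumed key column
        Name     TEXT NOT NULL,
        Sex      TEXT,
        birthday TEXT,
        -- spare fields reserved for unknown future use
        Text1 TEXT, Text2 TEXT, Text3 TEXT, Text4 TEXT, Text5 TEXT,
        date1 TEXT, date2 TEXT, date3 TEXT
    )
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Person)")]
spare = [name for name in columns if name.startswith(("Text", "date"))]
print(f"{len(spare)} of {len(columns)} columns are spares: {spare}")
```

Eight of the twelve columns carry no meaning in their names, which is the root of the problems discussed next.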

Cause analysis:

As you have probably noticed, this table contains a large number of fields that are useless for now. We can call them spare fields. What are they for? They are there just in case, to guard against situations that might arise.

This looks like planning ahead: when a field is needed later, no new column has to be added to the table. The argument also goes that this keeps a table's data in adjacent physical storage, which is good for performance.

Another reason is that in older databases, changing a table definition (adding fields, changing field types, deleting fields, and so on) would lose all the data, which made such changes very troublesome. You first had to create a temporary table and back up the data into it, then create a new table, import the data, and finally delete the original table.
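The copy-and-recreate routine described above can be sketched as follows. This is an illustration with SQLite, not the procedure of any particular older database; the table Person_new, the School column, and the final rename step are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Person (Name) VALUES ('Alice'), ('Bob')")
conn.commit()


def rebuild_with_school_column(conn):
    """Change a table definition by copying the data into a new table."""
    with conn:  # commit on success, roll back on error
        # 1. Create the new table with the changed definition.
        conn.execute(
            "CREATE TABLE Person_new "
            "(Id INTEGER PRIMARY KEY, Name TEXT, School TEXT)"
        )
        # 2. Import the existing data into it.
        conn.execute(
            "INSERT INTO Person_new (Id, Name) SELECT Id, Name FROM Person"
        )
        # 3. Delete the original table and give the new one its name.
        conn.execute("DROP TABLE Person")
        conn.execute("ALTER TABLE Person_new RENAME TO Person")


rebuild_with_school_column(conn)
new_columns = [row[1] for row in conn.execute("PRAGMA table_info(Person)")]
row_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(new_columns, row_count)
```

Four statements for one new column shows why designers wanted to avoid schema changes, although most current databases can add a column in place.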

Problems:

This approach causes many problems for a project, and the problems it was meant to solve may not actually be solved. If you don't believe it, read on.

Problem 1: Adding many spare fields inevitably wastes space. Even when they hold no data, empty fields can still occupy some storage.

Problem 2: Because the names are generic, unless there is a sound documentation process, before long (perhaps two or three years) no one will be able to say exactly what each field means. Even with documentation, managing these fields is troublesome: every use has to be requested and registered, and conflicts can still occur.

Problem 3: Will the spare fields really be enough? Not necessarily. Only a few spare fields are reserved for each type, so if more are needed, or an unusual type is required, new fields still have to be added. For example, to store photos in the Person table above, we may need a photo field of type blob, and the initial design may well not have reserved a spare field of that type. And without good management, who can tell which spare fields are already in use and which are still free? In the end, new fields get added anyway.

Solution:

In fact, the design above is a kind of "over-design". What we should do is "design on demand": after detailed and effective analysis, put only the necessary fields in the data table and do not reserve a large number of spare fields.

When related information needs to be added later, analyze the situation:

If there are only a few items and they are closely related to the original table, add the fields directly to the original table and update the relevant data.

If there are many items, or they are not key attributes of the entity in the original table, add a new table and link it to the original by key.
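The two cases above can be sketched with SQLite as follows. The School column, the PersonEmployment table, and the sample values are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("INSERT INTO Person (Id, Name) VALUES (1, 'Alice')")

# Case 1: a small, closely related item -> add a named column when needed.
conn.execute("ALTER TABLE Person ADD COLUMN School TEXT")
conn.execute("UPDATE Person SET School = 'Example University' WHERE Id = 1")

# Case 2: more data, or data that is not a core attribute of a person
# -> put it in a new table linked back to Person by key.
conn.execute("""
    CREATE TABLE PersonEmployment (
        PersonId  INTEGER NOT NULL REFERENCES Person(Id),
        Employer  TEXT NOT NULL,
        StartDate TEXT
    )
""")
conn.execute(
    "INSERT INTO PersonEmployment (PersonId, Employer, StartDate) "
    "VALUES (1, 'Example Corp', '2020-01-01')"
)

# The key joins the two tables back together when the data is needed.
row = conn.execute("""
    SELECT p.Name, p.School, e.Employer
    FROM Person AS p
    JOIN PersonEmployment AS e ON e.PersonId = p.Id
""").fetchone()
print(row)
```

Either way, every column has a name that says what it holds, so no separate registry of spare fields is needed.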

As for performance problems caused by where the table data is physically stored, these can be addressed by reorganizing the database's data at scheduled times. Long-running databases need this kind of maintenance regularly anyway.
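As a rough illustration of such a reorganization, the sketch below uses SQLite's VACUUM command, which rebuilds the database file and repacks its pages. VACUUM is only a stand-in here; other database systems have their own reorganization tools, and the table and row counts are made up for the example.

```python
import os
import sqlite3
import tempfile

# Use a file-backed database so the effect on file size can be observed.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Person (Name) VALUES (?)",
    [("x" * 200,) for _ in range(5000)],
)
# Deleting every other row leaves the pages half empty.
conn.execute("DELETE FROM Person WHERE Id % 2 = 0")
conn.commit()  # VACUUM cannot run inside an open transaction
size_before = os.path.getsize(path)

# Rebuild the file so the remaining rows are stored compactly.
conn.execute("VACUUM")
size_after = os.path.getsize(path)
remaining = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
conn.close()

print(f"{remaining} rows, {size_before} bytes before, {size_after} bytes after")
```

Running this during a quiet period is the point of "scheduled times": the rebuild touches the whole file and competes with normal traffic.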

Misunderstanding 2: meaningful codes

Phenomenon description:

A meaningful code is used as a record's ID, or even as the primary key of the table. For example, an employee's code is set to 0203004, where 02 represents the employee's branch, 03 the department, and 004 the employee's sequence number within the department.

Cause analysis:

An ID is usually assigned in one of the following ways. The first is a pure serial number, starting at 1 and incremented by 1 each time, sometimes converted to a fixed-width string such as "0000001". The second is a meaningless random code, such as a GUID. The third is a meaningful code, in which specific digit positions carry specific meanings.

I think people like this approach because they want to read information directly from the code; some programs even have dedicated modules for parsing it. It is just like an ID card number, from which you can tell a person's region, birthday, gender, and other information.
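A parsing module of the kind mentioned above could be as small as this sketch. The 2 + 2 + 3 digit layout comes from the 0203004 example; the function and type names are hypothetical.

```python
from typing import NamedTuple


class EmployeeCode(NamedTuple):
    branch: str      # first two digits
    department: str  # next two digits
    serial: str      # last three digits: sequence within the department


def parse_employee_code(code: str) -> EmployeeCode:
    """Split a seven-digit employee code into its meaningful parts."""
    if len(code) != 7 or not code.isdigit():
        raise ValueError(f"expected seven digits, got {code!r}")
    return EmployeeCode(code[0:2], code[2:4], code[4:7])


parsed = parse_employee_code("0203004")
print(parsed)
```

Every program that reads the code this way now depends on the layout, which is what makes the layout so hard to change later.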

Problems:

In fact, meaningful codes cause many problems:

Problem 1: Wasted code space. With a pure serial number, 1 to 10000 can represent 10,000 records, but with a meaningful code, as few as 1,000 records may exhaust a five-digit code. I have seen a real case: the first digit of our company's application number was meaningful and represented the channel the application belonged to, followed by a nine-digit number. In theory these numbers could never run out. Initially three channels used the leading digits 1, 4, and 7. Then a new insurance law invalidated the existing policy forms, so three more digits, 2, 5, and 8, were put into use. Later the company changed its name, and the three channels scrapped their policy forms again and moved to new leading digits. Within just a few years all the application numbers were used up, even though fewer than 1 million had actually been printed.
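The arithmetic behind this waste can be worked through with a short sketch. The figures come from the examples in the text; counting only the six leading digits that are explicitly named, and treating 1 million as an upper bound on issued numbers, are simplifying assumptions.

```python
# Employee code 0203004: 2 branch digits + 2 department digits + 3 serial digits.
plain_serial_capacity = 10 ** 7  # seven digits used as one plain serial number
per_department_capacity = 10 ** 3  # only three digits count within a department

# Application number: 1 channel digit followed by a 9-digit number.
numbers_per_leading_digit = 10 ** 9
retired_leading_digits = ["1", "4", "7", "2", "5", "8"]  # named in the text
numbers_consumed = len(retired_leading_digits) * numbers_per_leading_digit
issued_upper_bound = 1_000_000  # "fewer than 1 million" were printed

utilization = issued_upper_bound / numbers_consumed
print(f"per department: {per_department_capacity} of {plain_serial_capacity}")
print(f"application numbers actually used: under {utilization:.4%}")
```

Retiring a leading digit discards a billion numbers at once, no matter how few of them were ever issued.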

Problem 2: The code is not necessarily unique, which makes it a poor primary key. Consider ID card numbers. In the original 15-digit format, the last three digits are a serial number, with odd numbers for men and even numbers for women. That means no more than 500 men and 500 women born on the same day in the same area can be given distinct numbers; beyond that, numbers repeat. The probability is low, but the problem does exist.
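The limit of 500 follows directly from the parity rule, as this sketch shows. It assumes every three-digit value from 000 to 999 may be issued; the function name is hypothetical.

```python
def next_serial(issued, male):
    """Return the lowest unused three-digit serial with the right parity."""
    start = 1 if male else 0  # odd serials for men, even serials for women
    for serial in range(start, 1000, 2):
        if serial not in issued:
            issued.add(serial)
            return serial
    raise OverflowError("no serial left for this area, birth date, and sex")


# Serials already issued for one area and one birth date.
issued = set()
male_serials = [next_serial(issued, male=True) for _ in range(500)]

try:
    next_serial(issued, male=True)  # the 501st man born that day
    exhausted = False
except OverflowError:
    exhausted = True

print(len(male_serials), exhausted)
```

Once the serials run out, the only choices are to reuse a number or to break the format, and either one defeats the code as a unique key.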

Problem 3: The meaning encoded is not necessarily accurate. For example, an employee number built from a meaningful code may encode department, position, and so on. But if the employee transfers to another department or changes rank, should the code change? If it changes, all historical data must be updated, which is a large amount of work. If it does not, the encoded meaning is no longer accurate, and the code can no longer be trusted to describe the employee.
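The cost of changing the code can be seen in a small sketch. The Attendance table, the new code 0205001, and the sample rows are hypothetical; the point is only that every table referencing the old code must be rewritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Code TEXT PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Attendance (EmployeeCode TEXT, Day TEXT)")
conn.execute("INSERT INTO Employee VALUES ('0203004', 'Alice')")
conn.executemany(
    "INSERT INTO Attendance VALUES ('0203004', ?)",
    [(f"2024-01-{day:02d}",) for day in range(1, 31)],
)
conn.commit()


def transfer(conn, old_code, new_code):
    """Re-code an employee and return how many rows had to be rewritten."""
    with conn:  # keep the employee row and its history consistent
        rewritten = conn.execute(
            "UPDATE Employee SET Code = ? WHERE Code = ?", (new_code, old_code)
        ).rowcount
        rewritten += conn.execute(
            "UPDATE Attendance SET EmployeeCode = ? WHERE EmployeeCode = ?",
            (new_code, old_code),
        ).rowcount
    return rewritten


# Moving from department 03 to department 05 changes the code itself.
rows_rewritten = transfer(conn, "0203004", "0205001")
print(rows_rewritten)
```

A real system would have many such history tables, and each one adds another UPDATE to every transfer.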

Solution:

Meaningful codes are therefore strongly discouraged; use a pure serial number instead. A serial number may need a wide numeric type, and even that may not be enough for very large volumes of records. In that case, use a GUID: duplicates are practically impossible, and the supply of available codes is effectively unlimited.
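Both recommended options can be sketched with Python's standard library. The table layouts and sample values are assumptions; branch and department are stored as ordinary columns so that they can change without touching the ID.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Option 1: a pure serial number assigned by the database.
# SQLite integers are 64-bit, so the range goes up to 2**63 - 1.
conn.execute("""
    CREATE TABLE Employee (
        Id         INTEGER PRIMARY KEY AUTOINCREMENT,
        Name       TEXT NOT NULL,
        Branch     TEXT,  -- meaning lives in columns, not in the ID
        Department TEXT
    )
""")
cur = conn.execute(
    "INSERT INTO Employee (Name, Branch, Department) VALUES (?, ?, ?)",
    ("Alice", "02", "03"),
)
serial_id = cur.lastrowid

# A transfer is now a one-column update; the ID and all history stay as they are.
conn.execute("UPDATE Employee SET Department = '05' WHERE Id = ?", (serial_id,))

# Option 2: a random GUID generated by the application.
conn.execute("CREATE TABLE Application (Id TEXT PRIMARY KEY, Channel TEXT)")
guid = str(uuid.uuid4())
conn.execute("INSERT INTO Application VALUES (?, ?)", (guid, "online"))

print(serial_id, guid)
```

The serial number is compact and ordered, while the GUID can be generated anywhere without coordination; which trade-off matters more depends on the system.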

That concludes this look at the misunderstandings in database design. Hopefully it has answered your questions. Theory is best learned together with practice, so go and try these ideas out.
