What are the methods of database regularization and design 07/11 Update SLTechnology News&Howtos

What are the methods of database regularization and design

2025-07-11 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article introduces the relevant knowledge of "what are the methods of database regularization and design". In the operation of actual cases, many people will encounter such a dilemma. Next, let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

In the design of dynamic website, the importance of database design is self-evident. If it is not designed properly, it will be very difficult to query, and the performance of the program will be affected. Whether you are using mySQL or Oracle databases, formalized table design can make your PHP code more readable and extensible, thus improving the performance of your application.

To put it simply, regularization is to eliminate redundancy and uncoordinated dependencies in table design. In this article, I will go through five gradual processes to tell you the regularization techniques you should know in your design. In order to establish a feasible and efficient database. This article will also analyze the types of relationships that can be used in detail.

Let's assume that we want to create a table of user information that stores the user's name, company, company address, and some personal favorites or url. At the beginning, you may define a table structure as follows:

Zero state form

Users

Name company company_address url1 url2

Joe ABC 1 Work Lane abc.com xyz.com

Jill XYZ 1 Job Street abc.com xyz.com

Since there is no normalization, we call this form of table a zero-state table. Notice the url1 and url2 fields-what if we need a third url in the application? In this way, you have to add one more column to the table, which is obviously not a good idea. If you want to create an extensible system, you should consider using the first formalized form and apply it to the table.

First-level normalized form

1. Eliminate duplicate groups in each table

2. Create a separate table for each set of related data

3. Use a primary key to identify each set of related data

The above table clearly violates the first rule above, so what does the primary key of the third item mean? Quite simply, it simply adds a unique, automatically incremented integer value to each record. With this value, you can distinguish between two records with the same name. By applying the first-level normalization form, we get the following table:

Users

UserId name company company_address url

1 Joe ABC 1 Work Lane abc.com

1 Joe ABC 1 Work Lane xyz.com

2 Jill XYZ 1 Job Street abc.com

2 Jill XYZ 1 Job Street xyz.com

Now our table can be said to be in the form of the first level of normalization, it has solved the limitation of the url field, but this processing brings a new problem. Every time we insert a record into the user table, we have to repeat all the company and user data. This not only makes the database larger than before, but also easy to make mistakes. Therefore, it has to go through the second level of regularization.

Second-level normalized form

1. Create a separate table for fields applied to multiple records

2. Associate the values of these tables through a foreign key

We put the values of url in a separate table so that we can add more data later without having to worry about duplicating values. We also associate these fields with primary key values:

Users

UserId name company company_address

1 Joe ABC 1 Work Lane

2 Jill XYZ 1 Job Street

Urls

UrlId relUserId url

1 1 abc.com

2 1 xyz.com

3 2 abc.com

4 2 xyz.com

As shown above, we created a separate table, and the primary key userid in the users table is now associated with the foreign key relUserId in the url table. The situation seems to have improved significantly. But what if we want to add an employee record for ABC? Or more, 200? In this way, we have to reuse the company name and address, which is obviously not redundant. So we will apply the third-level normalization method:

The third-level normalized form

1. Eliminate fields that do not depend on the key

The company name and address have nothing to do with User Id, so their applications have their own company Id:

Users

UserId name relCompId

1 Joe 1

2 Jill 2

Companies

CompId company company_address

1 ABC 1 Work Lane

2 XYZ 1 Job Street

Urls

UrlId relUserId url

1 1 abc.com

2 1 xyz.com

3 2 abc.com

4 2 xyz.com

In this way, we associate the primary key comId in the companies table with the foreign key named relCompId in the users table, and even if we join 200 employees for ABC, there is only one record in companies. Our users and urls tables can be expanded without having to worry about inserting unnecessary data. Most developers believe that three steps of regularization is enough, and that the database is designed to easily handle the burden of the entire enterprise, which is true in most cases.

We can take a look at the fields in url-have you noticed the redundancy of the data? If the HTML page that inputs this url data to the user is a text box that can be entered at will, it is not a problem for two users to enter the same favorites, but if it is through a drop-down menu, let the user choose only two url inputs, or a little more. In this case, our database can also be optimized at the next level-the fourth step, which is ignored for most developers because it relies on a very special relationship-a many-to-many relationship that we have never encountered in our application.

Data relation

Before defining the form of the fourth normalization, I'd like to mention three basic data relationships: one-to-one, one-to-many, and many-to-many. Let's look back at the first normalized users table. If we put the fields of url in a separate table and insert one record at a time in the users table, we insert a row in the urls table. We will get an one-to-one relationship: each row in the user table will find the corresponding row in the urls table. For our application, this is neither practical nor standard.

Then look at the second example of regularization. For each user record, our table allows multiple urls records to be associated with it. This is an one-to-many relationship, which is a very common relationship.

For many-to-many relationships, it's a little complicated. In our third formalized example, one of our users is associated with a lot of url, and we want to change this structure to allow multiple users to be associated with multiple urls, so we can get a many-to-many structure. Before we discuss it, let's take a look at how the table structure will change.

Users

UserId name relCompId

1 Joe 1

2 Jill 2

Companies

CompId company company_address

1 ABC 1 Work Lane

2 XYZ 1 Job Street

Urls

UrlId url

1 abc.com

2 xyz.com

Url_relations

RelationId relatedUrlId relatedUserId

1 1 1

2 1 2

3 2 1

4 2 2

In order to further reduce the redundancy of data, we use the fourth-level normalization form. We created a rather strange url_relations table with fields that were either primary keys or foreign key. With this table, we can eliminate duplicate items in the urls table. The following are the specific requirements of the fourth formalized form:

The fourth formalized form

1. In a many-to-many relationship, independent entities cannot be stored in the same table

Because it only applies to many-to-many relationships, most developers can ignore this rule. In some cases, however, it is very useful, as in this example, we improved the urls table by separating the same entities and moving the relationships to their own tables.

To make it easier for you to understand, let's give a concrete example. Here is a SQL statement to select all the urls that belongs to joe:

SELECT name, url FROM users, urls, url_relations WHERE url_relations.relatedUserId = 1 AND users.userId = 1 AND urls.urlId = url_relations.relatedUrlId

If we want to traverse everyone's personal information and url information, we can do this:

SELECT name, url FROM users, urls, url_relations WHERE users.userId = url_relations.relatedUserId AND urls.urlId = url_relations.relatedUrlId

The fifth level normalized form

There is also a form of first-level regularization, which is uncommon, somewhat esoteric, and unnecessary in most cases. Its principles are:

1. The original table must be reconstructed from the table separated by it.

The advantage of using this rule is that you can ensure that no extra columns are introduced into separate tables, and that all the table structures you create are as large as they actually need. It's a good habit to apply this rule, but unless you're dealing with a very large piece of data, you won't need it.

This is the end of the content of "what are the methods of database regularization and design". Thank you for your reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.