Source: SLTechnology News&Howtos (Shulou), Development. Updated 2025-01-18.
This article introduces database design techniques for Java development. It is meant as a practical reference, and the points below are covered one at a time.
1. Relationship between source documents and entities
The relationship can be one-to-one, one-to-many, or many-to-many. In general it is one-to-one: one source document corresponds to exactly one entity. In special cases it is one-to-many or many-to-one: one source document corresponds to several entities, or several source documents correspond to one entity.
An entity here can be understood as a basic table. Clarifying this correspondence is a great help when designing the data entry screens.
[Example 1]: In a human resources information system, an employee resume corresponds to three basic tables: the employee basic information table, the social relations table, and the work history table. This is a typical case of one source document corresponding to multiple entities.
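Example 1 can be sketched with Python's built-in sqlite3 module. The table and column names below are illustrative assumptions; the article names only the three tables. The point of the sketch is that one entry screen (the resume) writes to three basic tables in a single transaction.

```python
import sqlite3

# Hypothetical schema: the article names the three tables but not their columns.
SCHEMA = """
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE social_relation (
    rel_id   INTEGER PRIMARY KEY,
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    relative TEXT NOT NULL,
    relation TEXT NOT NULL
);
CREATE TABLE work_history (
    job_id   INTEGER PRIMARY KEY,
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    employer TEXT NOT NULL,
    position TEXT NOT NULL
);
"""

def save_resume(conn, resume):
    """Store one resume document as rows in three basic tables, atomically."""
    with conn:  # commits on success, rolls back if any insert fails
        cur = conn.execute(
            "INSERT INTO employee (name) VALUES (?)", (resume["name"],)
        )
        emp_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO social_relation (emp_id, relative, relation) VALUES (?, ?, ?)",
            [(emp_id, r["name"], r["relation"]) for r in resume["relations"]],
        )
        conn.executemany(
            "INSERT INTO work_history (emp_id, employer, position) VALUES (?, ?, ?)",
            [(emp_id, j["employer"], j["position"]) for j in resume["jobs"]],
        )
    return emp_id

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

resume = {
    "name": "Li Lei",
    "relations": [
        {"name": "Li Ming", "relation": "father"},
        {"name": "Han Mei", "relation": "spouse"},
    ],
    "jobs": [{"employer": "Acme Ltd", "position": "engineer"}],
}
emp_id = save_resume(conn, resume)
```

Knowing the document-to-table mapping up front is what lets the entry screen be designed as one form backed by one function like `save_resume`.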
2. Primary keys and foreign keys
Generally speaking, an entity cannot lack both a primary key and a foreign key. In an E-R diagram, a leaf entity may or may not define a primary key (because it has no children), but it must have a foreign key (because it has a parent).
The design of primary and foreign keys plays an important role in global database design. On completing a global database design, an American database design expert said: "Keys are everywhere; there is nothing but keys." This sums up his experience in database design and reflects his highly abstract view of the core of an information system (the data model).
The reason is that a primary key is a high-level abstraction of an entity, and the pairing of a primary key with a foreign key represents the connection between entities.
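A minimal sketch of the primary key / foreign key pairing, using sqlite3. The `department` and `staff` tables are hypothetical. `staff` plays the role of a leaf entity: it has a parent, so it carries a foreign key. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

conn.execute("INSERT INTO department (dept_id, name) VALUES (1, 'Finance')")
conn.execute("INSERT INTO staff (name, dept_id) VALUES ('Li', 1)")

def insert_orphan(conn):
    """Try to insert a staff row whose parent department does not exist."""
    try:
        conn.execute("INSERT INTO staff (name, dept_id) VALUES ('Wang', 99)")
        return True
    except sqlite3.IntegrityError:
        return False  # the foreign key rejected the row

orphan_accepted = insert_orphan(conn)

# The PK/FK pair is also what the join condition is built from.
joined = conn.execute(
    "SELECT s.name, d.name FROM staff s "
    "JOIN department d ON d.dept_id = s.dept_id"
).fetchall()
```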
3. Properties of basic tables
A basic table differs from an intermediate table or a temporary table in four characteristics:
Atomicity. The fields in a basic table cannot be decomposed further.
Primitiveness. The records in a basic table are records of original (base) data.
Derivability. All output data can be derived from the data in the basic tables and code tables.
Stability. The structure of a basic table is relatively stable, and its records are kept for a long time.
Once these properties are understood, basic tables can be told apart from intermediate and temporary tables during database design.
4. Normal form standards
The relationship between a basic table and its fields should satisfy third normal form (3NF) as far as possible. However, a design that satisfies 3NF is often not the best design. To improve the operating efficiency of the database, it is often necessary to relax the normal form: add a suitable amount of redundancy and trade space for time.
[Example 2]: Consider a basic table that stores goods, as shown in Table 1. The presence of the "amount" field means the table does not satisfy 3NF, because "amount" can be obtained by multiplying "unit price" by "quantity", which makes it a redundant field. However, adding the redundant "amount" field speeds up queries and statistics. This is trading space for time.
Rose 2002 distinguishes two types of columns: data columns and calculated columns. A column such as "amount" is a calculated column, while columns such as "unit price" and "quantity" are data columns.
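Example 2 can be sketched as follows. The `goods` table layout is an assumption (the article's Table 1 is not reproduced here), and prices are stored as integer cents to keep the arithmetic exact. The redundant `amount` column is filled in at write time so that reports can sum it without recomputing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE goods (
    goods_id   INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price INTEGER NOT NULL,  -- data column, in cents
    quantity   INTEGER NOT NULL,  -- data column
    amount     INTEGER NOT NULL   -- calculated column: unit_price * quantity
)""")

def add_goods(conn, name, unit_price, quantity):
    """Insert a row and fill the derived column at write time."""
    conn.execute(
        "INSERT INTO goods (name, unit_price, quantity, amount) VALUES (?, ?, ?, ?)",
        (name, unit_price, quantity, unit_price * quantity),
    )

add_goods(conn, "pen", 150, 4)
add_goods(conn, "notebook", 500, 3)

# With the redundant column, the report reads a stored value.
total_stored = conn.execute("SELECT SUM(amount) FROM goods").fetchone()[0]
# Without it, every report has to recompute the product for every row.
total_derived = conn.execute(
    "SELECT SUM(unit_price * quantity) FROM goods"
).fetchone()[0]
```

Both totals agree as long as every writer goes through a routine like `add_goods`; keeping the derived column consistent is the cost of the extra speed.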
5. A plain-language understanding of the three normal forms
A plain-language understanding of the three normal forms is a great help in database design. To apply them well, it is enough to understand them informally (an informal understanding is sufficient here; it is not the most rigorous or precise one):
First normal form: 1NF is a constraint on the atomicity of attributes. It requires that attributes be atomic and not decomposable.
Second normal form: 2NF is a uniqueness constraint on records. It requires that each record have a unique identifier, that is, entity uniqueness.
Third normal form: 3NF is a constraint on field redundancy. It requires that no field can be derived from other fields, so that fields carry no redundancy.
A database design without redundancy can satisfy all of this. However, a database without redundancy is not necessarily the best database. Sometimes, to improve operating efficiency, it is necessary to relax the normal form and keep some redundant data.
The specific approach is to follow 3NF when designing the conceptual data model and to consider relaxing the normal form when designing the physical data model. Relaxing the normal form means adding fields and allowing redundancy.
6. Identify and correctly handle many-to-many relationships
If there is a many-to-many relationship between two entities, it should be eliminated. The way to eliminate it is to add a third entity between the two, so that the original many-to-many relationship becomes two one-to-many relationships. The attributes of the original two entities should then be distributed sensibly among the three entities.
The third entity is in essence a more complex relationship, and it corresponds to a basic table. Generally speaking, database design tools cannot identify many-to-many relationships, but they can handle them.
[Example 3]: In a library information system, "book" is an entity and "reader" is also an entity. The relationship between them is a typical many-to-many relationship: a book can be borrowed by several readers at different times, and a reader can borrow several books. A third entity, "borrow/return", should therefore be added between the two. Its attributes are the borrow/return time and the borrow/return flag (0 for borrowing, 1 for returning). It should also have two foreign keys (the primary key of "book" and the primary key of "reader") so that it can connect to both.
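Example 3 can be sketched as follows. Table and column names, and the sample rows, are illustrative assumptions; the flag values (0 for borrow, 1 for return) and the two foreign keys come from the example itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- The third entity turns one many-to-many into two one-to-many relationships.
CREATE TABLE borrow_return (
    event_id   INTEGER PRIMARY KEY,
    book_id    INTEGER NOT NULL REFERENCES book(book_id),
    reader_id  INTEGER NOT NULL REFERENCES reader(reader_id),
    event_time TEXT    NOT NULL,
    flag       INTEGER NOT NULL CHECK (flag IN (0, 1))  -- 0 = borrow, 1 = return
);

INSERT INTO book   VALUES (1, 'Database Design'), (2, 'Java Basics');
INSERT INTO reader VALUES (1, 'Zhang'), (2, 'Chen');
INSERT INTO borrow_return (book_id, reader_id, event_time, flag) VALUES
    (1, 1, '2024-01-05', 0),
    (1, 1, '2024-01-20', 1),
    (1, 2, '2024-02-01', 0),
    (2, 1, '2024-02-03', 0);
""")

# One book, many readers.
readers_of_book_1 = [row[0] for row in conn.execute(
    "SELECT DISTINCT r.name FROM borrow_return br "
    "JOIN reader r ON r.reader_id = br.reader_id "
    "WHERE br.book_id = 1 ORDER BY r.name"
)]

# One reader, many books.
books_of_reader_1 = [row[0] for row in conn.execute(
    "SELECT DISTINCT b.title FROM borrow_return br "
    "JOIN book b ON b.book_id = br.book_id "
    "WHERE br.reader_id = 1 ORDER BY b.title"
)]
```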
7. How to assign primary key (PK) values
A PK is the tool programmers use to join tables. It can be a numeric string with no physical meaning, generated by the program by adding 1 automatically. It can also be a field, or a combination of fields, that has physical meaning. The former is better than the latter. When the PK is a combination of fields, keep the number of fields small: a PK with many fields not only takes up more space but is also slow.
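The two options can be compared in a short sqlite3 sketch. The order tables and their columns are hypothetical. Option A uses a meaningless auto-incremented integer; option B uses a three-field composite key built from business fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option A: surrogate key with no physical meaning, assigned by adding 1
CREATE TABLE order_a (
    order_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_code TEXT    NOT NULL,
    order_date    TEXT    NOT NULL,
    seq_no        INTEGER NOT NULL
);
-- Option B: composite key made of fields that carry business meaning
CREATE TABLE order_b (
    customer_code TEXT    NOT NULL,
    order_date    TEXT    NOT NULL,
    seq_no        INTEGER NOT NULL,
    PRIMARY KEY (customer_code, order_date, seq_no)
);
""")

insert_a = "INSERT INTO order_a (customer_code, order_date, seq_no) VALUES (?, ?, ?)"
first_id = conn.execute(insert_a, ("C001", "2024-03-01", 1)).lastrowid
second_id = conn.execute(insert_a, ("C001", "2024-03-01", 2)).lastrowid

def pk_column_count(conn, table):
    """Count the columns that make up a table's primary key.

    PRAGMA table_info reports, in its sixth field, each column's
    1-based position within the primary key, or 0 if it is not part of it.
    """
    return sum(1 for row in conn.execute(f"PRAGMA table_info({table})") if row[5] > 0)

pk_cols_a = pk_column_count(conn, "order_a")
pk_cols_b = pk_column_count(conn, "order_b")
```

Any child table that references `order_b` has to repeat all three key columns, and the key index stores all three values per row, which is the space and speed cost the section describes.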
8. A correct understanding of data redundancy
The repetition of primary and foreign keys across tables is not data redundancy. This concept must be clear, and many people are still unaware of it. The repetition of non-key fields is data redundancy, and it is a low-level kind: repetitive redundancy. High-level redundancy is not the repetition of fields but the derivation of fields.
[Example 4]: Among the three fields "unit price", "quantity" and "amount" of a commodity, "amount" is derived by multiplying "unit price" by "quantity". It is redundant, and it is a high-level redundancy. The purpose of redundancy is to improve processing speed. Only low-level redundancy increases data inconsistency, because the same data may be entered several times at different times, in different places, and by different roles. We therefore advocate high-level (derived) redundancy and oppose low-level (repetitive) redundancy.
9. There is no standard answer for an E-R diagram
The E-R diagram of an information system has no standard answer, because there is more than one way to design and draw it. As long as it covers the business scope and functional content of the system requirements, it is workable. Otherwise, the E-R diagram must be revised.
Having no single standard answer does not mean the diagram can be designed arbitrarily. The criteria for a good E-R diagram are: clear structure, concise associations, a moderate number of entities, sensible distribution of attributes, and no low-level redundancy.
10. Views are very useful in database design
Unlike basic tables, code tables, and intermediate tables, a view is a virtual table that depends on the real tables of its data source. A view is a window through which programmers use the database, a way of combining base table data, a method of data processing, and a means of keeping user data confidential.
To support complex processing while keeping operations fast and saving storage space, the definition depth of views should generally not exceed three layers. If three layers of views are still not enough, define a temporary table on top of the views and then define further views on the temporary table. By repeating this layering, the depth of views is effectively unlimited.
For information systems that involve national political, economic, technological, military, or security interests, views matter even more. Once the physical design of the basic tables of such a system is complete, a first layer of views is built on the basic tables immediately. The number and structure of these views are exactly the same as those of the basic tables, and the rule is that all programmers may operate only on the views.
Only the database administrator, using a "security key" held jointly by several people, may operate directly on the basic tables. Readers are invited to consider why.
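A minimal sketch of layered views, using sqlite3 with hypothetical account tables. The first-layer view mirrors the base table column for column; a second-layer view is then defined on the first. SQLite has no user permissions, so the rule that programmers touch only the views is a convention here; in a server DBMS it would normally be enforced by granting privileges on the views and not on the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Base table: in the scheme described above, only the DBA works here.
CREATE TABLE t_account (
    account_id INTEGER PRIMARY KEY,
    owner      TEXT    NOT NULL,
    balance    INTEGER NOT NULL
);

-- Layer 1: a view with exactly the same structure as the base table.
CREATE VIEW v_account AS
    SELECT account_id, owner, balance FROM t_account;

-- Layer 2: a view defined on the layer-1 view, not on the base table.
CREATE VIEW v_account_summary AS
    SELECT owner, SUM(balance) AS total_balance
    FROM v_account
    GROUP BY owner;

INSERT INTO t_account (owner, balance) VALUES
    ('Zhao', 100), ('Zhao', 250), ('Qian', 40);
""")

# Application code queries views only.
summary = conn.execute(
    "SELECT owner, total_balance FROM v_account_summary ORDER BY owner"
).fetchall()
```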
11. Intermediate tables, report tables, and temporary tables
An intermediate table stores statistical data. It is designed for data warehouses, output reports, or query results, and sometimes it has no primary or foreign keys (data warehouses excepted). A temporary table is designed by a programmer to hold temporary records for that programmer's own use. Basic tables and intermediate tables are maintained by the DBA; temporary tables are maintained automatically by the programmer's own code.
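The difference between the table kinds can be sketched as follows. All table names and data are assumptions. The intermediate table holds precomputed statistics for a report and has no keys; the temporary table is created with `CREATE TEMP TABLE`, which in SQLite is visible only to the creating connection and is dropped when that connection closes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Basic table: original data
CREATE TABLE sale (
    sale_id INTEGER PRIMARY KEY,
    region  TEXT    NOT NULL,
    amount  INTEGER NOT NULL
);
INSERT INTO sale (region, amount) VALUES
    ('north', 100), ('north', 50), ('south', 70);

-- Intermediate table: statistics for a report, no primary or foreign key
CREATE TABLE rpt_sales_by_region (region TEXT, total_amount INTEGER);
INSERT INTO rpt_sales_by_region
    SELECT region, SUM(amount) FROM sale GROUP BY region;

-- Temporary table: a programmer's private working set
CREATE TEMP TABLE tmp_big_sale AS
    SELECT sale_id, amount FROM sale WHERE amount >= 70;
""")

report = conn.execute(
    "SELECT region, total_amount FROM rpt_sales_by_region ORDER BY region"
).fetchall()
temp_count = conn.execute("SELECT COUNT(*) FROM tmp_big_sale").fetchone()[0]
```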
12. Integrity constraints take three forms
Domain integrity: implemented with Check constraints. In a database design tool, the dialog for defining a field's value range has a Check button, which is used to define the field's value domain.
Referential integrity: implemented with PKs, FKs, and table-level triggers.
User-defined integrity: business rules, implemented with stored procedures and triggers.
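All three forms can be shown in one sqlite3 sketch. The tables and the 100-unit purchase limit are hypothetical. SQLite has no stored procedures, so the business rule is implemented with a trigger only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    -- referential integrity: PK/FK pair
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    -- domain integrity: CHECK restricts the value domain
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
-- user-defined integrity: a hypothetical business rule in a trigger
CREATE TRIGGER trg_purchase_limit
BEFORE INSERT ON purchase
WHEN NEW.quantity > 100
BEGIN
    SELECT RAISE(ABORT, 'quantity exceeds purchase limit');
END;

INSERT INTO customer (customer_id, name) VALUES (1, 'Sun');
""")

def try_insert(conn, customer_id, quantity):
    """Return True if the row is accepted, False if any constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO purchase (customer_id, quantity) VALUES (?, ?)",
                (customer_id, quantity),
            )
        return True
    except sqlite3.DatabaseError:
        return False

accepted = try_insert(conn, 1, 5)           # valid row
bad_domain = try_insert(conn, 1, 0)         # violates the CHECK
bad_reference = try_insert(conn, 99, 5)     # no such customer
bad_rule = try_insert(conn, 1, 500)         # violates the trigger rule
```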
13. The "three fewer" principles for avoiding patchwork database design
1. The fewer tables in a database, the better. Only a small number of tables shows that the system's E-R diagram is lean and refined, that repeated and superfluous entities have been removed, that the objective world has been abstracted to a high level, and that the data has been integrated systematically. This prevents patchwork design.
2. The fewer fields in a table's composite primary key, the better. A primary key serves two purposes: it is used to build the primary key index, and it acts as the foreign key of child tables. Fewer fields in the composite primary key therefore save both running time and index storage space.
3. The fewer fields in a table, the better. Only a small number of fields shows that the system has no duplicated data and little data redundancy. More importantly, readers are urged to learn to "turn columns into rows", which prevents fields that belong in a child table from being pulled into the main table and leaving many unused fields there. "Turning columns into rows" means pulling part of the main table's content out into a separate child table. The method is very simple, yet some people are simply not used to it and neither adopt nor apply it.
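"Turning columns into rows" can be sketched as follows. The contact tables and the `phone1` to `phone3` columns are hypothetical. The wide table reserves three phone columns per contact, most of them empty; the refactored design keeps only stable fields in the main table and stores each phone number as one row in a child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Before: repeated columns in the main table, many left empty
CREATE TABLE contact_wide (
    contact_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    phone1 TEXT, phone2 TEXT, phone3 TEXT
);
INSERT INTO contact_wide VALUES
    (1, 'Zhou', '555-0101', '555-0102', NULL),
    (2, 'Wu',   '555-0199', NULL,       NULL);

-- After: a lean main table plus a child table with one row per phone
CREATE TABLE contact (
    contact_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE contact_phone (
    contact_id INTEGER NOT NULL REFERENCES contact(contact_id),
    seq        INTEGER NOT NULL,
    phone      TEXT    NOT NULL,
    PRIMARY KEY (contact_id, seq)
);
""")

def columns_to_rows(conn):
    """Move the phone columns of contact_wide into rows of contact_phone."""
    with conn:
        conn.execute(
            "INSERT INTO contact SELECT contact_id, name FROM contact_wide"
        )
        # Column names come from this fixed tuple, never from user input.
        for seq, column in enumerate(("phone1", "phone2", "phone3"), start=1):
            conn.execute(
                f"INSERT INTO contact_phone (contact_id, seq, phone) "
                f"SELECT contact_id, ?, {column} FROM contact_wide "
                f"WHERE {column} IS NOT NULL",
                (seq,),
            )

columns_to_rows(conn)
phone_rows = conn.execute(
    "SELECT contact_id, seq, phone FROM contact_phone ORDER BY contact_id, seq"
).fetchall()
```

A contact with a fourth phone number now needs one more row, not a schema change.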
The practical principle of database design is to find a suitable balance between data redundancy and processing speed. "Three fewer" is a holistic concept and a comprehensive point of view; no one of the three principles should be applied in isolation.
The principle is relative, not absolute, but a "three more" principle is definitely wrong. Consider two E-R diagrams that cover the same system functions: one with 100 entities (1,000 attributes in total) is certainly much better than one with 200 entities (2,000 attributes in total).
Advocating the "three fewer" principle teaches readers to use database design techniques for systematic data integration. The steps of data integration are: integrate file systems into application databases, integrate application databases into subject databases, and integrate subject databases into a global comprehensive database.
The higher the degree of integration, the stronger the data sharing, the fewer the information silos, and the fewer the entities, primary keys, and attributes in the global E-R diagram of the whole enterprise information system.
The purpose of advocating the "three fewer" principle is to stop readers from patching the database with endless additions, deletions, and modifications, which turns the enterprise database into a "garbage heap" of arbitrarily designed tables, or a "junkyard" of tables. In the end the basic tables, code tables, intermediate tables, and temporary tables are disorganized and too many to count, and the information system of the enterprise or institution becomes unmaintainable and breaks down.
Anyone can follow a "three more" principle; it is the flawed doctrine behind designing databases by patching. The "three fewer" principle is a lean-and-refined principle that demands greater database design skill and craft, and not everyone can achieve it. It is the theoretical basis for putting an end to designing databases by patching.
14. Ways to improve database operating efficiency
Under given hardware and software conditions, the ways to improve the operating efficiency of a database system are as follows:
In the physical design of the database, relax the normal form, add redundancy, use fewer triggers, and use more stored procedures.
When the calculation is very complex and the number of records is very large (for example, 10 million), do the complex calculation outside the database first: process the data in the file system with a C++ program, then load the results into the database tables. This is the experience of telecom billing system design.
If a table has too many records, for example more than 10 million, split it horizontally. To split horizontally, divide the table's records into two tables, using a particular value of the table's primary key (PK) as the boundary. If a table has too many fields, for example more than 80, split it vertically into two tables.
Tune the database management system (DBMS) itself, that is, optimize system parameters such as the number of buffers.
When programming in the data-oriented SQL language, use optimized algorithms wherever possible.
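The horizontal split described in the list above can be sketched at toy scale. The `call_record` tables and the boundary value of 5 are assumptions; a real table would be split at a PK value in the millions. Rows with a PK at or below the boundary go to one table and the rest go to the other, and a small routing function tells the application which table holds a given key.

```python
import sqlite3

BOUNDARY = 5  # hypothetical PK boundary for the split

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE call_record      (record_id INTEGER PRIMARY KEY, duration INTEGER NOT NULL);
CREATE TABLE call_record_low  (record_id INTEGER PRIMARY KEY, duration INTEGER NOT NULL);
CREATE TABLE call_record_high (record_id INTEGER PRIMARY KEY, duration INTEGER NOT NULL);
""")
conn.executemany(
    "INSERT INTO call_record VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 11)],
)

def split_horizontally(conn, boundary):
    """Copy rows into two tables, divided at a primary key value."""
    with conn:
        conn.execute(
            "INSERT INTO call_record_low "
            "SELECT * FROM call_record WHERE record_id <= ?", (boundary,)
        )
        conn.execute(
            "INSERT INTO call_record_high "
            "SELECT * FROM call_record WHERE record_id > ?", (boundary,)
        )

def table_for(record_id, boundary=BOUNDARY):
    """Return the name of the table that holds the given primary key."""
    return "call_record_low" if record_id <= boundary else "call_record_high"

split_horizontally(conn, BOUNDARY)
low_count = conn.execute("SELECT COUNT(*) FROM call_record_low").fetchone()[0]
high_count = conn.execute("SELECT COUNT(*) FROM call_record_high").fetchone()[0]
```

A vertical split follows the same idea along the other axis: both resulting tables keep the PK, and the remaining columns are divided between them.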