What are the basic theories of database? 07/02 Update SLTechnology News&Howtos

What are the basic theories of database?

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

Today, I will talk to you about the basic theories of the database, which may not be well understood by many people. in order to make you understand better, the editor has summarized the following contents for you. I hope you can get something according to this article.

1. Database paradigm

The first paradigm: column can not be separated, eg: [contact] (name, gender, phone), a contact has a home phone and a company phone, so this table structure design does not reach 1NF

The second paradigm: there is a primary key to ensure complete dependence. Eg: order schedule [OrderDetail] (OrderID,ProductID,UnitPrice,Discount,Quantity,ProductName), Discount (discount), Quantity (quantity) completely depends on the primary key (OderID,ProductID), while UnitPrice,ProductName only depends on ProductID and does not comply with 2NF

The third paradigm: no transitive dependency (non-primary key column A depends on non-primary key column B, non-primary key column B depends on primary key), eg: order table [Order] (OrderID,OrderDate,CustomerID,CustomerName,CustomerAddr,CustomerCity) the primary key is (OrderID), CustomerName,CustomerAddr,CustomerCity directly depends on CustomerID (non-primary key column), rather than directly depends on the primary key, it depends on the primary key through transmission, so it does not conform to 3NF.

2. What is an anti-pattern

Paradigm can avoid data redundancy, reduce the space of database, and reduce the trouble of maintaining data integrity.

However, through the database paradigm design, there will be more tables involved in the database business, and it may be necessary to query the business tables involved in multi-table join, which will lead to poor performance and is not conducive to sub-database and sub-tables. Therefore, for the sake of performance priority, it may be necessary to use anti-pattern design in the structure of the database, that is, space for time, and adopt the way of data redundancy to avoid associated queries between tables. As for the problem of data consistency, because it is difficult to meet the strong consistency of data, in general, make the stored data as consistent as possible to ensure that the system achieves consistency after a short period of self-recovery and correction.

You need to be cautious about using anti-patterns to design databases. In general, use stylized database design as much as possible, because stylized database design makes the product more flexible and maintains data integrity at the database layer.

Sometimes, the best way to improve performance is to keep redundant data in the same table, and if you can allow a small amount of dirty data, it is a very good way to create a completely separate summary or cache table. For example, designing a "download times table" to cache the download times information can improve the speed of querying the total number of information in the case of massive data.

In another typical scenario, columns of BLOB and TEXT types may be used to store JSON structure data for extensibility reasons. The advantage is that new properties can be added to this field at any time without the need to change the table structure. However, the disadvantage of this design is also obvious, that is, the whole field content needs to be decoded to obtain the specified properties, and can not be indexed, sorted, aggregated and other operations. Therefore, if you need to consider more complex usage scenarios, it is more recommended to use a document database such as MongoDB.

3. Database transaction

Transaction is not only an indivisible sequence of database operations, but also the basic unit of database concurrency control. The result of its execution must change the database from one consistency state to another.

(1)。 Characteristics of the transaction

Atomicity (Atomicity): a series of database operations contained in a transaction are either performed successfully or rolled back

Consistency (Consistency): the execution result of a transaction must move the database from one consistency state to another

Isolation: transactions executed concurrently cannot influence each other

Durability: once a transaction is committed, changes to the data in the database are permanent.

(2)。 Problems caused by transaction concurrency

Dirty read: one transaction reads uncommitted data from another transaction

Non-repeatable reading: the focus of non-repeatable reading is modification. Under the same conditions, the results of the two reads are different, that is, the data to be read can be modified by other transactions.

Phantom reading: the focus of phantom reading is to add or delete, under the same conditions, the number of records read out is not the same.

(3)。 Isolation level

The isolation level determines how transactions in one session may affect transactions in another session.

The ANSI standard defines four isolation levels, all of which are supported by MySQL's InnoDB, which are:

READ UNCOMMITTED (uncommitted reads): the lowest level of isolation, commonly known as dirty read, allows one transaction to read data that another transaction does not have commit, which may improve performance, but can lead to dirty read problems.

READ COMMITTED (commit read): only records that are already commit of other transactions are allowed to be visible in one transaction, and this isolation level cannot avoid the problem of unrepeatable reads.

REPEATABLE READ (readable): after the start of a transaction, database modifications made by other transactions are not visible in this transaction until this transaction commit or rollback. However, the insert/delete operations of other transactions are visible to that transaction, that is, the isolation level does not avoid phantom reading problems. The result of repeating select in a transaction is the same, except for the update database in this transaction.

SERIALIZABLE (serializable): the highest level of isolation, allowing only transactions to be executed serially.

The default isolation level for MySQL is REPEATABLE READ.

4. what is a stored procedure? What are the advantages and disadvantages?

A stored procedure is a collection of SQL statements that are pre-compiled and stored in the database. Further, a stored procedure is a code block made up of T-SQL statements that perform some functions like a method (additions, deletions, modifications and queries for single or multiple tables), and then give the code block a name and call it when you use this function. Stored procedures have the following characteristics:

4.1. stored procedures are compiled only when they are created, and do not need to be recompiled every time they are executed, while SQL statements are usually compiled every time they are executed, so using stored procedures can improve the efficiency of database execution.

4.2.When the SQL statement changes, you can only modify the stored procedures in the database without modifying the code

4.3. reduce network transmission. Calling a stored procedure on the client is of course less than performing a string of SQL transmissions.

4.4. Through the stored procedure, the unauthorized users can access the database indirectly under control, so as to ensure the security of the data.

5. Briefly talk about the difference between drop, delete and truncate

Drop, delete, and truncate in SQL all indicate deletion, but there are some differences among them:

Delete is used to delete all or part of the data rows of the table. After executing delete, the user needs to commmit or rollback to delete or undelete. The delete command will trigger all delete triggers on the table.

Truncate deletes all data from the table. This operation cannot be rolled back and triggers on the table. TRUNCATE is faster and takes up less space than delete.

The Drop command deletes the table from the database, all data rows, indexes, and permissions are deleted, all DML triggers are not triggered, and this command cannot be rolled back.

Therefore, when you no longer need a table, use drop; when you want to delete some rows of data, use delete; to keep the table and delete all data using truncate.

6. What is a view? What is the cursor?

A view is a virtual table, which is usually a subset of rows or columns of one or more tables, which has the same function as the physical table, and can add, delete, modify, check and other operations on the view. In particular, changes to the view do not affect the underlying table. It makes it easier for us to get data than multi-table queries.

Cursors effectively deal with the queried result set as a unit. The cursor can be positioned on a specific row in the cell to retrieve one or more rows from the current row of the result set. You can modify the current row of the result set. Cursors are generally not used, but cursors are very important when you need to process data one by one.

When operating on mysql, we know that the MySQL retrieval operation returns a set of rows called the result set. This set of returned rows are all rows that match the SQL statement (zero or more rows). Using simple SELECT statements, for example, there is no way to get the first, next, or first 10 rows, and there is no simple way to process all rows one at a time (as opposed to processing them in batches). Sometimes you need to move forward or backward one or more lines in the retrieved rows. This is why cursors are used. A cursor is a database query stored on a MySQL server. It is not a SELECT statement, but a result set retrieved by that statement. After the cursor is stored, the application can scroll or browse the data in it as needed. Cursors are mainly used for interactive applications, where users need to scroll through the data on the screen and browse or make changes to the data.

7. What is a trigger?

A trigger is a database object associated with a table that fires when a definition condition is met and executes a collection of statements defined in the trigger. This feature of triggers can help the application on the database side to ensure the integrity of the database.

8. Superkeys, candidate keys, primary keys, foreign keys

Superkey: a set of attributes that uniquely identifies a tuple in a relationship is called a hyperkey in a relational schema. An attribute can be used as a superkey, or multiple attributes can be combined as a superkey. The superkey contains a candidate key and a primary key.

Candidate key: is the minimum superkey, that is, a superkey with no redundant elements.

Primary key: a combination of data columns or attributes in a database table that uniquely and fully identify the stored data object. A data column can have only one primary key, and the value of the primary key cannot be missing, that is, it cannot be null (Null).

Foreign key: the primary key of another table that exists in one table is called the foreign key of this table.

9. What is a transaction? What is a lock?

Transaction: a grouping of SQL statements bound together as a logical unit of work. If any statement operation fails, the entire operation fails, and later the operation is rolled back to the pre-operation state, or there is a node on it. Transactions can be used to ensure that either they are executed or not. To consider grouped statements as transactions, you need to pass ACID tests, that is, atomicity, consistency, isolation, and persistence.

Locks: in all DBMS, locks are the key to implementing transactions, and locks can ensure the integrity and concurrency of transactions. Like locks in real life, it can prevent the owner of some data from using certain data or data structures for a certain period of time. Of course, there are different levels of locks.

10. Database locking mechanism

To put it simply, the database locking mechanism is a kind of rule designed by the database to make all kinds of shared resources accessed concurrently and access orderly in order to ensure the consistency of data. Each storage engine of MySQL uses three types of locking mechanisms: row-level locking, page-level locking, and table-level locking.

Table-level locking (table-level): table-level locking is the largest granularity locking mechanism among MySQL storage engines. The most important feature of the locking mechanism is that the implementation logic is very simple and the negative impact of the system is the least. So the speed of acquiring and releasing locks is very fast. Because table-level locks lock the entire table at once, it is a good way to avoid the deadlock problem that bothers us. Of course, the biggest negative effect of large locking granularity is that the probability of locking resource contention will be the highest, resulting in magnanimity greatly reduced. Table-level locks are divided into read locks and write locks. Page-level locking (page-level): the characteristic of page-level locking is that the lock granularity is between row-level lock and table-level lock, so the resource overhead required to acquire lock and the concurrent processing power that can be provided are also between the above two. In addition, page-level locking is the same as row-level locking, deadlocks occur. Row-level locking (row-level): the biggest feature of row-level locking is that the granularity of locked objects is very small, and it is also the smallest locking granularity implemented by major database management software at present. Because the locking granularity is very small, the probability of locking resource contention is the smallest, which can give applications as much concurrent processing ability as possible and improve the overall performance of some applications that need high concurrency. Although it has great advantages in concurrent processing ability, row-level locking also brings a lot of disadvantages. Because the granularity of locking resources is very small, more needs to be done each time the lock is acquired and released, and the consumption is naturally greater. In addition, row-level locking is also the most prone to deadlocks. InnoDB row-level locks are also divided into two types, shared locks and exclusive locks. Similarly, InnoDB also introduces the concept of intentional lock (table-level lock), so there are intentional shared lock and intentional exclusive lock, so InnoDB actually has four kinds of locks, namely shared lock (S), exclusive lock (X), intention shared lock (IS) and intention exclusive lock (IX).

In MySQL database, table-level locking is mainly used by some non-transactional storage engines such as MyISAM,Memory,CSV, while row-level locking is mainly used by Innodb storage engine and NDBCluster storage engine, and page-level locking is mainly the locking mode of BerkeleyDB storage engine.

The function of intention lock is that when a transaction needs to obtain a resource lock, if the resource it needs has been occupied by an exclusive lock, the transaction can need to add an appropriate intention lock to the table that locks the row. If you need a shared lock, add an intended shared lock on top of the table. If you need to add an exclusive lock on a row (or some rows), add an intention exclusive lock on the table first. Multiple intention sharing locks can exist at the same time, but only one intention exclusive lock can exist at the same time.

After reading the above, do you have any further understanding of the basic theory of database? If you want to know more knowledge or related content, please follow the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.