Database-indexes, triggers, transactions (storage engine) 02/13 Update SLTechnology News&Howtos

Database-indexes, triggers, transactions (storage engine)

2026-02-13 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

one。 Database

Database (DataBase) is a warehouse that organizes, stores and manages data according to the data structure. Its main characteristics are as follows:

Realize data sharing

Data sharing includes that all users can access the data in the database at the same time, including that users can use the database through interfaces in various ways, and provide data sharing.

Reduce the redundancy of data

Compared with the file system, because the database realizes data sharing, it avoids the establishment of application files by users. It reduces a lot of duplicate data, reduces data redundancy, and maintains data consistency.

Independence of data

The independence of data includes logical independence (the logical structure of the database and the application are independent of each other) and physical independence (the change of the physical structure of the data does not affect the logical structure of the data).

Data to achieve centralized control

In the mode of file management, the data is in a decentralized state, and there is no relationship between different users or the same user in different processing. The data can be centrally controlled and managed by the database, and the organization of all kinds of data and the relationship between the data can be represented by the data model.

Data consistency and maintainability to ensure data security and reliability

It mainly includes: ① security control: to prevent data loss, error update and unauthorized use; ② integrity control: to ensure the correctness, validity and compatibility of data; ③ concurrency control: to allow multiple access to data in the same time period, while preventing abnormal interactions between users.

Fault recovery

A set of methods are provided by the database management system, which can find and repair faults in time, so as to prevent data from being destroyed. The database system can recover the failure when the database system is running as soon as possible, which may be a physical or logical error. For example, data errors caused by misoperation of the system, etc.

two。 Indexes

An index is a data structure that sorts the values of one or more columns in a database table, and is a database object used to improve the speed of accessing data in a database table. In fact, an index is equivalent to a book's catalogue. if there is no index, you need to traverse the entire database table to find a specific value in the database, but after you have an index, you can find it in the index, which helps to get information more quickly.

Indexes can be divided into clustered indexes and nonclustered indexes; for clustered indexes: in order according to the physical location where the data is stored, while in non-clustered indexes, the storage order of table data is independent of the index order; only one clustered index can be created on a table, because the physical order of real data can only be one. If a table does not have a clustered index, it is called a "heap". The rows in the table are not in a specific order, and all new rows are added to the end of the table.

The basic information contained in an index record is: key values (values of all fields specified when defining the index) + logical pointers (pointing to data pages or another index page); depending on the functionality of the database, three types of indexes can be created in the database designer:

Unique index

A unique index is an index that does not allow any two rows to have the same index value.

When there are duplicate key values in existing data, most databases do not allow the newly created unique index to be saved with the table. The database may also prevent the addition of new data that will create duplicate key values in the table. For example, if a unique index is created on the employee's last name (lname) in the employee table, no two employees can have the same last name

The statements used are:

CREATE UNIQUE INDEX index name ON table name (column name, if multiple columns are separated by commas)

For the creation of a simple index, simply remove the UNIQUE from the unique index

Primary key index

Database tables often have a combination of one or more columns whose values uniquely identify each row in the table. The column is called the primary key of the table

Defining a primary key for a table in a database diagram automatically creates a primary key index, which is a specific type of unique index. The index requires that each value in the primary key be unique. It also allows quick access to data when primary key indexes are used in queries

Clustered index

In a clustered index, the physical order of the rows in the table is the same as the logical (index) order of the key values. A table can contain only one clustered index; if an index is not a clustered index, the physical order of rows in the table does not match the logical order of key values. Clustered indexes usually provide faster data access than nonclustered indexes.

Although the purpose of indexing is to speed up the lookup or sorting of records in a table, there is a price to pay for setting an index for a table: first, it increases the storage space of the database; second, it takes more time to insert and modify data (because the index changes accordingly). A database index is a catalog of values in certain fields to improve the efficiency of table search; each has its own advantages and disadvantages:

Advantages: creating indexes can greatly improve the performance of the system

By creating a uniqueness index, the uniqueness of each row of data in the database table can be guaranteed.

It can greatly speed up the retrieval of data, which is the main reason for creating an index.

The connection between the meter and the table can be accelerated, especially in achieving the reference integrity of the data.

When using grouping and sorting clauses for data retrieval, the time of grouping and sorting in a query can also be significantly reduced.

Fifth, through the use of index, we can use the optimization hidden device in the process of query to improve the performance of the system.

Disadvantages: adding an index also has many disadvantages

It takes time to create and maintain an index, which increases as the amount of data increases

The index needs to occupy the physical space, in addition to the data table occupies the data space, each index also takes up a certain amount of physical space, if you want to establish a clustered index, then the space required will be more.

When the data in the table is added, deleted and modified, the index should also be maintained dynamically, which reduces the speed of data maintenance.

Therefore, the use and establishment of the index should depend on the situation, for example, it is not necessary to establish an index for those columns with few queries or relatively few data values, because it can not improve the query speed. On the contrary, it will consume a certain amount of space and reduce the maintenance of the system.

three。 Business

A Database Transaction is a series of operations that are performed as a single logical unit of work, either completely or not at all. Transaction processing ensures that data-oriented resources are not permanently updated unless all operations within the transactional unit are completed successfully. By combining a set of related operations into a unit that either succeeds or fails all, you can simplify error recovery and make the application more reliable.

There are the following related properties for transactions:

Atomicity (Atomic) (Atomicity)

Transactions must be atomic units of work; either all or none of their data modifications are performed. In general, the operations associated with a transaction have a common goal and are interdependent; if the system performs only a subset of these operations, it may undermine the overall goal of the transaction, and atomicity eliminates the possibility that the system will process a subset of operations

Consistency (Consistent) (Consistency)

All data must be kept in a consistent state when the transaction is completed. In the relevant database, all rules must be applied to the modification of the transaction to maintain the integrity of all data. At the end of the transaction, all internal data structures (such as B-tree indexes or two-way linked lists) must be correct

Isolation (Insulation) (Isolation)

Changes made by concurrent transactions must be isolated from changes made by any other concurrent transactions. The state in which the data is in when the transaction views the data, either the state before it is modified by another concurrent transaction or after it is modified by another transaction, and the transaction does not view the data in the intermediate state. This is called isolation because it can reload the starting data and replay a series of transactions so that the state at the end of the data is the same as that in which the original transaction was executed. The highest isolation level is obtained when the transaction is serializable. At this level, the results obtained from a set of transactions that can be executed in parallel are the same as those obtained by running each transaction continuously. Because a high degree of isolation limits the number of transactions that can be executed in parallel, some applications lower the isolation level in exchange for greater throughput

Persistence (Duration) (Durability)

After the transaction is completed, its impact on the system is permanent. The modification will be maintained even in the event of a fatal system failure.

Three models of transactions:

An implicit transaction means that every data operation statement automatically becomes a transaction, the beginning of the transaction is implicit, and the end of the transaction is clearly marked.

Explicit transactions are transactions with explicit start and end tags, and each transaction has explicit start and end tags

Automatic transactions are automatically defaulted by the system, and start and end are not marked.

Statements that use transactions:

Start things: BEGIN TRANSACTION

Submit things: COMMIT TRANSACTION

Rollback transaction: ROLLBACK TRANSACTION

SavePoint for the transaction:

SAVE TRANSACTION SavePoint name-Custom SavePoint name and location

ROLLBACK TRANSACTION SavePoint name-rollback to a custom SavePoint

Storage engine:

Data in MySQL is stored in files (or memory) using a variety of different technologies, each of which uses different storage mechanisms, indexing techniques, locking levels, and ultimately provides a wide range of functions and capabilities, by selecting different technologies, additional speed or functionality can be achieved, thereby improving the overall functionality of the application. These different technologies and associated functions are called storage engines (also known as table types) in MySQL.

MySQL is configured with many different storage engines by default, which can be pre-set or enabled in the MySQL server. You can choose a storage engine for servers, databases, and tables to provide maximum flexibility when choosing how to store your information, how to retrieve it, and what performance and functionality you need to combine the data with.

Here are several commonly used storage engines:

MyISAM

Characteristics

Transactions are not supported: the MyISAM storage engine does not support transactions, so business scenarios that require transactions cannot be used

Table-level locking: its locking mechanism is a table-level index, which can make the cost of locking very low, but also greatly reduce its concurrency performance.

Reads and writes block each other: not only does MyISAM block reads while writing, but MyISAM also blocks writes while reading, but reads themselves do not block other reads

Only indexes can be cached: MyISAM can use key_buffer caching to greatly improve access performance and reduce disk IO, but this cache only caches indexes, not data

Applicable scenario

No transaction support is required (not supported)

Relatively low concurrency (locking mechanism issues)

Relatively few data modifications (blocking problem)

Give priority to reading

The requirement of data consistency is not very high.

Best practic

Index as much as possible (caching mechanism)

Adjust the priority of reading and writing to ensure that important operations are given priority according to actual needs.

Enabling delayed insertion to improve the performance of bulk writes

Operate sequentially as much as possible so that insert data are written to the tail to reduce blocking

Decompose large operations to reduce the blocking time of a single operation

Reduce the number of concurrency, and some high concurrency scenarios are queued through applications.

For relatively static data, making full use of Query Cache can greatly improve the access efficiency.

The Count of MyISAM is particularly efficient only when scanning full tables, and count with other conditions requires actual data access.

InnoDB

Characteristics

Good transaction support: support 4 transaction isolation levels and support multi-version reading

Row-level locking: through the index, the full table scan will still be a table lock, pay attention to the impact of gap locks

Read-write blocking is related to transaction isolation level

It has very efficient caching features: it can cache indexes as well as data.

The whole table and primary key are stored in Cluster to form a balance tree.

All Secondary Index will save primary key information

Applicable scenario

Transaction support is required (with good transaction characteristics)

Row-level locking is well adapted to high concurrency, but you need to ensure that the query is done through the index

Scenarios where data is updated more frequently

High requirements for data consistency

The memory of the hardware device is large, so the better cache ability of InnoDB can be used to improve the memory utilization and reduce the disk IO as much as possible.

Best practic

The primary key should be as small as possible to avoid excessive space burden on Secondary index.

Avoid full table scans because table locks are used

Cache all indexes and data as much as possible to improve response speed

When inserting in large quantities and small ones, try to control the transaction yourself instead of using autocommit autocommit.

Set innodb_flush_log_at_trx_commit parameter values reasonably and don't pursue security excessively.

Avoid primary key updates, as this can lead to a lot of data movement

NDBCluster

Characteristics

Distributed: distributed storage engine, which can be clustered by multiple NDBCluster storage engines to store part of the overall data.

Supporting transactions: like Innodb, supporting transactions

Can be separated from mysqld on a separate host: can exist separately from mysqld on a separate host, and then communicate with mysqld through the network

The memory demand is huge: the new version of the index and the indexed data must be stored in memory, and all data and indexes of the old version must be stored in memory.

Applicable scenario

Have very high concurrency requirements

The response to a single request is not very critical

The query is simple, the filtering conditions are relatively fixed, the amount of data per request is small, and you do not want to conduct horizontal Sharding by yourself.

Best practic

Make the query as simple as possible to avoid cross-node transmission of data

As far as possible to meet the computing performance of SQL nodes, larger cluster SQL nodes will obviously have more Data nodes.

Interconnect nodes in a 10 Gigabit network environment as much as possible to reduce the delay of data transmission at the network layer

Note: the above three storage engines are currently relatively mainstream storage engines, and other storage engines such as Memory,Merge,CSV,Archive are relatively rare, so we will not analyze them one by one here.

In addition, you can use SHOW TABLE STATUS FROMDBname to view the engine of the tables in the current database

When creating a new table, you can tell MySQL what type of table to create by using the ENGINE or TYPE option in the CREATE statement:

CREATE TABLE t (I INT) ENGINE = INNODB

CREATE TABLE t (I INT) TYPE = MEMORY

If the ENGINE or TYPE options are omitted, the default storage engine is used. When MySQL is installed on the Windows platform using the MySQL configuration wizard, the InnoDB storage engine replaces the MyISAM storage engine as the default. When an unavailable type is specified, it is automatically replaced with an InnoDB table.

You can also move a table from one type to another, using the alter table statement:

ALTER TABLE t ENGINE = MYISAM

ALTER TABLE t TYPE = BDB

four。 Trigger

A trigger is a special type of stored procedure that takes effect automatically when the data in a specified table changes; a trigger is a special transaction unit that can reference columns in other tables to perform special business rules or data logic relationships. When an error occurs, you can perform a rollback transaction operation to roll back the entire trigger and the T-SQL statement that triggered it (without displaying the declaration begin transaction); wake up the trigger to respond to the INSERT, UPDATE, or DELETE statement. Triggers can query other tables and can contain complex Transact-SQL statements. Treat the trigger and the statement that triggered it as a single transaction that can be rolled back within the trigger. If a serious error is detected (for example, insufficient disk space), the entire transaction is automatically rolled back, that is, undone.

There are two types of triggers:

AFTER triggers: this trigger will not be triggered until the data changes (insert, update, delete actions) have been completed. Check the changed data and reject or roll back the changed data if an error is found

INSTEAD OF trigger: this trigger is triggered before the data changes and replaces the operations that change the data (insert, update, delete operations) to perform the actions defined by the trigger

When you create a trigger, you must also specify trigger actions: insert, update, delete operations, at least one, or multiple

Create a trigger:

CREATE TRIGGER trigger_name// trigger name ON {table | view} / / Table or view executed on it [WITH ENCRYPTION] / / prevents triggers from publishing {{FOR | AFTER | INSTEAD OF} / / trigger categories as part of SQL Server Decide whether after or instead of {[INSERT] [,] [DELETE] [,] [UPDATE]} / / specifies the keyword [WITH APPEND] [NOT FOR REPLICATION] / / that fires the trigger when the replication process changes the table involved in the trigger. The action to be performed by the trigger AS// trigger should not be performed [{IF UPDATE (column) / / Test INSERT or UPDATE operations on the specified column and cannot be used for DELETE operations. You can specify multiple columns. [{AND | OR} UPDATE (column)] [... n] | IF (COLUMNS_UPDATED () {bitwise_operator// bit operator} updated_bitmask) / / Test whether the mentioned column is inserted or updated Only used in UPDATE and INSERT triggers {comparison_operator// comparison operator} column_bitmask [... n]}] sql_statement [... n] / / SQL Server does not support including all create statements, DROP statements, etc.} in triggers

"finish"

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.