In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly introduces how to choose Postgresql and MYSQL. It has certain reference value. Interested friends can refer to it. I hope you will gain a lot after reading this article. Let Xiaobian take you to understand it together.
I often see someone write about locks, but I often feel that it gives people a feeling that the ACID of the database is controlled by locks. In fact, the ACID control of the database is complex. MVCC is an effective method to improve concurrent access to resources.
Serializability is assumed when the ACID transaction attribute is first defined. To provide strictly serializable transaction results, a 2PL (two-phase locking) mechanism is used. With 2PL, a shared lock acquisition is required for each read operation and an exclusive lock is required for each write operation.
As an important rule of thumb, maintaining data integrity, all modifications to transactional DBs should be done under atomic transactions. Furthermore, every transaction should keep the database in a consistent state, and isolation is the hardest practice. In theory, quite simply, it isolates all transactions so that they execute "as if" they were executed serially, even if they run simultaneously. But in practice, it's a lot more complicated--maintaining reasonable performance while maintaining isolation.
Multiversion concurrency control (MVCC), which creates a "previous version" of a row (a "snapshot") and provides the row's "previous version" to any other transactions that might try to run concurrently, rather than locking the row when someone starts reading it. This makes sense--after all, you don't think about changing the state of DB until you commit the first transaction.
At this point, it will be boring, so let's start talking about the MVCC implementation with the actual database edge.
There are two general ways to solve MVCC based on the database types currently available
New data is separated from old data and moved to a place, such as undo log, where other people read the data and read the old data out of the rollback segment, as the innodb engine in Oracle and MySQL does.
When writing new data, the old data is not deleted, but the new data is inserted, and the old and new data are together. PostgreSQL uses this implementation method.
So we can compare the difference between these two ways.
1 Postgresql solves the MVCC problem by row design and xact. We can see the principle by querying xmin,xmax,cmin,cmax in a table.
The following code explains some of the principles of tuple design on PG
typedef struct HeapTupleFields { TransactionId t_xmin; /* inserting xact ID *
/TransactionId t_xmax; /* deleting or locking xact ID *
/union{ CommandId t_cid; /* inserting or deleting command ID, or both *
/ TransactionId t_xvac; /* VACUUM FULL xact ID */} t_field3; } HeapTupleFields;
t_xmin represents the transaction ID that generated this row or higher
t_xmax represents the transaction ID that deletes or locks this tuple
t_cid contains two fields, cmin and cmax, that identify the order in which the lines are to be inserted within a transaction, e.g. insert 5 lines, what is the order in which the 5 lines are inserted, which tuples are visible to which tuples, this is a display of visibility at the transaction level.
t_xvac stores the transaction ID of the VACUUM FULL command
When a row is inserted, postgres stores the XID in that row and calls it xmin. Every row that has committed and xmin is less than the XID of the current transaction is visible to the transaction. This means that you can start a transaction and insert a row that no other transaction will see until the transaction commits. Once the other transactions are committed and created, they are able to view the new rows because they satisfy the xmin < XID condition--and the transaction that created the row has completed.
Let's take a look at the postgresql table structure, using the city table as an example
Tableoid,cmax,xmax,cmin,xmin
select attname, attnum, atttypid::regtype, attisdropped::text from pg_attribute where attrelid = 'city'::regclass;
Let's give an example to explain the specific operation of MVCC
We select a city table and we open two things, one to update city_id 1 - 20 and the other to update city_id 21 40
transaction 1
transaction 2
Not seen in Transaction 1 and Transaction 2
From the above we can conclude
1 Every transaction change triggers xmin xmax , change
2 Changes to each transaction xmin will only be seen inside its own transaction, while xmax is the information flag that other transactions are changing.
This allows the initial functionality of MVCC to proceed, so postgresql has no page locks, only table locks and row locks.
The advantage of this is that transactions roll back very quickly, but require frequent vacuum
MYSQL MVCC uses undo log, which is similar to ORACLE. MVCC's function is not implemented in each line. innodb storage engine manages undo in segments. rollback segments are called rollback segments. There are 1024 undo log segments in each rollback segment. (MYSQL 8 has changed)
Inactive all rollback segments (rsegs) to reside in the selected UNDO table space. Inactive means that these rollback segments are not assigned to new transactions. The cleanup system continues to release rollback segments that are no longer needed. This marks the pages allocated to the rollback segment as free and reduces the logical size of the rollback.
Several questions can be asked by following the general flow of an UNDO table space above
1 rollback segment is limited in number, the number of rollback segments is limited to the number of transactions that can be executed at the same time in this database system, each rollback segment maintains a page header, each page will be divided into 1024 slots, each slot will correspond to a transaction, so MYSQL 5.7 (8.0 redesigned UNDOLOG) In addition, even if it is a read-only transaction, as long as there is a temporary table write, it is also allocated rollback segment.
For example, in the prepare phase of MYSQL transactions, insert undo and update undo are in the state of prepare, call trx_undo_set_state_at_prepare, set TRX_UNDO_STATE of the page segment header to TRX_UNDO_PREPARED for the corresponding undo log slot header page (trx_undo_t::hdr_page_no), and modify other corresponding fields at the same time.
In the commit phase, if the Undo state is TRX_UNDO_CACHED, it will be added to the insert_undo_cached list of the rollback segment, or all the segments occupied by the undo and the slots occupied by the rollback segment will be released, the size of the current rollback segment will be modified, and the memory occupied by the undo object will be released. If it is an Update_undo operation, the insert_undo will not be placed on the History list. Decrement the rollback segment counter by one after the last transaction commits.
In fact, it is to connect the transaction ID and the pointer of the rollback segment. At the same time, there are two fields in the MYSQL row to record. For each row of MYSQL table, there are 6 bytes of db_trx_id , 7 bytes of db_roll_ptr ,undo log. For update or delete operations, each row stores a transaction Id. Modify the transaction Id to the transaction id of the current Session, generate the version of the data row before the transaction, and point the rollback pointer of the current row to the version before the transaction. For insert operations, the rollback pointer for the current row is pointed to null because insert has no version prior to the transaction operation.
If the database wants to rollback during the execution of transactions, it must consider concurrency and rollback, which causes more disk space to be occupied with the concurrent and rollback requirements, and after the transaction is committed, it needs to clean up these useless things. POSTGRESQL is called VACUUM, MYSQL is called Purge, and in InnoDB, the latest version of the updated row is only retained in the table. Older versions of rows are rolled back in segments, while deleted row versions are left in place and marked for later cleanup. Therefore, any deleted rows marked must be cleaned from the table itself and any updated old version rows must be purged from the rollback segment. All the information needed to find deleted records.
Therefore, from the design structure, the structural design of postgresql is simple, MYSQL ORACLE is complex, and POSTGRESQL has no redo structure, so the biggest problem for POSTGRESQL is VACUUM, while MYSQL INNODB will face I/O pressure in redo ,undo, purge, etc.
Purely personal, postgresql performance bottleneck without vacuum is less than MYSQL complexity impact (some other recycling can be done during periods of infrequent work). Postgresql has to give disk space in use, mysql is better in this respect.
So simply saying that better, it doesn't mean anything. What matters is how much of their characteristic knowledge you have mastered.
Thank you for reading this article carefully. I hope that the article "Postgresql and MYSQL How to Choose" shared by Xiaobian will be helpful to everyone. At the same time, I hope that everyone will support you a lot and pay attention to the industry information channel. More relevant knowledge is waiting for you to learn!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 248
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
This article mainly introduces "how C++ uses span or span_p". In daily operation, I believe many people have doubts about how C++ uses span or span_p. The editor consulted all kinds of materials and sorted out simple and easy-to-use methods of operation, hoping to answer them.
© 2024 shulou.com SLNews company. All rights reserved.