This article introduces what the MySQL write set is: how it is defined, what its raw (pre-hash) data looks like, and how the function add_pke generates it. I hope you find it useful.
1. What is a Write set
The write set is defined in the class Rpl_transaction_write_set_ctx, which mainly contains two data structures:

std::vector<uint64> write_set
std::set<uint64> write_set_unique

The first is a vector, the second is a set. Each element is a hash value computed by the function add_pke from a string that contains:
Unique index name + delimiter + database name + delimiter + database name length + table name + delimiter + table name length + index field 1 value + delimiter + index field 1 length [+ index field 2 value + delimiter + index field 2 length ...]
Note that unique indexes, not only the primary key, are also counted into the write set.
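For intuition, here is a minimal, self-contained sketch of how a vector plus a set can be used together for this purpose: the vector keeps every generated hash in order, while the set keeps each distinct hash only once. WriteSetCtxSketch and its methods are illustrative names only, not the actual MySQL class.

// Minimal illustrative sketch (not the MySQL source): the vector records every
// extracted hash in generation order, the set holds each distinct hash once,
// which makes duplicate checks cheap.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

class WriteSetCtxSketch {
 public:
  void add_write_set(uint64_t hash) {
    write_set.push_back(hash);      // ordered history of all hashes
    write_set_unique.insert(hash);  // distinct hashes only
  }
  std::size_t total() const { return write_set.size(); }
  std::size_t distinct() const { return write_set_unique.size(); }

 private:
  std::vector<uint64_t> write_set;
  std::set<uint64_t> write_set_unique;
};

int main() {
  WriteSetCtxSketch ctx;
  ctx.add_write_set(0xAAAA);
  ctx.add_write_set(0xBBBB);
  ctx.add_write_set(0xAAAA);  // touching the same key again yields the same hash
  std::cout << ctx.total() << " hashes, " << ctx.distinct() << " distinct\n";
  return 0;
}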
In MGR the primary key plays a very important role: it is the main basis for judging whether transactions executed on different nodes conflict with each other. Finally, the write set information is encapsulated into a Transaction_context_log_event and sent to the other nodes together with the other binlog events. When generating the raw data of a write set member (the data before hashing), the function add_pke records each row's index values in two formats:
The field value and its length in MySQL field format (the collation-aware, memcmp-sortable form)
The field value and its length recorded in string format
The write set is generated after the InnoDB layer has completed the change and before the MySQL layer writes the binlog event.
2. An example of the write set raw data (before hashing)
The table is as follows:
mysql> use test
Database changed
mysql> show create table jj10\G
*************************** 1. row ***************************
       Table: jj10
Create Table: CREATE TABLE `jj10` (
  `id1` int(11) DEFAULT NULL,
  `id2` int(11) DEFAULT NULL,
  `id3` int(11) NOT NULL,
  PRIMARY KEY (`id3`),
  UNIQUE KEY `id1` (`id1`),
  KEY `id2` (`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
We insert a row of data:
insert into jj10 values (36,36,36);
This row of data generates a total of four write set elements:
Note: the '½' character shown below is the delimiter (HASH_STRING_SEPARATOR).
Write set element 1:
(gdb) p pke
$1 = "PRIMARY½test½4jj10½4\200\000\000$½4"
Note: the three octal escapes plus the ASCII character '$' are the hex bytes 0x80 00 00 24.
Index name PRIMARY + delimiter + database name test + delimiter + database name length 4 + table name jj10 + delimiter + table name length 4 + primary key value 0x80 00 00 24 + delimiter + value length 4 (the int field in MySQL field format)
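Why does the value 36 show up as the bytes 0x80 00 00 24? In the MySQL field format the value is converted into a memcmp-sortable image (the same kind of conversion make_sort_key performs in the code below): for a signed 4-byte INT that is the big-endian value with its sign bit flipped, so that negative numbers sort before positive ones under plain byte comparison. The small sketch below reproduces the bytes; int_to_sort_key is a hypothetical helper written for illustration, not MySQL code.

// Reproduces the sort-key bytes of the INT value 36: flip the sign bit and
// store the value big-endian, giving 0x80 0x00 0x00 0x24.
#include <cstdint>
#include <cstdio>

void int_to_sort_key(int32_t value, unsigned char out[4]) {
  uint32_t u = static_cast<uint32_t>(value) ^ 0x80000000u;  // flip the sign bit
  out[0] = static_cast<unsigned char>(u >> 24);             // big-endian byte order
  out[1] = static_cast<unsigned char>(u >> 16);
  out[2] = static_cast<unsigned char>(u >> 8);
  out[3] = static_cast<unsigned char>(u);
}

int main() {
  unsigned char key[4];
  int_to_sort_key(36, key);
  std::printf("%02X %02X %02X %02X\n", key[0], key[1], key[2], key[3]);  // prints: 80 00 00 24
  return 0;
}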
Write set element 2:
(gdb) p pke
$2 = "PRIMARY½test½4jj10½436½2"
Index name PRIMARY + delimiter + database name test + delimiter + database name length 4 + table name jj10 + delimiter + table name length 4 + primary key value as the string "36" + delimiter + string "36" length 2
Write set element 3:
(gdb) p pke
$3 = "id1½test½4jj10½4\200\000\000$½4"
Same as element 1, except that the index is not the primary key but the unique key id1.
Write set element 4:
(gdb) p pke
$4 = "id1½test½4jj10½436½2"
Same as element 2, except that the index is not the primary key but the unique key id1.
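To make the layout concrete, the sketch below rebuilds the two string-format elements (2 and 4) by plain concatenation. It only illustrates the shape of the pre-hash string: SEP stands for the '½' delimiter (HASH_STRING_SEPARATOR, written here as UTF-8), and build_pke is a hypothetical helper, not MySQL's add_pke.

// Rebuilds the string-format write set elements 2 and 4 from the example row,
// purely to show the layout of the pre-hash string.
#include <iostream>
#include <string>
#include <vector>

static const std::string SEP = "\xC2\xBD";  // "½", standing in for HASH_STRING_SEPARATOR

std::string build_pke(const std::string& index_name,
                      const std::string& db,
                      const std::string& table,
                      const std::vector<std::string>& values) {
  // index name + SEP + db + SEP + db length + table + SEP + table length
  std::string pke = index_name + SEP + db + SEP + std::to_string(db.size())
                  + table + SEP + std::to_string(table.size());
  for (const std::string& v : values)
    pke += v + SEP + std::to_string(v.size());  // value + delimiter + value length
  return pke;
}

int main() {
  // Element 2: primary key (id3 = 36) in string format
  std::cout << build_pke("PRIMARY", "test", "jj10", {"36"}) << "\n";  // PRIMARY½test½4jj10½436½2
  // Element 4: unique key id1 (= 36) in string format
  std::cout << build_pke("id1", "test", "jj10", {"36"}) << "\n";      // id1½test½4jj10½436½2
  return 0;
}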
3. Analysis of the function add_pke
Leaving aside the handling of foreign keys, the main logic is as follows:
If the table has indexes:
    Write the database name and table name into a temporary variable.
    Loop over every index of the table:
        If the index is not unique:
            Skip it and continue with the next index.
        Loop over the two ways of generating the data (MySQL field format and string format):
            Append the index name to pke.
            Append the temporary variable (database/table information) to pke.
            Loop over every field of the index:
                Append the field's value and length to pke.
            If the field scan completed (no key field was NULL):
                Generate a hash of pke and add it to the write set.
The source code is annotated as follows:
Rpl_transaction_write_set_ctx* ws_ctx=                          // THD -> Transaction_ctx -> m_transaction_write_set_ctx
    thd->get_transaction()->get_transaction_write_set_ctx();    // this memory is allocated when the thread is initialized: m_transaction(new Transaction_ctx())
int writeset_hashes_added= 0;

if (table->key_info && (table->s->primary_key < MAX_KEY))       // key_info is an array of st_key (KEY) structures
{
  char value_length_buffer[VALUE_LENGTH_BUFFER_SIZE];
  char* value_length= NULL;

  std::string pke_schema_table;
  pke_schema_table.reserve(NAME_LEN * 3);
  pke_schema_table.append(HASH_STRING_SEPARATOR);               // delimiter
  pke_schema_table.append(table->s->db.str, table->s->db.length);            // store the database name
  pke_schema_table.append(HASH_STRING_SEPARATOR);               // delimiter
  value_length= my_safe_itoa(10, table->s->db.length,
                             &value_length_buffer[VALUE_LENGTH_BUFFER_SIZE-1]);  // length converted to character form, returned as a char pointer ('1''3' represents length 13)
  pke_schema_table.append(value_length);                        // append the converted length as a string
  pke_schema_table.append(table->s->table_name.str, table->s->table_name.length); // store the table name
  pke_schema_table.append(HASH_STRING_SEPARATOR);               // delimiter
  value_length= my_safe_itoa(10, table->s->table_name.length,
                             &value_length_buffer[VALUE_LENGTH_BUFFER_SIZE-1]);  // length converted to character form
  pke_schema_table.append(value_length);                        // append the converted length as a string
  // pke_schema_table now holds: delimiter + dbname + delimiter + dbname length
  // + tablename + delimiter + tablename length, i.e. the database and table information

  std::string pke;                                              // pke is the intermediate variable holding the data of a write set element before hashing
  pke.reserve(NAME_LEN * 5);

  char *pk_value= NULL;
  size_t pk_value_size= 0;

  // Buffer to read the names of the database and table names which is less
  // than 1024. So its a safe limit.
  char name_read_buffer[NAME_READ_BUFFER_SIZE];
  // Buffer to read the row data from the table record[0].
  String row_data(name_read_buffer, sizeof(name_read_buffer), &my_charset_bin);  // the current row data is read into this buffer

#ifndef DBUG_OFF                                                 // only in debug builds
  std::vector<std::string> write_sets;
#endif

  for (uint key_number=0; key_number < table->s->keys; key_number++)  // scan every index in turn
  {                                                              // EXP: create table jj10(id1 int,id2 int,id3 int primary key,unique key(id1),key(id2))
                                                                 // table->key_info[0].name $12 = 0x7fffd8003631 "PRIMARY"
                                                                 // table->key_info[1].name $13 = 0x7fffd8003639 "id1"
                                                                 // table->key_info[2].name $14 = 0x7fffd800363d "id2"
    // Skip non unique.
    if (!((table->key_info[key_number].flags & (HA_NOSAME)) == HA_NOSAME))  // skip non-unique keys
      continue;

    /*
      To handle both members having hash values with and without collation
      in the same group, we generate and send both versions (with and
      without collation) of the hash in the newer versions. This would mean
      that a row change will generate 2 instead of 1 writeset, and 4 instead
      of 2, when collations are involved. This will mean that a transaction
      will be certified against two writesets instead of just one.

      To generate both versions (with and without collation) of the hash, it
      first converts using without collation support algorithm (old
      algorithm), and then using with collation support conversion
      algorithm, and adds generated value to key_list_to_hash vector, for
      hash generation later.

      Since the collation writeset is bigger or equal than the raw one, we
      do generate first the collation and reuse the buffer without the need
      to resize for the raw.
    */
    // key_part below points to the KEY_PART_INFO / Field of each key column
    for (int collation_conversion_algorithm= COLLATION_CONVERSION_ALGORITHM;
         collation_conversion_algorithm >= 0;
         collation_conversion_algorithm--)                       // the collation-aware and the plain-string algorithm, i.e. MySQL field format and string format
    {
      pke.clear();
      pke.append(table->key_info[key_number].name);              // e.g. table->key_info[0].name = "PRIMARY"
      pke.append(pke_schema_table);                              // append the string built above, so pke is now "index name + delimiter + dbname + delimiter + dbname length + tablename + delimiter + tablename length"

      uint i= 0;
      for (/*empty*/; i < table->key_info[key_number].user_defined_key_parts; i++)  // scan every field of this index
      {
        // read the primary key field values in str.
        int index= table->key_info[key_number].key_part[i].fieldnr;  // position of this key column in the table (TABLE -> st_key -> KEY_PART_INFO)
        size_t length= 0;

        /* Ignore if the value is NULL. */
        if (table->field[index-1]->is_null())                    // Field **field; /* Pointer to fields */ each field type has its own implementation (polymorphism)
          break;

        // convert using collation support conversion algorithm
        if (COLLATION_CONVERSION_ALGORITHM == collation_conversion_algorithm)  // the collation-aware algorithm
        {
          const CHARSET_INFO* cs= table->field[index-1]->charset();
          length= cs->coll->strnxfrmlen(cs, table->field[index-1]->pack_length());  // length of the key value
        }
        // convert using without collation support algorithm
        else
        {
          table->field[index-1]->val_str(&row_data);
          length= row_data.length();
        }

        if (pk_value_size < length+1)
        {
          pk_value_size= length+1;
          pk_value= (char*) my_realloc(key_memory_write_set_extraction,
                                       pk_value, pk_value_size,
                                       MYF(MY_ZEROFILL));
        }

        // convert using collation support conversion algorithm
        if (COLLATION_CONVERSION_ALGORITHM == collation_conversion_algorithm)
        {
          /*
            convert to normalized string and store so that it can be
            sorted using binary comparison functions like memcmp.
          */
          table->field[index-1]->make_sort_key((uchar*) pk_value, length);  // store the field value in pk_value; every field type implements make_sort_key
          pk_value[length]= 0;
        }
        // convert using without collation support algorithm
        else
        {
          strmake(pk_value, row_data.c_ptr_safe(), length);
        }

        pke.append(pk_value, length);                            // append the key value to pke
        pke.append(HASH_STRING_SEPARATOR);                       // delimiter
        value_length= my_safe_itoa(10, length,
                                   &value_length_buffer[VALUE_LENGTH_BUFFER_SIZE-1]);  // length converted to character form
        pke.append(value_length);                                // append the length
      }

      /*
        If any part of the key is NULL, ignore adding it to hash keys.
        NULL cannot conflict with any value.
        Eg: create table t1(i int primary key not null, j int, k int,
                                                        unique key (j, k));
            insert into t1 values (1, 2, NULL);
            insert into t1 values (2, 2, NULL); => this is allowed.
      */
      if (i == table->key_info[key_number].user_defined_key_parts)  // all fields of this index were scanned (none was NULL)
      {
        // pke now holds: unique index name + delimiter + database name + delimiter
        // + database name length + table name + delimiter + table name length
        // + field 1 value + delimiter + field 1 length [+ field 2 value + delimiter + field 2 length ...]
        generate_hash_pke(pke, collation_conversion_algorithm, thd);  // hash the pke string and add the hash to the write set
        writeset_hashes_added++;

#ifndef DBUG_OFF
        write_sets.push_back(pke);                               // in debug builds also keep the raw pke
#endif
      }
    }
  }

This concludes the introduction to "what is the MySQL write set". Thank you for reading.