Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Example Analysis of MySQL data Type and Storage Mechanism

2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

Editor to share with you the example analysis of MySQL data types and storage mechanism, I believe that most people do not know much about it, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!

1.1 Overview of data types

The data type is a kind of field constraint, which limits what kind of data each field can store, how much data can be stored, the format that can be stored, and so on. MySQL/MariaDB has roughly five types of data types: shaping, floating point, string type, date-time type, and special ENUM and SET types.

The meaning, limitations, and related descriptions of these five data types are shown in the following figure:

For the number of bytes consumed by each data type, see the official mariadb manual.

1.2 Storage mechanism and mode of operation

The reason why the data type can limit the data storage length of the field is that the address space is strictly delimited in memory when the table is created, and the address space can store as many bytes of data as the length of the address space. Of course, this is a very rough concept, and more specific storage methods are described below.

There are two ways to limit the scope of data types: one is to strictly limit the space, the amount of space divided can only store as much data, and the excess data will be cut off; the other is to use extra bytes of bit bits to mark whether the bytes of an address space have stored data.

1.2.1 Storage of integers

This is mainly about the storage of integers, but there are too many things to consider about the storage of floating-point data types.

For integer data types, it is strictly space-limited, but it is different from characters, because 0 and 1 on the bit bits on each divided byte can be calculated directly, so its range is calculated based on the number of bit bits. A byte has 8 Bit bits, and these 8 bit bits can form 2 ^ 8 = 256numeric values. Similarly, 2 ^ 16 = 65536 numeric values. 4-byte int occupies 32bit, which can be expressed in the range of 0-2 ^ 32. In other words, a number between 0255takes only one byte, and a number between 256and 65535 takes up two bytes.

It is important to note that the integer data type in MySQL/mariadb can use the parameter M ~ M is a positive integer, such as INT (M), tinyint (M). This M represents the display length, such as int (4), which means that 4-digit integers will be displayed on output, and if the number of digits of the actual value is less than the width of the display value, the left side will be filled with spaces by default. When the number of digits of the result exceeds, the display result will not be affected. This feature is usually populated with 0 instead of spaces in conjunction with the zerofill attribute, but when zerofill is used, the column automatically becomes an unsigned field. For example:

CREATE TABLE test3 (id INT (2) ZEROFILL NOT NULL); INSERT INTO test3 VALUES (1), (2), (11), (11); SELECT id FROM test3;+-+ | id | +-+ | 01 | 02 | | 11 | 11 | +-+ 4 rows in set (0.00 sec)

The only thing to note is that the display width only affects the display effect, not storage, comparison, length calculation, and so on.

1.2.2 how character types are stored

Here is the main description of how char and varchar are stored and the difference.

The char type is often referred to as the "fixed-length string type". It strictly limits the length of space, but it limits the number of characters, not bytes, but in previous versions it limited the number of bytes. Therefore, char (M) strictly stores M characters, the insufficient part is filled with spaces, and the part beyond M characters is truncated directly.

Because the char type has the ability to "fill in spaces when it is short", in order to reflect the authenticity of the data, trailing spaces are automatically deleted when retrieving data from the address space. This is a special feature of char. Even trailing spaces that we manually store are considered to be automatically filled, so they are deleted during retrieval. That is to say, the results of name='gaoxiaofang' and name='gaoxiaofang' are the same in the where statement.

For example:

Create table test2 (a char (4) charset utf8mb4); insert into test2 values ('Congratulations'), ('Congratulations on your promotion'), ('hello'), (' he'); select concat (a memery x') from test2 +-+ | concat (aPersonx') | +-+ | Congratulations to you x | | hellx | | hex | +-+ 4 rows in set

As you can see from the above results, char (4) can only store four characters and remove trailing spaces.

Varchar, often referred to as a "variable length string type", stores data with extra bytes of bit bits to mark whether a byte has stored data. Each byte stored (not a character) takes up one bit bit to record, so one additional byte can mark a total of 256bytes and two additional bytes can mark 65536 bytes. But MySQL/mariadb limits the maximum storage capacity to 65536 bytes. This means that if it is a single-byte character, it can store up to 65536 characters, and if it is a multi-byte character, such as UTF8, each character occupies 3 bytes, it can store up to 65536and21845 utf8 characters.

Therefore, in addition to the space length occupied by the real data, varchar (M) has to calculate the Bit bit length of 1 or 2 bytes, that is, the actual space occupied by single-byte characters is 1 or 2 bytes, and the space occupied by multi-byte characters (such as 3 bytes) is 2 bytes or 3 bytes.

Because varchar storage requires additional bit bits to record each byte, shorter data is not automatically filled, so explicitly stored trailing spaces are also stored and marked on the bit bit, that is, trailing spaces are not deleted.

Like char (M), only two bytes of characters can be stored when varchar (2) is specified, and if it is exceeded, it is cut off.

With regard to char, varchar, and text string types, trailing spaces are not taken into account when comparing, but spaces are taken into account when doing like matching or regular matching, because the characters are precise when matching. For example:

Create table test4 (a char (4), b varchar (5)); insert into test4 values ('ab','ab'); select a='ab', b='ab', ahumb from test4 +-+ | a='ab'| b='ab'| aquib | +-+ | 1 | 1 | 1 | +-+ 1 row in setselect a like'ab 'from test4 +-+ | a like'ab'| +-+ | 0 | +-+ 1 row in set

Finally, it is important to note that when values are stored (or called into memory), numeric storage is more space-efficient than character or date-time types. Because integer values are stored directly through bit to calculate values, any integer between 0255is only one byte, and any integer between 65535 and 256is 2 bytes, and when it takes up 4 bytes, it can represent any one of billions of integers, which obviously saves more space than character storage when each character takes up one byte. For example, the value "100" takes up three bytes when stored as a character, while storing as a numeric type takes only one byte. Therefore, by default, the database treats values that are not enclosed in quotation marks as numeric, and if you explicitly want to store them as characters or date-time types, you should use quotation marks to avoid ambiguity.

1.2.3 date-time storage mode

Date and time data storage needs to be surrounded by quotation marks to avoid ambiguity with numeric data. The way to enter date and time is very loose, and the following ways are allowed: any allowed delimiter, a 4-digit year is recommended.

201101012011-01-0118: 40 202011 Universe 01 18-40-2020110101184020

1.2.4 ENUM data type

The ENUM data type is enumerated. Defined as ENUM ('value1','value2','value3',...), only one of the value or NULL can be inserted when inserting data into a field of this type, and will be truncated to empty data when inserting other values or null (that is,''). Case (which is converted to characters in ENUM) is ignored when stored, and trailing spaces are truncated.

Mysql > create table test6 (id int auto_increment primary key,name char (20), gender enum ('Mail','f')); mysql > insert into test6 (name,gender) values (' malongshuai','Mail'), ('gaoxiaofang','F'), (' wugui','x'), ('tuner',null), (' woniu',''); Query OK, 5 rows affectedRecords: 5 Duplicates: 0 Warnings: 2mysql > show warnings +-+ | Level | Code | Message | +- -+ | Warning | 1265 | Data truncated for column 'gender' at row 3 | | Warning | 1265 | Data truncated for column' gender' at row 5 | +-+ 2 rows in setmysql > select * from test6 +-+ | id | name | gender | +-- + | 1 | malongshuai | Mail | | 2 | gaoxiaofang | f | 3 | wugui | | 4 | tuner | NULL | | 5 | woniu | | +-+ 5 rows in set

Data of ENUM type is stored by index numeric value, compared to string type, it only needs 1 or 2 bytes to store. In theory, only one byte is needed when the number of value is less than 256, and 2 bytes of storage is used when there are more than 256but less than 65536. The MySQL/MariaDB limit is 65536 value. Of course, this is a theoretical limitation, and there are many factors to consider in actual storage. For example, NULL also occupies bit bits, so actual storage may take 2 bytes for 250 value.

Each value of the ENUM is numbered by the index number, and the value of the index is used to retrieve or manipulate the field. The index=2 of index=1,value2 of value1, and so on. Note, however, that there are two special index values: index=NULL with null value and index=0 with empty data.

For example, when the first and last ENUM ('axiomagentific NULL, index, NULL) insert'', 'bouncy meme', respectively, into this field, the index value of'', 'bouncy recorder', respectively, is 2, 1, 3, respectively. So the corresponding relationship between index number and value is:

IndexvalueNULLNULL0''0''1'a'2'b'3'c'

Use ENUM's index for data retrieval:

Mysql > select * from test6 where gender=2;+----+ | id | name | gender | +-+ | 2 | gaoxiaofang | f | +-+ 1 row in set

In particular, it is recommended that you do not use ENUM to store values, because sorting or retrieval or other operations are based on the index value as a condition, which is easy to be misunderstood. For example, the following is to store two values in ENUM, and then retrieve and sort them.

Mysql > create table test7 (id enum); mysql > insert into test7 values ('1'), ('2'), ('3'); # id=2 when searching, but the result is 1, because 2 of id=2 is the index value of enum, and the value of index=2 in enum is 1mysql > select * from test7 where id=2 +-when +-+ 1 row in set# sorts by id, mysql > select * from test7 order by id asc;+----+ is also sorted by index size | id | +-+ | 3 | 1 | 2 | +-+ 3 rows in set

Therefore, it is strongly recommended that you do not store values in ENUM, even floating-point values are prone to ambiguity.

1.2.5 SET data type

For SET types, similar to enum, case-insensitive, trailing spaces are deleted when stored, and null is also a valid value. But the difference is that you can combine multiple given values. For example, set ('axiomaginomagronomy, cantilever, etc.) can store' a recalcitrant', etc., and multiple members are separated by commas. Therefore, when using multiple members, a comma cannot appear in the value of the member itself. If the content to be stored is not in the set list, truncation is null.

The amount of space occupied by the SET data type is related to the number of SET members M, which is calculated as (Max7) / 8 rounding. So: 1-8 members occupy 1 byte

9-16 members occupy 2 bytes; 17-24 members occupy 3 bytes; 25-32 members occupy 4 bytes; 33-64 members occupy 8 bytes.

The MySQL/MariaDB limit is 64 members.

Duplicate members are ignored when storing data of the SET data type and are stored in the order in which they were enumerated. For example, the result of set ('brecronomy, brecorder, etc.), which stores' a recital, a recorder, etc., is all 'bmai a'.

Mysql > create table test8 (a set); mysql > insert into test8 values ('brecedicaa'), ('brecedareca`), (' bab'); Query OK, 3 rows affectedRecords: 3 Duplicates: 0 Warnings: 1mysql > select * from test8;+-+ | a | +-+ | bprit a | | bjue a | | +-+ 3 rows in set

Use find_in_set (set_value,set_column_name) to retrieve the row containing the specified set value set _ value. For example, retrieve the row that contains member b in the a field:

Mysql > select * from test8 where find_in_set ('baccalaureate rows in set a); +-+ | a | +-+ | bfurol a | | bfurol a | +-+ 2 beacon

1.3data type attribute: unsigned

The unsigned attribute is to make data of numeric types unsymbolized. Using the unsigned attribute will change the range of numeric data types, such as the signed range of the tinyint type is-128to127and the range is 0 to 255when using unsigned. Unsigned also restricts the column from inserting negative values.

Create table t (an int unsigned,b int unsigned); insert into t select 1 meme 2 match insert into t select-1 meme 2

In the above statement, an error will be reported when the second statement is executed ready to insert a negative number, indicating that it is out of range.

The use of unsigned does work in some cases, for example, the general ID primary key column does not allow negative numbers, which is equivalent to implementing a check constraint. But sometimes there are some unexpected problems in using unsigned: if you get a negative number in the numerical operation, you will get an error. For example, in the table t above, fields an and b are unsigned columns, and there is a row of axiom 1 and b 2.

Mysql > select * from row in set + | a | b | +-- +-+ | 1 | 2 | +-+-+ 1

At this point, there will be an error in calculating a unsigned b, not only that, but also as long as the column participates in the calculation and will get a negative number.

Mysql > select Amurb from tbot 1690-BIGINT UNSIGNED value is out of range in'(`test`.`t`.`a`-`test`.`t`.`b`) 'mysql > select Amur2 from ttransfer1690-BIGINT UNSIGNED value is out of range in' (`test`.`t`.`a`)'

There will be no effect if the result of the calculation is not negative.

Mysql > select 2 from 3 from + | 2 row in set | aqui3 | +-+-+ | 1 | 3 | +-+-+ 1

This is not the bug in MySQL/MariaDB, and unsigned in C has a similar problem. This problem can be solved by setting set sql_mode='no_unsigned_subtraction' in MySQL/MariaDB.

Therefore, it is personally recommended that you do not decorate the field with the unsigned attribute.

1.4 data type attribute: zerofill

After zerofill decorates the field, the insufficient field display will be filled with 0 instead of spaces, and unsigned will be automatically set when zerofill is enabled. Zerofill is generally used together only after the display width of the column is set. The display width of the column has been introduced above.

Mysql > create table T1 (id int (4) zerofill); mysql > select * from T1 + | id | +-+ | 0001 | | 0002 | | 0011 | | 83838 | +-+ 4 rows in set (0.00 sec)

Zerofill only decorates the display results and does not affect the stored data values.

The above is all the contents of the article "sample Analysis of MySQL data types and Storage mechanisms". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report