Differences between char and varchar

2025-01-28 Update From: SLTechnology News&Howtos

The difference between characters and bytes

① In ASCII encoding, an English letter (upper or lower case) occupies one byte. ASCII itself cannot represent Chinese characters; in the double-byte encodings commonly used alongside it, such as GB2312 or GBK, a Chinese character occupies two bytes. A byte is a sequence of binary digits that the computer treats as one unit, usually 8 bits. Read as an unsigned decimal number, its minimum value is 0 and its maximum is 255.

② In UTF-8 encoding, an English character occupies one byte and a Chinese character (including traditional Chinese) usually occupies three bytes.

③ In "Unicode" encoding (used here in the older sense of a fixed two-byte form such as UCS-2), an English character occupies two bytes and a Chinese character (including traditional Chinese) occupies two bytes.

Symbols: in the ASCII/GBK case, English punctuation occupies one byte and Chinese punctuation occupies two bytes. For example, the English full stop "." takes 1 byte and the Chinese full stop "。" takes 2 bytes.

④ In UTF-16 encoding, an English letter or a common Chinese character requires 2 bytes (some Chinese characters in the Unicode extension areas require 4 bytes).

⑤ In UTF-32 encoding, every character requires 4 bytes.
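The byte counts listed above can be checked with a short Python sketch. The helper name `encoded_size` is hypothetical, and the `-le` codec variants are chosen only so that Python does not prepend a byte-order mark to the result.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


# "-le" variants avoid the byte-order mark that "utf-16"/"utf-32" would add.
ENCODINGS = ["gbk", "utf-8", "utf-16-le", "utf-32-le"]

# An English letter, a Chinese character, and a Chinese full stop.
for ch in ["A", "中", "。"]:
    sizes = {enc: encoded_size(ch, enc) for enc in ENCODINGS}
    print(repr(ch), sizes)

# U+20000 lies in a Unicode extension area; GBK cannot encode it,
# so only the Unicode encodings are measured here.
rare = "\U00020000"
print(hex(ord(rare)), {enc: encoded_size(rare, enc) for enc in ENCODINGS[1:]})
```

Python's `str.encode` is used here only as a convenient way to count bytes; the database applies the same encodings internally but with its own storage overhead.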

A byte is a unit of measurement for the amount of data. In computing it is the unit used to measure storage capacity.

Characters are the letters, digits, and symbols used in a computer, such as 1, 2, 3, A, B, C, ~, !, #, ¥, %, *, (, ), -, +, and so on.

The difference when saving data

CHAR has a fixed length, which can be defined in the range 0-255 characters. When a stored value has fewer characters than the defined length, it is padded on the right with spaces. CHAR can store at most 255 characters, regardless of encoding.

VARCHAR has a variable length, which can be defined in the range 0-65535. When a stored value has fewer characters than the defined length, it is not padded with spaces. In theory a VARCHAR column can hold up to 65535 bytes, and for long values InnoDB may store the content on separate pages outside the clustered index. The stored content begins with a 1- or 2-byte prefix that records the actual length (2 bytes are needed if the length can exceed 255), so the usable length is always somewhat less than 65535 bytes. How many characters fit depends on the character set: with utf8 at most 21844 characters, with gbk at most 32766 characters, and with latin1, where each character occupies one byte, at most 65532 characters.
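As a rough illustration of the storage difference described above, the following Python sketch models the bytes a value would occupy in a CHAR and in a VARCHAR column. It is a simplified model, not the server's actual on-disk format: the function names are hypothetical, and a single-byte character set (latin1) is assumed by default so that CHAR(n) occupies exactly n bytes.

```python
def length_prefix_bytes(max_byte_length: int) -> int:
    """1 prefix byte if the column can never exceed 255 bytes, otherwise 2."""
    return 1 if max_byte_length <= 255 else 2


def varchar_stored_bytes(value: str, max_chars: int,
                         encoding: str = "latin-1",
                         max_bytes_per_char: int = 1) -> int:
    """Length prefix plus the encoded bytes of the actual value (no padding)."""
    if len(value) > max_chars:
        raise ValueError("value is longer than the declared column length")
    prefix = length_prefix_bytes(max_chars * max_bytes_per_char)
    return prefix + len(value.encode(encoding))


def char_stored_bytes(value: str, width: int, encoding: str = "latin-1") -> int:
    """Value right-padded with spaces to the declared width, then encoded."""
    if len(value) > width:
        raise ValueError("value is longer than the declared column length")
    return len(value.ljust(width).encode(encoding))


print(varchar_stored_bytes("abc", max_chars=10))   # 1-byte prefix + 3 data bytes
print(varchar_stored_bytes("abc", max_chars=300))  # 2-byte prefix + 3 data bytes
print(char_stored_bytes("abc", width=10))          # padded to 10 bytes
```

The model shows why short values in a wide VARCHAR column cost far less space than the same values in a CHAR column of equal width.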

GBK calculation method:

If a table has only one VARCHAR column:

32766 = (65535 - 1 - 2) / 2

The reason for minus 1 is that the actual storage starts from the second byte.

The reason for minus 2 is that the 2 bytes of the VARCHAR header represent the length.

The reason for dividing by 2 is that the character encoding is gbk, which uses up to 2 bytes per character.

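UTF8 calculation method:

21844 = (65535 - 1 - 2) / 3

The reason for minus 1 is that the actual storage starts from the second byte.

The reason for minus 2 is that the 2 bytes of the VARCHAR header represent the length.

The reason for dividing by 3 is that the character encoding is utf8, which uses up to 3 bytes per character.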

Practical example:

CREATE TABLE T11 (c INT, c2 CHAR(30), c3 VARCHAR(21812)) CHARSET=utf8;

21812 = (65535 - 1 - 2 - 4 - 30 * 3) / 3, rounded down.

The reason for minus 1 is that the actual storage starts from the second byte.

The reason for minus 2 is that the 2 bytes of the VARCHAR header represent the length.

The reason for minus 4 is that column c of type INT occupies 4 bytes.

The reason for minus 30 * 3 is that CHAR(30) occupies 90 bytes when the encoding is utf8.
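The arithmetic in the calculations above can be reproduced with a small Python sketch. The function name `max_varchar_chars` and its parameters are hypothetical. It assumes, as the text does, a 65535-byte row limit, 1 byte lost at the start of the row, and a 2-byte length prefix.

```python
MAX_ROW_BYTES = 65535  # row size limit used throughout the calculations above


def max_varchar_chars(bytes_per_char: int, other_column_bytes: int = 0) -> int:
    """Largest VARCHAR length (in characters) that still fits in one row.

    bytes_per_char:      maximum bytes per character in the table's charset
    other_column_bytes:  bytes taken by the other columns in the same row
    """
    usable = MAX_ROW_BYTES - 1 - 2 - other_column_bytes
    return usable // bytes_per_char  # integer division rounds down


print("gbk   :", max_varchar_chars(2))
print("utf8  :", max_varchar_chars(3))
print("latin1:", max_varchar_chars(1))

# The practical example: an INT (4 bytes) and a CHAR(30) in utf8 (30 * 3 bytes)
# share the row with the VARCHAR column.
print("T11.c3:", max_varchar_chars(3, other_column_bytes=4 + 30 * 3))
```

Integer division is used because a partial character cannot be stored. In the practical example, 65438 usable bytes divided by 3 leaves 2 bytes unused.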

The difference when fetching data

When fetching data from a CHAR column, the database strips the trailing spaces. When fetching data from a VARCHAR column, the trailing spaces of the data are retained.
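The store-and-fetch behaviour described above can be mimicked in plain Python. This is only a toy model with hypothetical function names, not a call into any database: it pads on store and strips on fetch for CHAR, and passes VARCHAR values through unchanged.

```python
def store_char(value: str, width: int) -> str:
    """Model of CHAR(width): pad on the right with spaces to the full width."""
    if len(value) > width:
        raise ValueError("value is longer than the declared column length")
    return value.ljust(width)


def fetch_char(stored: str) -> str:
    """Model of reading CHAR: trailing spaces are stripped."""
    return stored.rstrip(" ")


def store_varchar(value: str, width: int) -> str:
    """Model of VARCHAR(width): the value is kept exactly as given."""
    if len(value) > width:
        raise ValueError("value is longer than the declared column length")
    return value


def fetch_varchar(stored: str) -> str:
    """Model of reading VARCHAR: trailing spaces are retained."""
    return stored


original = "ab  "  # two letters followed by two spaces
print(repr(fetch_char(store_char(original, 5))))
print(repr(fetch_varchar(store_varchar(original, 5))))
```

One consequence worth noting: a value that deliberately ends in spaces does not survive a round trip through the CHAR model, while it does through the VARCHAR model.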

The impact of the storage engine on the choice between CHAR and VARCHAR:

For the MyISAM storage engine: it is best to use fixed-length columns instead of variable-length columns. This makes the entire table static, which makes data retrieval faster, trading space for time.

For the InnoDB storage engine: use variable-length columns. InnoDB's row storage format does not distinguish between fixed-length and variable-length columns, so CHAR is not necessarily better than VARCHAR. Because VARCHAR is stored according to the actual length, it saves space, which benefits both disk I/O and the total amount of data stored.
