What are the new features of MySQL 8.0 02/13 Update SLTechnology News&Howtos

What are the new features of MySQL 8.0

2026-02-13 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

Editor to share with you what are the new features of MySQL 8.0. I hope you will gain something after reading this article. Let's discuss it together.

Geographic Information system (GIS)

Version 8.0 provides support for terrain, including support for data source information of spatial frames of reference, SRS aware spatial data types, spatial indexes, and spatial functions. All in all, version 8.0 can understand the longitude and latitude information of the earth's surface, and can calculate the distance between any two points on the earth in any supported 5000 spatial reference frames.

MySQL 8.0 delivers geography support. This includes meta-data support for Spatial Reference System (SRS), as well as SRS aware spatial datatypes, spatial indexes, and spatial functions. In short, MySQL 8.0 understands latitude and longitude coordinates on the earth's surface and can, for example, correctly calculate the distances between two points on the earths surface in any of the about 5000 supported spatial reference systems.

Spatial frame of reference Spatial Reference System (SRS)

ST_SPATIAL_REFERENCE_SYSTEMS exists in the information schema view library and provides the name of the SRS coordinate system available for use.

Each SRS coordinate system has an SRID number. Version 8.0 supports more than 5, 000 coordinate systems in EPSG Geodetic Parameter Dataseset (including stereo and 2D plane earth models)

The ST_SPATIAL_REFERENCE_SYSTEMS information schema view provides information about available spatial reference systems for spatial data. This view is based on the SQL/MM (ISO/IEC 13249-3) standard.

Each spatial reference system is identified by an SRID number. MySQL 8.0 ships with about 5000 SRIDs from the EPSG Geodetic Parameter Dataset, covering georeferenced ellipsoids and 2d projections (i.e. All 2D spatial reference systems).

SRID Geographic data Type SRID aware spatial datatypes

The data type of the spatial class can be obtained directly from the definition of the SRS coordinate system, for example: using the SRID 4326 definition to build the table: CREATE TABLE T1 (g GEOMETRY SRID 4326);.

SRID is a data type suitable for geographic types. Only data from the same SRID will be inserted into the row. An error is reported when you try to insert data with the current SRID data type. Tables that do not define SRID numbers will accept all SRID numbered data.

Spatial datatypes can be attributed with the spatial reference system definition, for example with SRID 4326 like this: CREATE TABLE T1 (g GEOMETRY SRID 4326)

The SRID is here a SQL type modifier for the GEOMETRY datatype. Values inserted into a column with an SRID property must be in that SRID. Attempts to insert values with other SRIDs results in an exception condition being raised. Unmodified types, i.e., types with no SRID specification, will continue to accept all SRIDs, as before.

Version 8.0 adds the INFORMATION_SCHEMA.ST_GEOMETRY_COLUMNS view, which can display the data rows of all geographic information in the current instance and their corresponding SRS names, numbers, and geographic type names.

MySQL 8.0 adds

The INFORMATION_SCHEMA.ST_GEOMETRY_COLUMNS view as specified in SQL/MM Part 3, Sect. 19.2. This view will list all GEOMETRY columns in the MySQL instance and for each column it will list the standard SRS_NAME, SRS_ID, and GEOMETRY_TYPE_NAME.

SRID spatial index SRID aware spatial indexes

Spatial indexes can be created on spatial data types, and the columns that create spatial indexes must be non-empty, for example: CREATE TABLE T1 (g GEOMETRY SRID 4326 NOT NULL, SPATIAL INDEX (g))

Spatial indexes can be created on spatial datatypes. Columns in spatial indexes must be declared NOT NULL. For example like this: CREATE TABLE T1 (g GEOMETRY SRID 4326 NOT NULL, SPATIAL INDEX (g))

The column that creates the spatial index must have a SRID data identity for use by the optimizer, and if the spatial index is built on a column that does not have a SRID data identity, the waring information will be output.

Columns with a spatial index should have an SRID type modifier to allow the optimizer to use the index. If a spatial index is created on a column that doesn't have an SRID type modifier, a warning is issued.

SRID space function

8.0 adds parameters such as ST_Distance () and ST_Length () to determine whether the data is in SRS and calculates its spatial distance. So far, ST_Distance and other spatial relational functions such as ST_Within,ST_Intersects,ST_Contains,ST_Crosses support geographic computing. See SQL/MM Part 3 Spatial for its operational logic and behavior.

MySQL 8.0 extends spatial functions such as ST_Distance () and ST_Length () to detect that its parameters are in a geographic (ellipsoidal) SRS and to compute the distance on the ellipsoid. So far, ST_Distance and spatial relations such as ST_Within, ST_Intersects, ST_Contains, ST_Crosses, etc. Support geographic computations. The behavior of each ST function is as defined in SQL/MM Part 3 Spatial.

Character set Character Sets

Version 8.0 uses UTF8MB4 as the default character set by default. Compared to version 5.7, SQL performance (such as sorting UTF8MB4 strings) has been greatly improved. UTF8MB4 type is playing an important role in web coding. Setting it as the default data type will bring convenience to the vast majority of MySQL users.

MySQL 8.0 makes UTF8MB4 the default character set. SQL performance-such as sorting UTF8MB4 strings-has been improved by a factor of 20 in 8.0 as compared to 5.7. UTF8MB4 is the dominating character encoding for the web, and this move will make life easier for the vast majority of MySQL users.

The default character set changes from latin1 to utf8mb4, and the default collation rule changes from latin1_swedish_ci to utf8mb4_800_ci_ai.

The default character set has changed from latin1 to utf8mb4 and the default collation has changed from latin1_swedish_ci to utf8mb4_800_ci_ai.

Utf8mb4 has also become the default coding of libmysql, server command line tool, and server layer.

The changes in defaults applies to libmysql and server command tools as well as the server itself.

Utf8mb4 has also become the default coding of the MySQL testing framework.

The changes are also reflected in MTR tests, running with new default charset.

The weight and case of the sorting and proofreading rules are based on version 9.0.0 of Unicode published by the Unicode Committee in 16 years.

The collation weight and case mapping are based on Unicode 9.0.0, announced by the Unicode committee on Jun 21, 2016.

In previous versions of MySQL, the special case and collation rules of 21 languages in latin1 coding were introduced into utf8mb4 collation rules. For example, the Czech sorting and proofreading rule has become utf8mb4_cs_800_ai_ci.

The 21 language specific case insensitive collations available for latin1 (MySQL legacy) have been implemented forutf8mb4 collations, for example the Czech collation becomes utf8mb4_cs_800_ai_ci. See complete list in WL#9108. See blog post by Xing Zhang here.

Added support for special context and stress-sensitive sorting and proofreading rules. Version 8.0 supports all three levels of DUCET (Default Unicode Collation Entry Table) sorting and proofreading rules.

Added support for case and accent sensitive collations. MySQL 8.0 supports all 3 levels of collation weight defined by DUCET (Default Unicode Collation Entry Table). See blog post by Xing Zhang here.

Utf8mb4's utf8mb4_ja_0900_as_cs sort check rule supports three-level weight sorting for Japanese characters.

Japanese utf8mb4_ja_0900_as_cs collation for utf8mb4 which sorts characters by using three levels' weight. This gives the correct sorting order for Japanese. See blog post by Xing Zhang here.

There is additional Kana support for Japanese, and ks in utf8mb4_ja_0900_as_cs_ks indicates katakana distinction.

Japanese with additional kana sensitive feature, utf8mb4_ja_0900_as_cs_ks, where 'ks' stands for' kana sensitive'. See blog post by Xing Zhang here.

Change unpadding in all sort checking rules prior to Unicode 9.0.0 into padding characters, which helps to improve string consistency and performance. For example, treat the space at the end of a string as other characters. The previous sort check rule left the string as it was when dealing with this situation.

Changed all new collations, from Unicode 9.0.0 forward, to be NO PAD instead of PAD STRING, ie., treat spaces at the end of a string like any other character. This is done to improve consistency and performance. Older collations are left in place.

See also blog posts by Bernt Marius Johnsen here, here and here.

Data type Datatypes Bit-wise operation of binary data type

Version 8.0 extends the use of bit-wise operations (such as bit-wise AND, etc.) to all BINARY data types. Previously, only integer data was supported. If you force a Bit-wise operation on a binary data type, it will be implicitly converted to a 64-bit BITINT type, and several bits of data may be lost.

Since version 8.0, bit-wise operations can be used on BINARY and BLOB types without worrying about reduced accuracy.

MySQL 8.0 extends the bit-wise operations ('bit-wise AND', etc) to also work with [VAR] BINARY/ [TINY | MEDIUM | LONG] BLOB. Prior to 8.0 bit-wise operations were only supported for integers. If you used bit-wise operations on binaries the arguments were implicitly cast to BIGINT (64 bit) before the operation, thus possibly losing bits.

From 8.0 and onward bit-wise operations work for all BINARY and BLOB data types, casting arguments such that bits are not lost.

IPV6 operation

Version 8.0 improves the operability of IPv6 data by supporting Bit-wise operations on BINARY. Version 5.6 introduces the INET6_ATON () and INET6_NTOA () functions that support the translation of IPv6 addresses and 16-bit binary data to and from each other.

But until 8.0, we couldn't combine IPv6 conversion functions with bit-wise operations because of the problems in the previous paragraph. Because INET6_ATON () correctly returns 128bit's VARBINARY (16), if we want to compare an IPv6 address to a gateway address, we can now use the INET6_ATON (address) & INET6_ATON (network) operation.

MySQL 8.0 improves the usability of IPv6 manipulation supporting bit-wise operations on BINARY data types. In MySQL 5.6we introduced the INET6_ATON () and INET6_NTOA () functions which convert IPv6 addresses between text form like 'fe80::226:b9ff:fe77:eb17' and VARBINARY (16).

However, until now we could not combine these IPv6 functions with bit-wise operations since such operations would-wrongly-convert output to BIGINT. For example, if we have an IPv6 address and want to test it against a network mask, we can now use INET6_ATON (address) & INET6_ATON (network) because INET6_ATON () correctly returns the VARBINARY (16) datatype (128bits). See blog post by Catalin Besleaga here.

UUID operation

Version 8.0 improves the availability of UUID by adding three new functions (UUID_TO_BIN (), BIN_TO_UUID (), and IS_UUID ()). UUID_TO_BIN () can convert text in UUID format to VARBINARY (16), while BIN_TO_UUID (), in contrast, IS_UUID () is used to verify the validity of UUID. Once the UUID is stored as VARBINARY (16), you can use a practical index.

The UUID_TO_BIN () function can move the time-dependent bits in the original converted binary value (time-related when UUID is generated) to the beginning of the data, which is more friendly to the index and reduces random insertion in the B-tree, thus reducing insertion time.

MySQL 8.0 improves the usability of UUID manipulations by implementing three new SQL functions: UUID_TO_BIN (), BIN_TO_UUID (), and IS_UUID (). The first one converts from UUID formatted text to VARBINARY (16), the second one from VARBINARY (16) to UUID formatted text, and the last one checks the validity of an UUID formatted text. The UUID stored as a VARBINARY (16) can be indexed using functional indexes.

The functions UUID_TO_BIN () and UUID_TO_BIN () can also shuffle the time-related bits and move them at the beginning making it index friendly and avoiding the random inserts in the B-tree, this way reducing the insert time. The lack of such functionality has been mentioned as one of the drawbacks of using UUID's. See blog post by Catalin Besleaga here.

Fourth, consumption-sensitive model query optimizer will take care of data buffering

Version 8.0 automatically selects a query plan based on whether the data exists in memory. In previous versions, consumption-sensitive models always assumed that the data was on disk.

Because the consumption constant of querying memory data is different from that of querying hard disk data, the optimizer will choose a more optimized way to read data according to the location of the data.

MySQL 8.0 chooses query plans based on knowledge about whether data resides in-memory or on-disk. This happens automatically, as seen from the end user there is no configuration involved. Historically, the MySQL cost model has assumed data to reside on spinning disks.

The cost constants associated with looking up data in-memory and on-disk are now different, thus, the optimizer will choose more optimal access methods for the two cases, based on knowledge of the location of data. See blog post by 0.ystein Gr ø vlen here.

Query optimizer histogram

Histogram statistics are added in version 8.0. Users can generate data distribution statistics for a column in the table (usually non-indexed columns) based on the histogram, so that the optimizer can use this information to find a more optimized query plan.

The most common use scenario for histograms is to calculate the selectivity of fields.

MySQL 8.0 implements histogram statistics. With Histograms, the user can create statistics on the data distribution for a column in a table, typically done for non-indexed columns, which then will be used by the query optimizer in finding the optimal query plan.

The primary use case for histogram statistics is for calculating the selectivity (filter effect) of predicates of the form "COLUMN operator CONSTANT".

The ANALYZE TABLE syntax used to create histograms has been extended with two new clauses: UPDATE HISTOGRAM ON column [, column] [WITH n BUCKETS] and DROP HISTOGRAM ON column [, column].

The total number of histograms (buckets) is optional, with a default of 100. The statistics of the histogram are stored in the dictionary table column_statistics and can be used

Check it out by information_schema.COLUMN_STATISTICS. Due to the flexibility of the JSON data format, histograms are now stored as JSON objects.

According to the size of the table, the ANALYZE TABLE command will automatically determine whether the table should be sampled or not, and even create equal-frequency or equal-height histograms according to the distribution and total amount of data in the table.

The user creates a histogram by means of the ANALYZE TABLE syntax which has been extended to accept two new clauses: UPDATE HISTOGRAM ON column [, column] [WITH n BUCKETS] and DROP HISTOGRAM ON column [, column].

The number of buckets is optional, the default is 100. The histogram statistics are stored in the dictionary table "column_statistics" and accessible through the view information_schema.COLUMN_STATISTICS. The histogram is stored as a JSON object due to the flexibility of the JSON datatype.

ANALYZE TABLE will automatically decide whether to sample the base table or not, based on table size. It will also decide whether to build a singleton or an equi-height histogram based on the data distribution and the number of buckets specified. See blog post by Erik Fr ø seth here.

Regular expressions

Along with UTF8MB4's regular support, version 8.0 also adds new functions such as REGEXP_INSTR (), REGEXP_LIKE (), REGEXP_REPLACE (), and REGEXP_SUBSTR ().

In addition, regexp_stack_limit (default 8000000 bits) and regexp_time_limit (default 32 steps) parameters are added to the system to control the causality of regular expressions. REGEXP_REPLACE () is also a popular feature in the community.

MySQL 8.0 supports regular expressions for UTF8MB4 as well as new functions like REGEXP_INSTR (), REGEXP_LIKE (), REGEXP_REPLACE (), and REGEXP_SUBSTR ().

The system variables regexp_stack_limit (default 8000000 bytes) and regexp_time_limit (default 32 steps) have been added to control the execution. The REGEXP_REPLACE () function is one of the most requested features by the MySQL community, for example see feature request reported as BUG # 27389 by Hans Ginzel. See also blog posts by Martin Hansson here and Bernt Marius Johnsen here.

VI. Automation characteristics of operation and maintenance

Developers and operators are concerned about the operational nature of database instances, that is, reliability, availability, performance, security, observability, and manageability. We will introduce the reliability of InnoDB Cluster and MGR separately in a new article, and the following paragraphs will introduce other operational changes to the table in version 8.0.

Dev Ops care about operational aspects of the database, typically about reliability, availability, performance, security, observability, and manageability. High Availability comes with MySQL InnoDB Cluster and MySQL Group Replication which will be covered by a separate blog post. Here follows what 8.0 brings to the table in the other categories.

VII. Reliability

Version 8.0 increases reliability overall for the following reasons:

MySQL 8.0 increases the overall reliability of MySQL because:

Version 8.0 stores meta-information in InnoDB, a proven transactional storage engine. Tables such as user permissions and data dictionaries are now stored in InnoDB.

MySQL 8.0 stores its meta-data into InnoDB, a proven transactional storage engine. System tables such as Users and Privileges as well as Data Dictionary tables now reside in InnoDB.

Version 8.0 eliminates a hidden danger that can lead to inconsistency. In versions 5.7 and earlier, there were two data dictionaries at the service layer and engine layer, which could lead to failure of synchronization between data dictionaries in case of failure. In version 8.0, there is only one data dictionary.

MySQL 8.0 eliminates one source of potential inconsistency. In 5.7 and earlier versions there are essentially two data dictionaries, one for the Server layer and one for the InnoDB layer, and these can get out of sync in some crashing scenarios. In 8.0 there is only one data dictionary.

3.8.0 version has realized atomization, no fear of downtime of DDL. Depending on this feature, DDL statements are either fully executed or not executed at all. This is critical for the replication environment, otherwise it will lead to data drift between the master and slave because of the inconsistent table structure.

MySQL 8.0 ensures atomic, crash safe DDL. With this the user is guaranteed that any DDL statement will either be executed fully or not at all. This is particularly important in a replicated environment, otherwise there can be scenarios where masters and slaves (nodes) get out of sync, causing data-drift.

Based on the new transactional data dictionary, the reliability has been improved.

This work is done in the context of the new, transactional data dictionary. See blog posts by Staale Deraas here and here.

After reading this article, I believe you have a certain understanding of "what are the new features of MySQL 8.00.If you want to know more about it, you are welcome to follow the industry information channel. Thank you for reading!"

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.