Father of MySQL interprets the latest highlights of the database

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Hello, everyone. I'm Michael Widenius of MariaDB. Today I'm going to talk briefly about the new features of MariaDB 10.5 and what we plan to do next. 10.5 is already at RC and should go GA next Thursday, so it's very close.

Video: Monty's analysis of the new features of MariaDB 10.5 (full talk, Tencent Video, v.qq.com)

I'll start with the features I personally added to MariaDB. I still write code, and I spend at least half my time doing it. In fact, during COVID-19 I have spent 90% of my time on it, which is good.

One of the things I added to MariaDB 10.5 is the S3 engine. Users asked me for a solution that lets them store historical data on cheap, reliable storage. S3 is a read-only engine.

Data stored on S3 can optionally be compressed to save a little storage space. The important point is that it works with a variety of storage services and that everything runs as usual, including replication and all keys. It behaves like a regular table: you simply use it and generally do not have to think about it.

To convert a table, you can simply ALTER the table into an S3 table, or change it back. You can have a master and a slave that each use different S3 storage, in which case replication works as normal. The interesting case is that S3 is shared storage, so the master and the slave can use the same S3 storage. Then, when the master changes a table, the slave can read the change directly without doing anything.

You can add a new MariaDB server at any time. Tables are discovered automatically, so the only thing you need to do is give the new slave the S3 authentication details. It can then access all the tables in S3, with nothing to copy, and it notices changes automatically.

So far, MariaDB has stayed highly compatible with MySQL: commands, naming and so on are almost all the same. One thing that now distinguishes them is the binary names. In 10.4 the mariadb names were symbolic links; in 10.5 the binaries themselves carry the mariadb prefix, and the old names are kept as symbolic links so that old scripts and tools keep working. A big change is that in the process list you now see MariaDB rather than MySQL, which lets you run MySQL and MariaDB at the same time and know which is which.

We have extended the DDL syntax. First, we added comments to the CREATE DATABASE statement. We made renaming within tables easier, and we also made some fixes to ALTER TABLE and RENAME TABLE. Tables are copied in strict mode, and the engine automatically adds IF EXISTS to the statement in the binlog so the slave knows whether it can skip it, because with a shared table the table may no longer exist there.

We already had RETURNING, but this time we added RETURNING syntax for INSERT and REPLACE. RETURNING sends the changed rows back to the client, so for INSERT it means that all the inserted rows can be returned. This can be important, for example with REPLACE:

With REPLACE, some existing rows are deleted and replaced, and with RETURNING you can see what was actually written (REPLACE either inserts a new row or replaces an existing one, so this helps you tell which rows were real replacements and which were plain inserts).

We have also added the ALL modifier for EXCEPT and INTERSECT (that is, the set operation keeps duplicate rows instead of removing them), which MariaDB has and MySQL does not. We have also done some work to make it easy to migrate from MySQL 8.0 to MariaDB. As you may know, we store data differently, and MySQL has done some strange things, so we are not compatible with MySQL 8.0 at the block and disk level. The SQL level is compatible, however, so most applications can migrate to MariaDB easily.
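The difference between the plain and the ALL forms of EXCEPT and INTERSECT can be sketched in a few lines of Python. This is only an illustration of the multiset semantics, not MariaDB code; the function names and the sample lists are made up for the example.

```python
from collections import Counter


def intersect_all(left, right):
    """A row found m times on the left and n times on the right is kept min(m, n) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left, right):
    """A row found m times on the left and n times on the right is kept max(m - n, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


def intersect_distinct(left, right):
    """Plain INTERSECT: duplicates are removed from the result."""
    return sorted(set(left) & set(right))


def except_distinct(left, right):
    """Plain EXCEPT: duplicates are removed from the result."""
    return sorted(set(left) - set(right))


# Hypothetical single-column result sets.
left_rows = [1, 1, 1, 2, 3]
right_rows = [1, 1, 2, 2]

print(intersect_all(left_rows, right_rows))       # [1, 1, 2]
print(intersect_distinct(left_rows, right_rows))  # [1, 2]
print(except_all(left_rows, right_rows))          # [1, 3]
print(except_distinct(left_rows, right_rows))     # [3]
```

Note how EXCEPT ALL still returns one copy of 1, because the left side has three copies and the right side only removes two.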

As for upgrading, we have done a lot of work to make it easy to move from other versions to 10.5. There is no need to change anything; the upgrade tool handles it automatically.

Some of our users are sensitive about the terms master and slave, so we added SHOW BINLOG STATUS as a new name for the SHOW MASTER STATUS statement.

We have added new function support for JSON to make working with JSON friendlier.

Many people think that the SUPER privilege in MariaDB is too powerful and too hard to control, so we split it into smaller privileges. This means that you can, for example, give a user permission to work with the binlog on its own, without promoting him to superuser. Of course, to use this feature users need to make a small change to their grants, which I think is the only change users have to adapt to in version 10.5.

In this version we have done a lot of work on InnoDB. We have three InnoDB developers, and they have done a great deal to improve InnoDB performance and make it faster. At the same time, many parameters have been removed because they had become almost meaningless, and some parameters have had their default values changed. For example, we know that setting innodb_log_files_in_group to a high value brings no performance improvement, so it is now set to 1 by default.

The following variables are deprecated (only some of them are shown):

innodb_checksums
innodb_log_checksums
innodb_locks_unsafe_for_binlog
innodb_stats_sample_pages
innodb_undo_logs
innodb_rollback_segments
innodb_log_optimize_ddl

However, they can still be used in the configuration file; the server only writes a warning to the error log at startup. Cleaning up these settings is a good thing, and we believe upgrading stays easy, so we made the change.

You can see the state of the InnoDB engine through the InnoDB status output. There are a lot of performance improvements, and if you migrate from an old version of InnoDB you will notice them clearly.

Here are some of the improvements. A significant change is the new thread pool (not the one for connections, but a pool for background threads). Previously, many threads were started at startup, for LRU flushing for example, even if they were never used. We now have a general thread pool whose threads are created on demand and can shrink again automatically.

If you are using NVDIMM persistent storage, we have optimized for it so that data can be written directly to persistent storage, avoiding the overhead of the file system. This significantly improves some aspects of MariaDB performance, basically reducing the cost of syncing the log file to zero. MariaDB keeps up to date and stays compatible with the latest hardware. We also work with CPU vendors, such as ARM, to ensure that MariaDB runs well on ARM.

One of the complaints we received about 10.4 concerned the Performance Schema. In 10.5 it has been updated to match MySQL 5.7, so the performance_schema tables can provide more information about MariaDB. We already had a good interface for looking at memory usage, and now we also support the MySQL 5.7 interface.

When I worked on MySQL, I was particularly concerned about performance. One of the reasons MySQL became so popular is that it performed better than other databases. We have kept this advantage in MariaDB 10.5. I wrote some of the new code myself, and it is worth noting that the improved binaries are smaller and faster. I also made some improvements to the range optimizer, removing some minor problems the optimizer had in version 10.4. At the same time, I improved the optimizer so that its cost estimates better match the different storage engines.

For example, for the MEMORY engine, access cost used to be calculated as if the data were stored on disk, even though it is in memory. This was resolved in version 10.5: MariaDB now knows that in-memory tables are faster to process and calculates the cost of MEMORY tables more accurately.

I have long emphasized that, compared with MySQL, MariaDB establishes connections between client and server very quickly. In 10.5 we made connecting 25% faster. This is why many MariaDB users do not need connection pooling to improve performance, as they would with MySQL: database connections in MariaDB are fast in themselves.

In addition, in the many scenarios where no password is needed to access the database, we use special logic for this case, which makes database connections faster still.

We have made many improvements to Galera in 10.5. One of the most important is support for MariaDB's Global Transaction ID. Galera now supports all the latest features of MariaDB, which makes Galera easier and safer to use.

With regard to master-slave replication, as I mentioned earlier, REPLICA is now supported as a synonym for SLAVE in SQL statements. We have also extended the binlog metadata with new fields. In MariaDB 10.5 and later, this makes it easier to add new data types.

If the master and the slave use different data types, we need more information, such as the original column definitions, to handle the more complex cases. The new metadata makes replication safer than before. I also mentioned IF EXISTS: for the S3 engine we can apply IF EXISTS without rewriting the query. A flag tells whether the statement carries an implicit IF EXISTS.

With regard to the optimizer, we did a lot of internal work to make it more reliable. The ANALYZE statement reports more information, and optimizer_trace is faster than it used to be. We also use less disk space for filesort. Previously, if you sorted on a VARCHAR key, the full key length was stored for every row in the sort. In 10.5 we only store the data that is actually used, which makes filesort faster for VARCHAR, CHAR and BLOB.
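A rough back-of-the-envelope sketch shows why storing only the bytes actually used shrinks the sort data. This is not MariaDB's real key layout: the declared length of 255 bytes and the 2-byte length prefix are assumptions chosen for the example.

```python
def fixed_width_key_bytes(values, max_len):
    """Every sort key is padded to the column's declared maximum length."""
    return len(values) * max_len


def packed_key_bytes(values, length_prefix=2):
    """Every sort key stores a small length prefix plus only the bytes in use."""
    return sum(length_prefix + len(value.encode("utf-8")) for value in values)


# Hypothetical VARCHAR(255) column with short values.
names = ["Ann", "Bob", "Christopher"]

print(fixed_width_key_bytes(names, max_len=255))  # 765
print(packed_key_bytes(names))                    # 23
```

The shorter the typical value is relative to the declared length, the larger the saving, which means more rows fit in the sort buffer before anything spills to disk.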

Beyond what I have mentioned, there are a lot of small improvements. We updated the regular expression library to the latest version. We increased the maximum key length in the Aria engine from 1000 bytes to 2000. This matters because the S3 storage engine is based on the Aria code, and we have users converting tables between InnoDB and S3 who have very long keys; the change was made mainly for them.

A lot of new parameters have been added in 10.5. Because there are so many, I only put links here. The Knowledge Base (http://) has the documentation for the latest features; search the Knowledge Base for 10.5 and you can find everything in these slides.

There are a few more features we want to add to 10.5 before the release. Before GA we want to add automatic use of hash joins. In fact, version 10.4 and the current 10.5 already support hash joins, but users need to set some parameters for them to take effect. We are also developing optimizer logic to detect automatically when a hash join performs better than a normal join.

There are situations where hash joins always perform better than normal joins. For example, we automatically use hash joins in 10.5 when no indexes are available. If the user forgets to add the index, version 10.5 will be much faster than 10.4 because the system automatically uses a hash join.
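To see why a hash join helps when no index is available, here is a minimal sketch in Python. It is a textbook illustration rather than MariaDB's implementation, and the table contents and function names are invented for the example.

```python
from collections import defaultdict


def nested_loop_join(orders, customers):
    """Without an index, every order is compared with every customer."""
    result = []
    for order_id, customer_id in orders:
        for cust_id, name in customers:
            if customer_id == cust_id:
                result.append((order_id, name))
    return result


def hash_join(orders, customers):
    """Build a hash table on one input, then probe it once per row of the other."""
    buckets = defaultdict(list)
    for cust_id, name in customers:        # build phase
        buckets[cust_id].append(name)
    result = []
    for order_id, customer_id in orders:   # probe phase
        for name in buckets.get(customer_id, []):
            result.append((order_id, name))
    return result


# Hypothetical tables: (order_id, customer_id) and (customer_id, name).
orders = [(1, 10), (2, 20), (3, 10), (4, 99)]
customers = [(10, "Ann"), (20, "Bob")]

print(hash_join(orders, customers))  # [(1, 'Ann'), (2, 'Bob'), (3, 'Ann')]
```

Both functions return the same rows, but the nested loop does work proportional to the product of the two table sizes, while the hash join does work roughly proportional to their sum.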

The ColumnStore engine has changed a lot in 10.5. I did not put much on this slide, because ColumnStore could be a complete talk of its own. Interestingly, in this storage engine each column is stored separately, in its own binary file. ColumnStore is a distributed engine optimized for analytical queries, and it can quickly analyze and process petabyte-scale data.
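The idea of storing each column separately can be sketched with plain Python lists. This is a toy model of a columnar layout, not the engine's actual storage format; the sample rows and function names are made up.

```python
def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount(columns):
    """An aggregate over one column reads only that column's values."""
    return sum(columns["amount"])


# Hypothetical row-oriented data.
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 50},
]

columns = to_columns(rows)
print(columns["region"])      # ['EU', 'US', 'EU']
print(total_amount(columns))  # 250
```

An analytical query that sums one column never touches the other columns, which is the main reason column-oriented storage suits this kind of workload.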

In MariaDB 10.5, this storage engine is pluggable and has its own RPM package, so users can easily add it to or remove it from the server. This also makes it easier for us to optimize ColumnStore and contribute to it within MariaDB, because we no longer need separate binaries.

I don't know if you have followed our news, but about a year ago MariaDB acquired Clustrix, a distributed storage engine that supports transactions. Interestingly, because Clustrix is distributed, it can scale to handle an unlimited number of write requests.

For example, suppose a user has a cluster with three storage nodes. After the user adds three more nodes, the write capacity doubles. This feature will be released in the first version of SkySQL. SkySQL is MariaDB's cloud database product, and we are still deciding how to bring this feature to the community version of MariaDB. As far as I know, the current plan is that paying users will be able to use all the features of Clustrix, and it will take some time before Xpand is released in the community version.

We are glad that Tencent has made a lot of code contributions to MariaDB. One of the biggest differences between MariaDB and MySQL today is that MariaDB interacts better with the community and incorporates its code changes and contributions. Tencent is one of the great code contributors, and we hope Tencent will contribute even more to MariaDB in the future.

Here are some features that Tencent has contributed to MariaDB:

Compression of events in the binary log, which makes the binlog smaller. We have also done column compression together with Tencent, which MySQL does not support. Some versions of MySQL support something similar, but with restrictions. Tencent has made a great contribution to MariaDB in terms of column compression.

We improved InnoDB's instant ADD COLUMN feature in the same way. Tencent released a version supporting instant ADD COLUMN before MariaDB had it in its DDL, and Tencent developers happened to attend a MariaDB meeting in China. After the meeting, we discussed with Tencent's R&D team how they had implemented instant ADD COLUMN and whether we could extend and improve on that basis.

This work was done in MariaDB 10.3 and 10.4. In version 10.4, building on the earlier code, we added many more instant operations. Many InnoDB ALTER TABLE operations are now instant, and you can add and drop columns without users noticing and without taking the database offline.

A few weeks ago I received a lot of code contributions to add to MariaDB 10.5. DROP TABLE FORCE is a feature I implemented myself, and my task is to make sure it gets added in some form. Some of the other code contributions on this slide relate to 10.5 features that Marcel (a MariaDB developer) is working on. We hope they will make it into version 10.5; if not, they will be released in 10.6.

The optimization for dropping large tables had already been implemented by Marcel's team before we received the code contribution, and they will refer to the contributed code to arrive at the best result in the final release.
