Simple practice of Mycat Sub-Library and Table

2025-02-23 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 06/01 Report --

For MySQL, read-write separation is only part of a scaling solution. To scale further you will inevitably need sub-databases and sub-tables (sharding), and fortunately Mycat already implements this. I took some time to test it today, and it works well.

About sub-database and sub-table

I have also sorted out these sub-database and sub-table concepts; divided into strategies or stages, they roughly fall into the following categories.

The first is to split the table directly. For example, database db1 contains three tables test1, test2, and test3, but what you see through the middleware is still a single table test. Splitting the data this way relieves pressure to a certain extent; in its details it resembles the approach of a partitioned table.

The following categories improve on this step by step: the table test is split across multiple databases and multiple servers, and combined with read-write separation the full set of splits brings a great improvement. From this point of view, database middleware does a great deal of work between the application and the databases, and its popularity owes to more than just technical reasons.

Simulating a test environment for sub-databases and sub-tables

If the sub-database and sub-table setup is to be tested on a single server, and the architecture is required to be comprehensive, the question is whether it can be realized as a reference for judging technical feasibility.

If you simulate a one-master, two-slaves architecture with the services distributed across three servers, such a scheme requires creating nine instances, and each instance needs its three databases split separately.

The general configuration is as follows:

Master1: port 33091
(M1) Slave1: port 33092
(M1) Slave2: port 33093

Master2: port 33071
(M2) Slave1: port 33072
(M2) Slave2: port 33073

Master3: port 33061
(M3) Slave1: port 33062
(M3) Slave2: port 33063
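As a quick sanity check, the topology above can be tallied in a few lines. This is a throwaway sketch of the layout described in the text (ports as listed), not anything Mycat or MySQL actually consumes:

```python
# The simulated topology: three masters, each with two slaves (ports from the text).
topology = {
    33091: [33092, 33093],  # master1 and its slaves
    33071: [33072, 33073],  # master2 and its slaves
    33061: [33062, 33063],  # master3 and its slaves
}

# Collect every distinct instance port: 3 masters + 6 slaves.
instances = sorted(set(topology) | {p for slaves in topology.values() for p in slaves})
print(len(instances))  # 9 instances in total, as required above
```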

To put it in a picture: there is a table under each of db1, db2, and db3 that needs to be sharded.

So that's what we need to simulate.

Several minor problems encountered while using Mycat

While using Mycat I ran into several minor problems that felt representative, so I recorded them.

Question 1:

After connecting through Mycat, if you do not switch to the specific database but instead reference a table in database-name.table-name form, the following error is thrown. It shows that Mycat intercepts and filters the SQL along the way and cannot find a destination route during the rewrite. Of course, with standard usage in practice this problem does not arise.

mysql> select * from db1.shard_auto;
ERROR 1064 (HY000): find no Route:select * from db1.shard_auto

Question 2:

After the sharding policy is configured, an insert without an explicit column list throws the following error; for a sharded table, spelling out the column list (insert into shard_mod_long (ID, name, shard_date) values ...) is a basic syntax requirement.

mysql> insert into shard_mod_long values (1);
ERROR 1064 (HY000): partition table, insert must provide ColumnList

Question 3:

If the sharding policy is misconfigured, reads from the table may well look normal while DML fails with a data conflict. How to configure sharding correctly is covered below.

mysql> select * from shard_mod_long;
Empty set (0.00 sec)

mysql> insert into shard_mod_long (ID,name,shard_date) values (1,'aa','2017-09-06');
ERROR 1105 (HY000): Duplicate entry '1' for key 'PRIMARY'

Question 4:

If sharding is misconfigured, you are likely to end up with multiple redundant copies of the data.

A glance at the execution plan shows, through DATA_NODE, that the data is routed to multiple target databases.

mysql> explain insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06');
+-----------+---------------------------------------------------------------------------+
| DATA_NODE | SQL                                                                       |
+-----------+---------------------------------------------------------------------------+
| pxcNode11 | insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06')  |
| pxcNode21 | insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06')  |
| pxcNode31 | insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06')  |
+-----------+---------------------------------------------------------------------------+

If you actually need this replicate-everywhere behavior it can be useful, but as sharding it defeats the purpose. The problem lies in the table configuration below.

You need to remove the type="global" attribute from the table definition so that the table is actually sharded.
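For reference, the difference in schema.xml might look like the following sketch. The node names come from the explain output above; the other attribute values are assumptions:

```xml
<!-- type="global" replicates every write to all listed nodes,
     which produced the three copies seen in the explain output: -->
<table name="shard_auto" primaryKey="ID" type="global"
       dataNode="pxcNode11,pxcNode21,pxcNode31" />

<!-- Remove type="global" and attach a sharding rule so the table actually shards: -->
<table name="shard_auto" primaryKey="ID" rule="auto-sharding-long"
       dataNode="pxcNode11,pxcNode21,pxcNode31" />
```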

Sharding Strategy in Mycat

Mycat's sharding strategies are very rich, richer than I expected, and they are one of Mycat's highlights.

The common sharding rules are as follows; there are other sharding methods that are not all listed here:

(1) shard by enumeration: sharding-by-intfile

(2) primary key range: auto-sharding-long

(3) consistent hash: sharding-by-murmur

(4) string hash: sharding-by-stringhash

(5) shard by date (day): sharding-by-date

(6) shard by hour: sharding-by-hour

(7) natural month: sharding-by-month

Before we begin, create the following table to simulate the various sharding scenarios; change the table name to match the rule being tested.

create table shard_test (ID int primary key, name varchar(20), shard_date date);

Primary key range sharding

Primary key range sharding routes each row to a database according to where its primary key value falls. First create the same table structure on each of the corresponding sharding nodes.

With regard to the policy of sharding, the rule.xml file needs to be modified.

The commonly used sharding strategies are already implemented in Mycat, and you can implement your own if needed. For example, the following rule shards on the primary key column ID; the distribution algorithm is rang-long, which references the function rang-long, implemented in a corresponding Java class.

<tableRule name="auto-sharding-long">
    <rule>
        <columns>ID</columns>
        <algorithm>rang-long</algorithm>
    </rule>
</tableRule>
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
    <property name="mapFile">autopartition-long.txt</property>
</function>

When the default primary key ranges do not fit, they can be customized as needed, for example in units of 1,000,000 or 1,000. The default content of the autopartition-long.txt file is as follows: the template defines 3 shards, and if you want more you need to extend the configuration. With the units defined in the header (K=1000, M=10000), this configuration only covers IDs up to 15 million, and you can extend it to match your data volume.

# range start-end, data node index
# K=1000, M=10000.
0-500M=0
500M-1000M=1
1000M-1500M=2
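To make the routing concrete, here is a minimal Python sketch of what the rang-long algorithm does with this map file. It is an illustration only, not Mycat's actual Java implementation, and the parser only understands the K/M suffixes declared in the file header:

```python
# Illustrative sketch of rang-long routing; NOT Mycat's code.
# Per the map file header: K = 1000, M = 10000.
UNITS = {"K": 1_000, "M": 10_000}

def _value(token):
    """Expand a token like '500M' using the K/M units."""
    token = token.strip()
    for suffix, factor in UNITS.items():
        if token.endswith(suffix):
            return int(token[:-1]) * factor
    return int(token)

def parse_ranges(lines):
    """Parse autopartition-long.txt-style lines into (start, end, node) tuples."""
    ranges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        span, node = line.split("=")
        start, end = span.split("-")
        ranges.append((_value(start), _value(end), int(node)))
    return ranges

def route(ranges, key):
    """Return the data node index for a primary key value."""
    for start, end, node in ranges:
        if start <= key <= end:
            return node
    raise ValueError("find no Route for key %d" % key)  # cf. the error in Question 1

ranges = parse_ranges(["# K=1000, M=10000.", "0-500M=0", "500M-1000M=1", "1000M-1500M=2"])
print(route(ranges, 1))          # -> 0 (first shard)
print(route(ranges, 5_000_001))  # -> 1 (just past 500M = 5,000,000)
```

This also explains the trick used later in the article: ID 1 and ID 5000001 are guaranteed to land on different data nodes.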

Insert some data to verify; we can check the execution plan for basic validation. With the configuration correct, the data flows to the table in the specified database according to the rule.

mysql> explain insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06');
+-----------+---------------------------------------------------------------------------+
| DATA_NODE | SQL                                                                       |
+-----------+---------------------------------------------------------------------------+
| pxcNode11 | insert into shard_auto (ID,name,shard_date) values (1,'aa','2017-09-06')  |
+-----------+---------------------------------------------------------------------------+

There is also a small trick to see the effect of sharding: insert a very large key value that is certain not to land on the same shard as the other rows. Run the same query twice, and because the rows are merged from different shards, the result order can differ slightly between runs.

The effect of sharding:

mysql> select * from shard_auto;
+---------+------+------------+
| ID      | name | shard_date |
+---------+------+------------+
| 1       | aa   | 2017-09-06 |
| 2       | bb   | 2017-09-06 |
| 5000001 | aa   | 2017-09-06 |
+---------+------+------------+
3 rows in set (0.00 sec)

After a pause, continue to run.

mysql> select * from shard_auto;
+---------+------+------------+
| ID      | name | shard_date |
+---------+------+------------+
| 5000001 | aa   | 2017-09-06 |
| 1       | aa   | 2017-09-06 |
| 2       | bb   | 2017-09-06 |
+---------+------+------------+
3 rows in set (0.01 sec)

Hash sharding

Hash sharding is especially widely used in enterprises. One reason, I think, is that with this kind of data routing the resulting distribution is basically controllable and directly tied to the business. Many splitting schemes use the mod method to distribute data evenly.

The sharding policy is configured in rule.xml; here we use the default mod-long rule, which references the mod-long algorithm. It shards by the number of data nodes, 3 by default.

<tableRule name="mod-long">
    <rule>
        <columns>ID</columns>
        <algorithm>mod-long</algorithm>
    </rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
    <property name="count">3</property>
</function>

For example, check the execution plans of two inserts.

mysql> explain insert into shard_mod_long (ID,name,shard_date) values (...);
+-----------+------------------------------------------------------------------+
| DATA_NODE | SQL                                                              |
+-----------+------------------------------------------------------------------+
| pxcNode22 | insert into shard_mod_long (ID,name,shard_date) values (...)     |
+-----------+------------------------------------------------------------------+

mysql> explain insert into shard_mod_long (ID,name,shard_date) values (5, ...);
+-----------+------------------------------------------------------------------+
| DATA_NODE | SQL                                                              |
+-----------+------------------------------------------------------------------+
| pxcNode23 | insert into shard_mod_long (ID,name,shard_date) values (5, ...)  |
+-----------+------------------------------------------------------------------+

You can see that the data is distributed evenly across the nodes according to the rule.
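The mod-long routing itself is just the key modulo the shard count; here is a minimal illustrative sketch (not Mycat's Java implementation):

```python
# mod-long in miniature: route a primary key to one of `count` data nodes.
def mod_route(key, count=3):
    return key % count

# Consecutive IDs cycle evenly through the three nodes:
print([mod_route(i) for i in range(1, 7)])  # [1, 2, 0, 1, 2, 0]
```

This even cycling is exactly why mod-based rules give a controllable, business-predictable distribution, as noted above.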

As for schema.xml, its configuration is the core of the whole sub-database setup, so I simply give a configuration for reference.

show slave status

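Only the heartbeat statement remains of the original configuration, so the following is a sketch of what such a schema.xml could look like, not the author's original file. The node names come from the explain output and the ports from the topology above; the schema name, user, password, and balance/writeType values are assumptions:

```xml
<schema name="db1" checkSQLschema="false" sqlMaxLimit="100">
    <table name="shard_auto" primaryKey="ID" rule="auto-sharding-long"
           dataNode="pxcNode11,pxcNode21,pxcNode31" />
</schema>

<dataNode name="pxcNode11" dataHost="host1" database="db1" />

<dataHost name="host1" maxCon="1000" minCon="10" balance="1"
          writeType="0" dbType="mysql" dbDriver="native">
    <!-- "show slave status" is the heartbeat used for master-slave dataHosts -->
    <heartbeat>show slave status</heartbeat>
    <writeHost host="master1" url="127.0.0.1:33091" user="mycat" password="mycat">
        <readHost host="slave1" url="127.0.0.1:33092" user="mycat" password="mycat" />
        <readHost host="slave2" url="127.0.0.1:33093" user="mycat" password="mycat" />
    </writeHost>
</dataHost>
```

The other two dataHosts would follow the same pattern with the master2/master3 port groups.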


© 2024 shulou.com SLNews company. All rights reserved.
