In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/03 Report--
1. MyCat basic concept 1. What is MyCat
A thoroughly open source large database cluster for enterprise application development
Enhanced database that supports transactions, ACID, and can replace MySQL
An enterprise database that can be regarded as a MySQL cluster to replace the expensive Oracle cluster
A new SQL Server that integrates memory cache technology, NoSQL technology and HDFS big data
A New Generation of Enterprise Database products combining traditional Database and New distributed data Warehouse
A novel database middleware product
2. Why use MyCat
Today, with the development of the Internet, the order of magnitude of data is also a support index growth, from GB to TB to PB. The operation of data is becoming more and more difficult, and the traditional relational database has been unable to meet the needs of fast query and insertion of data. At this time, the emergence of NoSQL temporarily solved the crisis. It can improve the performance by reducing the security of data, reducing the support for transactions, and reducing the support for complex queries. However, there are situations where some tradeoffs in NoSQL cannot meet usage scenarios (for example, some usage scenarios absolutely have transaction and security metrics. NoSQL must not be satisfied at this time. So you still need to use a relational database. How to use relational database to solve the problem of mass storage? At this point, it is necessary to do a database cluster. In order to improve the query performance, the data of one database will be scattered and stored in different databases. In order to deal with this problem, there is MyCat.
MyCat function:
It can satisfy the storage of a large number of database data, and improve the query performance.
Separation of reading and writing
Data split vertically (split database), horizontal split (split table), vertical + horizontal split (split table)
Multi-source integration
3. Comparison of database middleware
① Cobar (formerly known as amoaba) belongs to Ali B2B business group, began in 2008, served in Ali for more than 3 years, took over 3000 + MySQL databases of schema, and the cluster processes more than 5 billion online SQL requests per day. Due to the departure of the Cobar sponsor, Cobar stops maintenance.
② Mycat is the secondary development of the open source community based on Ali cobar, which solves the problems existing in cobar and adds many new functions to it. Green comes from blue and is better than blue.
③ OneProxy is developed using c based on MySQL's official proxy idea, and OneProxy is a commercial charge middleware. Give up some features and focus on performance and stability.
④ kingshard is developed by a small team in the go language and needs to be developed and perfected.
⑤ Vitess is used by Youtube production, and the architecture is very complex. MySQL native protocol is not supported, and it requires a lot of modification cost.
⑥ Atlas is rewritten by the team based on mysql proxy, the function needs to be improved, and it is unstable under high concurrency.
⑦ MaxScale is a middleware developed by mariadb (a version maintained by the original author of MySQL).
⑧ MySQLRoute is the middleware released by MySQL's official Oracle company.
4. Supported databases
Support some mainstream databases such as MySQL ORACLE SQLServer
5. Core technology
Database slicing: database slicing means that the data we store in a database is scattered in different databases (hosts) through certain conditions, so as to disperse the load of a single device. According to slicing rules, it can be divided into the following two slicing modes. MyCAT implements slicing by defining the slicing rules of the table, and each table can be bundled with a slicing rule. Each sharding rule specifies a shard field and binds a function to implement the dynamic slicing algorithm.
1) Schema: logical library, which corresponds to the Database (database) in MySQL. A logical library defines the included Table.
2) Table: a logical table, that is, a table stored in a physical database. Unlike a traditional database, the table here needs to declare the logical data node DataNode it stores. Here you can specify the sharding rules for the table.
3) the logical data node of the DataNode:MyCAT is the specific physical node where the table is stored, also known as the sharding node, which is associated to a specific database at the back end through DataSource.
4) DataSource: defines the access address of a physical library, which is used to bind to the Datanode
5) sharding rules: as mentioned earlier, when a large table is divided into several sharding tables, certain rules are needed, so the rule of dividing data into a certain sharding according to a certain business rule is a sharding rule. It is very important for data sharding to choose appropriate sharding rules, which will greatly avoid the difficulty of subsequent data processing.
2. MyCat installation and deployment environment: JDK: jdk must be version 1.7 or above MySQL: it is recommended that mysql is the official website of version 5.5 or above of MyCat: http://www.mycat.org.cn/ host operating system IP address mysql01 (master01) CentOS 7.3192.168.1.1mysql02 (slave) CentOS 7.3192.168.1.8mycatCentOS 7.3192.168.1.3PS:mysql uses off-the-shelf hosts (dual master + keepalived, master and slave are also possible). If you don't have mysql, look at a blog deployment. 1. Download and install, use the above official website address to download.
Copy link, wget download
[root@localhost ~] # wget http://dl.mycat.io/1.6-RELEASE/Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz
Decompress the compressed package. It is recommended that you put mycat under the / usr/local/mycat directory.
[root@localhost ~] # tar zxf Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz [root@localhost ~] # cd mycat/
Modify the configuration file:
Server.xml: define users and system-related variables such as ports, etc.
Schema.xml: define logical libraries, tables, sharding nodes, etc.
Rule.xml: define sharding rules
[root@mycat mycat] # vim conf/server.xml 80 / / define the user of connection middleware 81 123456 / / password 82 TESTDB / / logic library [root@mycat mycat] # cp conf/schema.xml conf/schema.xml.bak / / backup configuration file [root@mycat mycat] # vim conf/schema.xml / / specify logic library And the node / / specifies the real library select user () / / in the backend server to define the read server (slave) as well as the IP address and user password
# # unnecessary configuration items and comments have been deleted in this configuration file
Restart the mycat service
Let mycat have the mysql command
[root@mysql02 ~] # scp / usr/local/mysql/bin/mysql 192.168.1.3:/usr/local/sbin/
Two mysql authorizations to allow mycat to log in remotely
Mysql > grant all on *. * to root@'192.168.1.%' identified by '123.com'
MyCat Telnet Test:
[root@mycat mycat] # mysql-uroot-h 192.168.1.1-P 3306-p123.com [root@mycat mycat] # mysql-uroot-h 192.168.1.8-P 3306-p123.com
Start the mycat service:
[root@mycat mycat] #. / bin/mycat helpUsage:. / bin/mycat {console | start | stop | restart | status | dump} [root@mycat mycat] #. / bin/mycat console
Starting with console will occupy a terminal and view the contents of the log in real time (login, query, exit). Starting with start will not cause real-time logging and will not occupy the terminal.
[root@mycat ~] # ss-anplt | grep java
# mycat port 8806
Test to see if the data is synchronized:
[root@mycat] # mysql-umycat-p123456-h 192.168.1.3-P8066mysql > show databases
Mysql > use TESTDBmysql > show tables
Mysql > select * from tab1
III. MyCat read-write separation Mycat read-write separation is based on the master-slave replication of Mysql.
Modify the configuration file schema.xml
[root@mycat ~] # cd mycat/ [root@mycat mycat] # vim conf/schema.xml1, set balance= "1" and writeType= "0" Balance parameter settings: modified balance attribute, which is used to configure the type of load balancer type with read-write separation. Currently, there are four values: read-write separation mechanism is not enabled, and all read operations are sent to the currently available writeHost. balance= "1", all readHost and stand by writeHost participate in the load balancing of select statements. To put it simply, when the double master and double slave mode (M1-> S1, M2-> S2, and M1 and M2 are the master and standby of each other), under normal circumstances, M2M S1 and S2 participate in the load balancing of select statements. balance= "2", all read operations are randomly distributed on writeHost and readhost. balance= "3", all read requests are randomly distributed to readhost for execution, writerHost does not bear the read pressure WriteType parameter setting: writeType= "0", all write operations are sent to the available writeHost. writeType= "1", all write operations are randomly sent to readHost. writeType= "2", all write operations are randomly sent on writeHost and readhost scores.
"readHost belongs to writeHost, which means that it gets synchronous data from that writeHost, so when the writeHost to which it belongs goes down, it will no longer participate in the separation of read and write, that is, it is no longer working, because its data is" unreliable "at this time. Based on this consideration, in the current mycat versions 1.3 and 1.4, if you want to support the standard configuration of MySQL with one master and one slave, and the slave node can read data when the master node is down, you need to configure two writeHost and set banlance=1 in Mycat. "
2. There are three options for setting switchType= "2" and slaveThreshold= "100" switchType: -1: do not automatically switch 1: default, automatically switch 2: decide whether to switch based on the status of MySQL master-slave synchronization
"Mycat heartbeat check statement is configured as show slave status, and two new attributes are defined on dataHost: switchType=" 2 "and slaveThreshold=" 100 ", which means that the read-write separation and switching mechanism of MySQL master-slave replication state binding is enabled. Mycat heartbeat mechanism detects the" Seconds_Behind_Master "," Slave_IO_Running "and" Slave_SQL_Running "fields in the show slave status.
Determine the status of the current master-slave synchronization and Seconds_Behind_Master master-slave replication latency.
3. Modify the configuration file and start the program [root@mycat mycat] # vim conf/schema.xml
① console startup: execute. / mycat console under the mycat/bin directory
② background startup: under the mycat/bin directory. / mycat start
In order to see the startup log for the first time and easily locate the problem, select the ① console to start.
[root@mycat mycat] #. / bin/mycat console
Log in to the background management window this login method is used to manage and maintain Mycat
[root@mycat ~] # mysql-umycat-p123456-h 192.168.1.3-P80664, verify read-write separation my.cnf binlog_format=STATEMENT (1) insert data in the write host: [root@mysql02 ~] # mysql-uroot-p123.commysql > use test1;mysql > select * from tab1;+-+ | id | +-+ | 1 | | 2 | 3 | +-+ 3 rows in set (.01 sec) mysql > insert into tab1 values (4) Query OK, 1 row affected (0.02 sec) mysql > select * from tab1;+-+ | id | +-+ | 1 | | 2 | 3 | | 4 | +-+ 4 rows in set (0.00 sec)
The master and slave host data are inconsistent.
[root@mysql01 ~] # mysql-uroot-p123.commysql > use test1mysql > select * from tab1;+-+ | id | +-+ | 1 | | 2 | | 3 | +-+ 3 rows in set (0.00 sec) (2) query in Mycat: mysql > use TESTDBmysql > select * from tab1 +-+ | id | +-+ | 1 | | 2 | | 3 | | 4 | +-+ 4 rows in set (0.00 sec) 4. Vertical split-split library
A database consists of many tables, and each table corresponds to a different business. Vertical segmentation means that the tables are classified according to the business and distributed to different databases, so that the data or pressure is shared to different databases.
The principle of how to divide the table sublibrary: tables that are closely related should be in one library, and tables that are not related to each other can be divided into different libraries.
In this case, a new mysql is needed to divide the database.
Host operating system IP address mysql01 (master01) CentOS 7.3192.168.1.1mysql02 (slave) CentOS 7.3192.168.1.8mycatCentOS 7.3192.168.1.3mysql03 (master02) CentOS 7.3192.168.1.9
Delete the data inserted in the previous case first
[root@mysql02 ~] # mysql-uroot-p123.commysql > delete from tab1 where id=4;Query OK, 1 row affected (0.01 sec) mysql > select * from tab1;+-+ | id | +-+ | 1 | 2 | 3 | +-+ 3 rows in set (0.00 sec) [root@mycat ~] # mysql-umycat-p123456-h 192.168.1.3-P8066mysql > use TESTDBmysql > select * from tab1 +-+ | id | +-+ | 1 | | 2 | | 3 | +-+ 3 rows in set (0.00 sec)
The four tables are as follows:
The customer table is divided into one database, and the other three need to be associated with queries, which are divided into another database.
# customer form rows: 2 million create TABLE customer (id INT AUTO_INCREMENT,NAME VARCHAR, PRIMARY KEY (id)); # order form rows: 6 million create TABLE orders (id INT AUTO_INCREMENT,order_type INT,customer_id INT,amount DECIMAL (10Power2), PRIMARY KEY (id)); # order schedule rows: 6 million create TABLE orders_detail (id INT AUTO_INCREMENT,detail VARCHAR (2000), order_id INT,PRIMARY KEY (id)) # order status dictionary table rows:20CREATE TABLE dict_order_type (id INT AUTO_INCREMENT,order_type VARCHAR, PRIMARY KEY (id)); implementation sub-library: 1. Modify schema configuration file [root@mycat mycat] # vim conf/schema.xml
Select user () select user ()
Authorize MyCat to log in to mysql03 remotely
[root@mysql03 ~] # mysql-uroot-p123.commysql > grant all on *. * to root@'192.168.1.%' identified by '123.com'
Since the mysql03 corresponding to the logic library in the configuration file does not exist, it needs to be created in mysql03
Mysql > create database test1;2, restart mycat, visit MyCat, create table [root@mycat mycat] # mysql-umycat-p123456-h 192.168.1.3-P8066mysql > use TESTDBmysql > CREATE TABLE customer (- > id INT AUTO_INCREMENT,-> NAME VARCHAR,-> PRIMARY KEY (id)->) Query OK, 0 rows affected (0.01 sec) mysql > CREATE TABLE orders (- > id INT AUTO_INCREMENT,-> order_type INT,-> customer_id INT,-> amount DECIMAL (10L2),-> PRIMARY KEY (id)->) Query OK, 0 rows affected (0.01 sec) mysql > CREATE TABLE orders_detail (- > id INT AUTO_INCREMENT,-> detail VARCHAR (2000),-> order_id INT,-> PRIMARY KEY (id)->); Query OK, 0 rows affected (0.01 sec) mysql > CREATE TABLE dict_order_type (- > id INT AUTO_INCREMENT,-> order_type VARCHAR (200),-> PRIMARY KEY (id)->) Query OK, 0 rows affected mysql > show tables;+- + | Tables_in_test1 | +-+ | customer | | dict_order_type | | orders | | orders_detail | | tab1 | +-+ 5 rows in set (0.05 sec)
Use mysql01 to view the table:
Mysql > show tables;+- + | Tables_in_test1 | +-+ | dict_order_type | | orders | | orders_detail | | tab1 | +-+ 4 rows in set (0.00 sec)
Use mysql03 to view the table:
Mysql > use test1;mysql > show tables;+- + | Tables_in_test1 | +-+ | customer | +-+ 1 row in set (0.00 sec)
View structure
Mysql > desc customer +-+ | Field | Type | Null | Key | Default | Extra | +-+ -+-+ | id | int (11) | NO | PRI | NULL | auto_increment | | NAME | varchar | YES | | NULL | | + -+-+ 2 rows in set (0.00 sec) 5. Horizontal split-sub-table
As opposed to vertical split, horizontal split does not classify tables, but is dispersed into multiple libraries according to certain rules of a field, each table containing part of the data. To put it simply, we can understand the horizontal segmentation of data as the segmentation of data rows, that is, some rows in the table are split into one database, while other rows are split into other databases.
There is a bottleneck to select the number of data items stored in a single table MySQL to be split. When a single table reaches 10 million data, it will affect the query efficiency and need to be optimized by horizontal split (split table). For example, in the previous case, both orders and orders_detail have reached 6 million rows of data and need to be optimized for sub-tables. Sub-table fields take orders table as an example, which can be divided according to different fields.
Number sub-table field effect 1id (primary key, or creation time) query orders pay attention to timeliness, and historical orders are queried less often, so fragmentation will result in more visits to one node and less access to one node, resulting in inequality. 2customer_id (customer ID) is divided according to customer id, and the visits of two nodes are average. all orders of a customer are on the same node. 1. Modify the profile schema.xml [root@mycat mycat] # vim conf/schema.xml
Select user () select user () 2. Modify the configuration file rule.xml [root@mycat mycat] # vim conf/rule.xml
# add the sharding rule mod_rule to the rule configuration file, and specify the applicable field of the rule as customer_id
# and choose the slicing algorithm mod-long (modular operation on the field). Customer_id calculates the module of two nodes and splits according to the result.
38 39 40 customer_id 41 mod-long 42 43. 105 106 107 configuration algorithm mod-long parameter count is 2, two nodes 108 3, because there is no orders table on dn2 (mysql03:192.168.1.9), create an orders table on the data node dn2, and restart the mycat service [root@mysql03 ~] # mysql-uroot-p123.commysql > use test1 Mysql > CREATE TABLE orders (- > id INT AUTO_INCREMENT,-> order_type INT,-> customer_id INT,-> amount DECIMAL (10L2),-> PRIMARY KEY (id)->) Query OK, 0 rows affected (0.00 sec) 4. Insert data into the orders table in MyCat. The insert field cannot be omitted [root@mycat mycat] # mysql-umycat-p123456-P8066-h292.168.1.3mysql > use TESTDBmysql > INSERT INTO orders (id,order_type,customer_id,amount) VALUES. Query OK, 1 row affected (0.00 sec) mysql > INSERT INTO orders (id,order_type,customer_id,amount) VALUES; Query OK, 1 row affected (0.00 sec) mysql > INSERT INTO orders (id,order_type,customer_id,amount) VALUES; Query OK, 1 row affected (0.00 sec) mysql > INSERT INTO orders (id,order_type,customer_id,amount) VALUES Query OK, 1 row affected (0.00 sec) mysql > INSERT INTO orders (id,order_type,customer_id,amount) VALUES; Query OK, 1 row affected (0.01 sec)
# View orders table data in mycat, dn1 and dn2, and the sub-table is successful
Dn1 (mysql01:192.168.1.1):
[root@mysql01 ~] # mysql-uroot-p123.commysql > use test1;mysql > select * from orders +-+ | id | order_type | customer_id | amount | +-+ | 2 | 101 | 100 | 100300.00 | | 4 | | 103000.00 | 103000.00 | | 6 | 102 | 100 | 100020.00 | +-+ 3 rows in set (0.00 sec) |
Dn2 (mysql03:192.168.1.9):
[root@mysql03 ~] # mysql-uroot-p123.commysql > use test1;mysql > select * from orders +-+ | id | order_type | customer_id | amount | +-+ | 1 | 101 | 100 | 100100.00 | | 3 | | 101,101 | 120000.00 | | 5 | 102.101 | 100400.00 | +-+ 3 rows in set (0.00 sec) |
Create an orders_detail table in dn2 (mysql03:192.168.1.9)
Mysql > CREATE TABLE orders_detail (- > id INT AUTO_INCREMENT,-> detail VARCHAR (2000),-> order_id INT,-> PRIMARY KEY (id)->); Query OK, 0 rows affected (0.02 sec)
Restart Mycat access Mycat to insert data into the orders_detail table
[root@mycat mycat] # mysql-umycat-p123456-h 192.168.1.3-P8066mysql > use TESTDBmysql > insert into orders_detail (id,detail,order_id) values (1 row affected (0.03 sec) mysql > insert into orders_detail (id,detail,order_id) values); Query OK, 1 row affected (0.00 sec) mysql > insert into orders_detail (id,detail,order_id) values Query OK, 1 row affected (0. 01 sec) mysql > insert into orders_detail (id,detail,order_id) values; Query OK, 1 row affected (0. 01 sec) mysql > insert into orders_detail (id,detail,order_id) values; Query OK, 1 row affected (0 sec) mysql > insert into orders_detail (id,detail,order_id) values Query OK, 1 row affected (0.01 sec) mysql > select o.publication journal od.detail from orders as o inner join orders_detail as od on o.id=od.order_id +-+ | id | order_type | customer_id | amount | detail | + -+ | 1 | 100100.00 | 100100.00 | detail | 3 | 101 | 120000.00 | detail | 5 | 102 | 100400.00 | detail | 2 | 101 | 100300.00 | detail | 4 | 101 | 10100.00 | detail | 6 | 102 | 100 | 100020.00 | detail | + +-+ 6 rows in set (0.03 sec)
Use dn1, dn2 to view:
Mysql > select * from orders_detail
Mysql > select * from orders_detail
VI. Overall situation table
In the case of fragmentation, when the business table is sliced because of its size, the association between the business table and these affiliated dictionary tables becomes a thorny problem, considering that the dictionary table has the following characteristics:
Changes are not frequent the overall amount of data does not change big data is not large in scale, rarely has more than hundreds of thousands of records
In view of this, Mycat defines a special table, called the global table, which has the following characteristics:
The insert and update operations of the global table will be performed on all nodes in real time to maintain the data consistency of each shard. The query operation of the global table can only be obtained from one node. The JOIN operation can be carried out with any table to define the dictionary table or some tables in accordance with the characteristics of the dictionary table as the global table. On the other hand, the problem of data JOIN is well solved.
Through the global table and the fragmentation strategy based on Emurr relationship, Mycat can meet more than 80% of enterprise application development.
[root@mycat mycat] # vim conf/schema.xml
Create a dict_order_ type table in dn2 (mysql03:192.168.1.9)
Mysql > CREATE TABLE dict_order_type (- > id INT AUTO_INCREMENT,-> order_type VARCHAR,-> PRIMARY KEY (id)->); Query OK, 0 rows affected (0.01 sec)
Restart MyCat and access MyCat to insert data into the dict_order_type table
[root@mycat mycat] # mysql-umycat-p123456-h 192.168.1.3-P8066mysql > use TESTDBmysql > insert into dict_order_type (id,order_type) values (101 sec sec) mysql > insert into dict_order_type (id,order_type) values (102 sec) mysql > select * from dict_order_type +-+-+ | id | order_type | +-+-+ | 101 | type1 | | 102 | type2 | +-+-+ 2 rows in set (0.03 sec) commonly used sharding rules
Take the module: this rule is to find the operation of the fragment field. It is also the most commonly used rule for horizontal subtables. This rule is used in the orders table in the configuration sub-table.
Sharding enumeration: configure sharding by configuring possible enumeration id in the configuration file. This rule applies to specific scenarios. For example, some businesses need to be saved according to provinces or districts and counties, while the provinces, districts and counties in the country are fixed. This rule is used for such businesses.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.