Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

What does data source management, relational database and table and column library distributed computing refer to respectively in MySQL?

2025-04-05 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article will explain in detail what data source management and relational database sub-table and column library distributed computing refer to in MySQL. The content of the article is of high quality, so the editor will share it for you as a reference. I hope you will have some understanding of the relevant knowledge after reading this article.

1. The concept of data splitting 1. Scene description

With the development of business, the amount of data is getting larger and larger, and the business system is becoming more and more complex, the conceptual logic of splitting arises at the historic moment. The split at the data level mainly solves the problem that part of the table data is too large, resulting in long processing time, long-term occupation of links, and even a large number of disk IO problems, seriously affecting performance; business level split, mainly to solve complex business logic, high coupling between services, easy to cause avalanche effect, business database split, micro-service distributed, is also the mainstream direction of the current architecture.

2. Basic concepts

Zoning mode

Do the partition mode for the data table, all the data logically exists in a table, but the physical stack is not together, it will be stacked in different files according to certain rules. When querying data, the partition must be triggered according to the specified rules, so that the full table scan will not be done. There are too many uncontrollable factors and too much risk, and table partitioning is prohibited in general development rules.

Sub-table mode

The amount of data in a single table is too large, and in general, the data in a single table is controlled at 3 million. The general situation here refers to the number of fields, the type is not an extreme type, and there is no large number of table locking operations in the query. Beyond this order of magnitude, it is necessary to split the table operation and route the data to different tables based on a specific strategy. The table structure is the same and the table name follows the routing rules.

Sub-library mode

In the scenario of continuous upgrading and complication of the system, the business is difficult to manage, and a large amount of data affects the overall performance. At this time, you can consider business sub-libraries, large data scenarios sub-database sub-tables, reduce the degree of coupling between businesses, high concurrency big data resource occupation, to achieve database-level decoupling. At the architecture level, it can also be service-oriented management to ensure the high availability and high performance of services.

Common algorithm

Hash value remainder: distribute the data to different databases and tables according to the hash value remainder of the routing key

Hash segmentation: segment the data according to the hash value of the routing key to achieve dynamic data distribution

These two methods are not a problem under normal conditions, but once the database is saturated again and needs to be migrated, the impact is greater.

Second, relational sub-library 1. Basic logic of sub-library.

Based on a proxy layer (here using Sharding-Jdbc middleware), specify the sub-library strategy, find different databases according to the routing results, and perform data-related operations.

2. Data source management

Uniformly manage the data sources that need to be divided into libraries.

@ Configurationpublic class DataSourceConfig {/ / omit data source related configuration / * split library configuration * / @ Bean public DataSource dataSource (@ Autowired DruidDataSource dataZeroSource, @ Autowired DruidDataSource dataOneSource, @ Autowired DruidDataSource dataTwoSource) throws Exception {ShardingRuleConfiguration shardJdbcConfig = new ShardingRuleConfiguration (); shardJdbcConfig.getTableRuleConfigs () .add (getUserTableRule ()) ShardJdbcConfig.setDefaultDataSourceName ("ds_0"); Map dataMap = new LinkedHashMap (); dataMap.put ("ds_0", dataZeroSource); dataMap.put ("ds_1", dataOneSource); dataMap.put ("ds_2", dataTwoSource); Properties prop = new Properties (); return ShardingDataSourceFactory.createDataSource (dataMap, shardJdbcConfig, new HashMap (), prop) } / * Subtable configuration * / private static TableRuleConfiguration getUserTableRule () {TableRuleConfiguration result = new TableRuleConfiguration (); result.setLogicTable ("user_info"); result.setActualDataNodes ("ds_$ {1.. 2} .user _ info_$ {0.2}"); result.setDatabaseShardingStrategyConfig ("user_phone", new DataSourceAlg ()) Result.setTableShardingStrategyConfig (new StandardShardingStrategyConfiguration ("user_phone", new TableSignAlg (); return result;}} 3, specify routing policy

Route to the library

According to the value of the sub-library policy, which library is routed to based on the hash algorithm. Different has algorithms not only affect the operation of the library, but also affect the rules of data entry into the table, such as even and odd numbers, resulting in parity.

Public class DataSourceAlg implements PreciseShardingAlgorithm {private static Logger LOG = LoggerFactory.getLogger (DataSourceAlg.class); @ Override public String doSharding (Collection names, PreciseShardingValue value) {int hash = HashUtil.rsHash (String.valueOf (value.getValue (); String dataName = "ds_" + ((hash% 2) + 1); LOG.debug ("split algorithm Information: {}, {}, {}", names,value,dataName); return dataName;}}

Route to tabl

According to the configuration of the sub-table strategy, which table is routed to based on the hash algorithm.

Public class TableSignAlg implements PreciseShardingAlgorithm {private static Logger LOG = LoggerFactory.getLogger (TableSignAlg.class); @ Override public String doSharding (Collection names, PreciseShardingValue value) {int hash = HashUtil.rsHash (String.valueOf (value.getValue (); String tableName = "user_info_" + (hash% 3); LOG.debug ("partitioning algorithm Information: {}, {}, {}", names,value,tableName); return tableName;}}

The above is the core operation flow based on ShardingJdbc sub-library and sub-table.

Column library statistics 1. Column number data

When analyzing relatively large data, we usually choose to generate a large and wide table and store it in a column database. In order to ensure efficient execution, the data may be divided into different databases and tables with the same structure. Statistics of different tables are based on multithreading, and then the statistical results are combined.

Basic principle: multithreading concurrently to execute the statistics of different tables, and then collect statistics, relatively speaking, statistical operation is not difficult, but need to adapt to different types of statistics, such as percentage, total, grouping, etc., coding logic is relatively demanding.

2. Column data source

Based on ClickHouse data source, demonstrate the basic logic of case operation. The library tables are managed and configured here.

Core profile

Spring: datasource: type: com.alibaba.druid.pool.DruidDataSource # ClickHouse data 01 ch-data01: driverClassName: ru.yandex.clickhouse.ClickHouseDriver url: jdbc:clickhouse://127.0.0.1:8123/query_data01 tables: ch_table_01 Ch_table_02 # ClickHouse data 02 ch-data02: driverClassName: ru.yandex.clickhouse.ClickHouseDriver url: jdbc:clickhouse://127.0.0.1:8123/query_data02 tables: ch_table_01,ch_table_02

Core configuration class

@ Componentpublic class ChSourceConfig {public volatile Map chSourceMap = new HashMap (); public volatile Map connectionMap = new HashMap (); @ Value ("${spring.datasource.ch-data01.url}") private String dbUrl01; @ Value ("${spring.datasource.ch-data01.tables}") private String tables01; @ Value ("${spring.datasource.ch-data02.url}") private String dbUrl02 @ Value ("${spring.datasource.ch-data02.tables}") private String tables02; @ PostConstruct public void init () {try {Connection connection01 = getConnection (dbUrl01); if (connection01! = null) {chSourceMap.put (connection01.getCatalog (), tables01.split (",")); connectionMap.put (connection01.getCatalog (), connection01) } Connection connection02 = getConnection (dbUrl02); if (connection02! = null) {chSourceMap.put (connection02.getCatalog (), tables02.split (",")); connectionMap.put (connection02.getCatalog (), connection02);} catch (Exception e) {e.printStackTrace () }} private synchronized Connection getConnection (String jdbcUrl) {try {DriverManager.setLoginTimeout (10); return DriverManager.getConnection (jdbcUrl);} catch (Exception e) {e.printStackTrace ();} return null;}} 3, basic task class

Since it is based on multithreaded statistics, a thread task class is naturally needed, and the count statistical mode is demonstrated here. Outputs the statistical results of a single thread.

Public class CountTask implements Callable {private Connection connection; private String [] tableArray; public CountTask (Connection connection, String [] tableArray) {this.connection = connection; this.tableArray = tableArray;} @ Override public Integer call () throws Exception {Integer taskRes = 0; if (connection! = null) {Statement stmt = connection.createStatement () If (tableArray.length > 0) {for (String table:tableArray) {String sql = "SELECT COUNT (*) AS countRes FROM" + table; ResultSet resultSet = stmt.executeQuery (sql); if (resultSet.next ()) {Integer countRes = resultSet.getInt ("countRes") TaskRes = taskRes + countRes;} return taskRes;}} 4. Summary of thread results

Here, we mainly start the execution of the thread, and finally summarize the processing results of each thread.

@ RestControllerpublic class ChSourceController {@ Resource private ChSourceConfig chSourceConfig; @ GetMapping ("/ countTable") public String countTable () {Set keys = chSourceConfig.chSourceMap.keySet (); if (keys.size () > 0) {ExecutorService executor = Executors.newFixedThreadPool (keys.size ()); List countTasks = new ArrayList () For (String key:keys) {Connection connection = chSourceConfig.connectionMap.get (key); String [] tables = chSourceConfig.chSourceMap.get (key); CountTask countTask = new CountTask (connection,tables); countTasks.add (countTask);} List countList = Lists.newArrayList () Try {if (countTasks.size () > 0) {countList = executor.invokeAll (countTasks);}} catch (InterruptedException e) {e.printStackTrace ();} Integer sumCount = 0 For (Future count: countList) {try {Integer countRes = count.get (); sumCount= sumCount + countRes;} catch (Exception e) {e.printStackTrace ();}} return "sumCount=" + sumCount;} return "No Result";}} 5

Relational sublibraries, or column statistics, separate the data based on specific policies, and then route to the data, perform operations, or merge the data, or return the data directly.

Source code address GitHub address https://github.com/cicadasmile/data-manage-parentGitEE address https://gitee.com/cicadasmile/data-manage-parent

Recommended Reading: data Management

Serial number title 01 data source management: master-slave database dynamic routing, AOP mode read-write separation 02 data source management: based on JDBC schema, adaptation and management of dynamic data sources 03 data source management: dynamic authority verification, table structure and data migration process about what data source management and relational sub-database sub-table and column library distributed computing in MySQL refer to. I hope the above content can be of some help to you and learn more knowledge. If you think the article is good, you can share it for more people to see.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report