This article focuses on how to quickly process a large amount of data in a database. The method described is simple, fast, and practical, and is illustrated with a real table-merging task.
Background
Merge hundreds of tables with the same structure (denoted Tn) into a single table (denoted C).
The data volume of the T tables is very unevenly distributed, ranging from single-digit rows to hundreds of thousands of rows per table.
There is no business relationship between the T tables.
Table C adds several fields to the T table structure, so a plain INSERT INTO ... SELECT * cannot be used; each row has to be transformed before insertion.
The total data volume is about 3 million rows. A single-process test showed a processing speed of about 500 rows/s, giving an estimated total time of about 100 min (3,000,000 / 500 = 6,000 s).
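For concreteness, here is a minimal sketch of the single-process baseline, assuming pymysql and a hypothetical schema (an id and a payload column in each Tn, plus extra source_table and migrated_at columns in C); the real column names are not given in the article.

# Single-process baseline: read each Tn table and insert its rows into C.
# Hypothetical schema: Tn(id, payload), C(src_id, payload, source_table, migrated_at).
import pymysql
from datetime import datetime

conn = pymysql.connect(host="127.0.0.1", user="app", password="secret", database="demo")

def merge_table(table_name):
    with conn.cursor() as cur:
        cur.execute(f"SELECT id, payload FROM {table_name}")
        rows = cur.fetchall()
        cur.executemany(
            "INSERT INTO C (src_id, payload, source_table, migrated_at) "
            "VALUES (%s, %s, %s, %s)",
            [(r[0], r[1], table_name, datetime.now()) for r in rows],
        )
    conn.commit()

for i in range(1, 301):            # "hundreds of tables": T1 ... T300 is an assumed naming scheme
    merge_table(f"T{i}")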
Goal
Maximize the processing speed and cut the total time to about 10 min, which requires a write speed to table C of about 5,000 rows/s (3,000,000 rows / 600 s).
Solution evolution
Option 1
Because there is no business relationship between the T tables, each table can be processed independently.
Sort the T tables by row count and assign N tables to each process, so that the load is balanced across processes as far as possible.
The problem: the data volume of the T tables is extremely uneven. Several tables hold about 700,000 rows, so the total time is bounded by roughly 700,000 rows / 500 rows/s ≈ 1,400 s (about 23 min); the large-table bottleneck is serious.
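The sorting-and-assignment step of Option 1 can be illustrated with a simple greedy balancer (the row counts below are made up, and the article does not specify the exact assignment algorithm):

# Greedy load balancing: always hand the next-largest table to the least-loaded process.
import heapq

def assign_tables(table_rows, num_workers):
    # table_rows: dict mapping table name -> row count
    heap = [(0, i, []) for i in range(num_workers)]      # (assigned rows, worker id, tables)
    heapq.heapify(heap)
    for name, rows in sorted(table_rows.items(), key=lambda kv: kv[1], reverse=True):
        load, wid, tables = heapq.heappop(heap)
        tables.append(name)
        heapq.heappush(heap, (load + rows, wid, tables))
    return {wid: (load, tables) for load, wid, tables in heap}

print(assign_tables({"T1": 700_000, "T2": 120_000, "T3": 35_000, "T4": 9}, num_workers=2))
# However well the tables are spread out, one worker still owns the whole 700,000-row table.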
Option 2
Building on Option 1, the large-table bottleneck can be removed by parallelizing along the dimension of table + data range, so that a single large table is worked on by several processes at once (see the sketch after this list).
The problem: the code becomes complex, because it has to take into account
the row count of each T table
how to segment the T tables with large row counts
how to avoid processing the same data twice
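A minimal sketch of the "table + data" segmentation, assuming each T table has an auto-increment primary key id (the article does not describe the key):

# Split a large T table into contiguous id ranges so several processes can copy
# different segments of the same table without overlapping or duplicating rows.
def split_by_id(min_id, max_id, num_segments):
    step = (max_id - min_id + num_segments) // num_segments
    segments = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + step - 1, max_id)
        segments.append((lo, hi))      # one worker handles rows WHERE id BETWEEN lo AND hi
        lo = hi + 1
    return segments

print(split_by_id(1, 700_000, 4))
# [(1, 175000), (175001, 350000), (350001, 525000), (525001, 700000)]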
Option 3
With the help of Redis's pub/sub mechanism, production and consumption are separated (a sketch follows below).
The producer publishes the table name + ID of each T-table row to a channel; the number of channels equals the number of consumer processes.
On the consumer side, each process subscribes to a different channel, reads the table name + ID, and writes the corresponding row into table C.
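A minimal pub/sub sketch with redis-py; the channel names, the "table:id" message format, and the write_to_c helper are assumptions, not the article's actual code:

# Pub/sub: the producer spreads "table:id" messages over N channels and each
# consumer process subscribes to exactly one of them.
import redis

r = redis.Redis(host="127.0.0.1", port=6379)
NUM_CHANNELS = 10

def produce(records):
    # records: iterable of (table_name, row_id) pairs
    for i, (table, row_id) in enumerate(records):
        r.publish(f"merge-chan-{i % NUM_CHANNELS}", f"{table}:{row_id}")

def consume(channel_index):
    p = r.pubsub(ignore_subscribe_messages=True)
    p.subscribe(f"merge-chan-{channel_index}")
    for msg in p.listen():
        table, row_id = msg["data"].decode().split(":")
        write_to_c(table, int(row_id))     # hypothetical helper that copies one row into C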
Option 4
This is a variant of Option 3, using Redis's List to separate production from consumption.
The producer writes the table name + ID of each T-table row to the List.
The consumers read from the List and write the rows identified by table name + ID into table C.
Compared with Option 3, the advantage is that the code logic is simpler and neither the producer nor the consumers need to do any load balancing: each consumer simply pulls work as fast as it can, so multiple consumer processes finish at roughly the same time (see the producer and consumer sketches under Implementation details below).
Implementation details
Option 4 was finally adopted.
Producer
Read each T table in turn and write the table name + ID to the List. Note that List supports batch writes: pushing 100 items per call gives a write rate of about 50,000 items/s.
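A minimal producer sketch with redis-py, batching 100 items per RPUSH call; the queue name merge-queue and the id column are assumptions (conn is a database connection as in the baseline sketch above):

# Producer: push "table:id" items onto a Redis List, 100 items per round trip.
import redis

r = redis.Redis(host="127.0.0.1", port=6379)
BATCH = 100

def produce(conn, table_names):
    buf = []
    for table in table_names:
        with conn.cursor() as cur:
            cur.execute(f"SELECT id FROM {table}")
            for (row_id,) in cur.fetchall():
                buf.append(f"{table}:{row_id}")
                if len(buf) == BATCH:
                    r.rpush("merge-queue", *buf)    # one batched write of 100 items
                    buf.clear()
    if buf:
        r.rpush("merge-queue", *buf)                # flush the last partial batch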
Consumer
A single consumer process handles about 300 rows/s, so 10 consumer processes reach an overall rate of about 3,000 rows/s. If the database's write speed allows, the number of consumer processes can be increased further.
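A minimal consumer sketch: each of the 10 processes runs the same blocking-pop loop, and because BLPOP removes an item atomically no row is processed twice; write_to_c is again a hypothetical helper for the insert into table C:

# Consumer: every process pops items from the shared List and writes them into C.
import redis
from multiprocessing import Process

def consume():
    r = redis.Redis(host="127.0.0.1", port=6379)
    while True:
        item = r.blpop("merge-queue", timeout=10)
        if item is None:                  # queue empty for 10 s: assume the producer is done
            break
        _, payload = item                 # blpop returns a (queue name, value) pair
        table, row_id = payload.decode().split(":")
        write_to_c(table, int(row_id))    # hypothetical helper that copies one row into C

if __name__ == "__main__":
    workers = [Process(target=consume) for _ in range(10)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()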
At this point you should have a better understanding of how to quickly process a large amount of data in a database; the best way to consolidate it is to try the approach in practice.