
How to solve common DynamoDB problems


Many newcomers are unsure how to solve common DynamoDB problems. To help, this article explains them in detail; readers who need this can learn from it, and we hope you will gain something from it.

DynamoDB is a NoSQL database service built by Amazon on the ideas in the paper "Dynamo: Amazon's Highly Available Key-value Store". It scales seamlessly and guarantees data durability and high availability. Developers do not have to worry about maintenance, scaling, or performance tuning: DynamoDB is fully managed by Amazon, so developers can focus on architecture and business logic.

The following describes the challenges the author's team encountered in their business, why those challenges led them to choose Amazon DynamoDB, what problems they ran into in practice, and how they solved them. It does not go deep into the internals of Amazon DynamoDB, nor does it cover all of its features.

Background and challenges

TalkingData's mobile advertising effectiveness monitoring product (TalkingData Ad Tracking) sits between advertisers and media as an ad monitoring platform. Every day it receives a large volume of promotion sample data and actual effect data, and ultimately attributes each effect back to a promotion sample.

For example, when we browse a news app on our phone, we see ads interspersed in the feed. The ads may be text, images, or video, and whatever the form, users can interact with them.

If an ad is targeted accurately and happens to interest the user, the user may click it for more information. Once the ad is clicked, the monitoring platform receives the click event triggered by the user. We call all the information carried by the click event the sample data; it may include the source of the clicked ad, the time of the click, and so on. After clicking an ad the user is usually guided to take a related action, such as downloading the app the ad promotes. When the user downloads and opens the app, the monitoring platform receives the effect data from the app. At this point the ad has achieved a successful conversion.

[Figure] DynamoDB practice: how to ensure high availability and real-time performance when the amount of data is huge and unpredictable?

A mobile advertising monitoring platform has to receive a steady stream of sample and effect data and process it in real time, over and over again. The platform carries a heavy responsibility: it must not over-count or under-count. If conversions are over-counted, advertisers pay the media more than they should; if they are under-counted, the media lose revenue. This brings several major challenges to the platform:

Large data volume: to maximize profit, some media resort to abnormal means to fabricate samples, producing "fake traffic". The monitoring platform therefore receives not only real user samples but also a large number of fake ones, which interfere with normal monitoring and attribution. At the busiest times our platform received more than 4 billion click sample event requests in a single day. Bear in mind that these click samples must be retained for later attribution, and their validity periods vary widely, from a minimum of 12 hours to a maximum of 90 days.

Unpredictable data volume: events such as an advertiser's big push or bidding for app store ranking can cause a sudden influx of sample data. In the face of such unpredictable traffic, the system still has to stay correct, stable, and real-time.

Real-time processing: advertisers rely on the monitoring platform's real-time results to adjust their promotion strategies, so the platform must process data in real time to help advertisers optimise their campaigns quickly. The processing results must also be passed back to the media and advertisers in real time. Accuracy and timeliness are therefore baseline requirements for the system.

Sample storage: the core function of our business is attribution. We need to determine, for example, which promotional campaign's sample led to the conversion of a user downloading and opening an app (step 7 in the figure above). When the user installs the app, the monitoring platform has to find the campaign the original sample belonged to in step 1, which is a query-and-match process. Given the enormous volume of attribution samples and their differing validity periods, storing them so that attribution stays fast and real-time results are unaffected is itself a major challenge.

Initial architecture

Before June 2017 our processing services ran in our own data centre, using Redis 2.8 to store all sample data. Redis was partitioned across multiple nodes, each deployed as one master with one slave. At first this worked fine, but as users set longer sample validity periods and the number of monitored samples grew, the nodes could no longer hold the business data. If the number of monitored promotions had surged, our storage would have collapsed and the business would have been paralysed. So we expanded for the first time.

Because of the way it was deployed, the only option was to double the capacity of the Redis nodes, all by manual operation, while doing our best to protect users' sample data during the migration.

[Figure] DynamoDB practice: how to ensure high availability and real-time performance when the amount of data is huge and unpredictable?

As the number of monitored promotions grew and users set longer validity periods, this deployment became more and more strained, and an unpredictable traffic burst could have serious consequences. Manual expansion is also error-prone, slow, and doubles the cost each time. And with limited machine resources, it was not only Redis that needed expanding but also a whole series of services and clusters of the advertising monitoring platform.

Resolving the challenges

After discussion and evaluation we decided to migrate services such as sample processing to the cloud and to switch storage to Amazon DynamoDB, which meets most of our business needs. After the restructuring, the system looked like the figure below:

[Figure] DynamoDB practice: how to ensure high availability and real-time performance when the amount of data is huge and unpredictable?

Handling large, unpredictable data volumes: our platform must accept monitoring requests from promotions and persist them for later attribution. In theory, DynamoDB can store however much ad monitoring data the system receives, and a single table can hold any amount of data. We do not have to worry about expanding DynamoDB; while the system is running we are not even aware that the storage is growing. This is what Amazon describes as fully managed, seamless scaling.

High availability: as a storage service, Amazon DynamoDB offers extremely high availability. All data written to DynamoDB is stored on solid-state drives and automatically replicated across multiple AWS Availability Zones. This work, too, is handled entirely by the DynamoDB service, so users can focus on business architecture and code.

Real-time processing: Amazon DynamoDB provides very high throughput and lets you configure throughput at second granularity to any level you need. For write-heavy, read-light applications you can raise writes per second to 1,000 or more and lower reads per second to 10 or fewer; throughput is entirely up to the user. It can be adjusted at any time in the web console, or dynamically through the client SDK DynamoDB provides. For example, if write capacity proves insufficient at runtime, we can either raise it manually in the console or have the code adjust it automatically through the client API. Dynamic adjustment via the client gives the system better elasticity and keeps data processing real-time: capacity is raised when the data flow increases and lowered again when it drops, which is far more flexible than manual adjustment. For these reasons we concluded that Amazon DynamoDB could easily support the system's core capabilities. All the business side has to do is sort out the business logic and write the data to DynamoDB; the rest is left to DynamoDB.
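As an illustration, here is a minimal sketch of this kind of dynamic adjustment using the Python SDK (boto3) and its UpdateTable call; the table name, region, and capacity numbers are made-up examples rather than the platform's actual configuration:

```python
import boto3

# Hypothetical table and region, purely for illustration.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Raise write capacity ahead of (or during) a traffic peak; the same call
# can lower it again once the peak has passed.
dynamodb.update_table(
    TableName="click_samples",
    ProvisionedThroughput={
        "ReadCapacityUnits": 10,      # read-light workload
        "WriteCapacityUnits": 1000,   # write-heavy workload
    },
)
```

In practice such a call would sit behind the platform's own traffic-monitoring logic, which decides when to scale up and when to scale back down.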

Other features we found useful:

TTL: we use the TTL feature provided by Amazon DynamoDB to manage data that has a life cycle. TTL lets you set a specific timestamp on each item in a table; once that timestamp has passed, DynamoDB deletes the item in the background, much like the TTL concept in Redis. With TTL we removed a lot of otherwise necessary expiry logic from the business code and also reduced storage cost.
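For illustration, a minimal boto3 sketch of enabling TTL and stamping items with an expiry timestamp; the table name and attribute name are hypothetical:

```python
import time
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Tell DynamoDB which attribute holds each item's expiry time (epoch seconds).
dynamodb.update_time_to_live(
    TableName="click_samples",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expire_at"},
)

# When writing a sample, stamp it with its validity period (here, 90 days).
dynamodb.put_item(
    TableName="click_samples",
    Item={
        "sample_id": {"S": "abc-123"},
        "expire_at": {"N": str(int(time.time()) + 90 * 24 * 3600)},
    },
)
```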

Streams: we have not enabled streams to capture table activity in our business, but we consider DynamoDB Streams a very useful feature: it notifies related services or programs whenever data stored in a DynamoDB table changes (insert, modify, delete). For example, if we modify a field of one record, DynamoDB can capture the change to that field and write the before and after values into a stream record.
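Although we have not enabled it ourselves, turning on a stream is a single table update. A minimal boto3 sketch with a hypothetical table name:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# NEW_AND_OLD_IMAGES records both the before and after state of every
# item that is inserted, modified, or deleted.
resp = dynamodb.update_table(
    TableName="click_samples",
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
print(resp["TableDescription"]["LatestStreamArn"])  # ARN that consumers subscribe to
```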

Experience is the best teacher

Whenever we adopt an open-source framework or a managed service we run into "pits", which usually means we did not fully understand or respect its rules of use. DynamoDB, like any service, has its own rules. Below we share the problems we hit in actual use and how we solved them.

Data skew

When you create a DynamoDB table you must specify its primary key, which guarantees uniqueness, enables fast lookups, and increases parallelism. There are two kinds of primary key: a partition key alone, or a partition key plus a sort key, which forms a composite key (index) that uniquely identifies and retrieves an item by two fields. Under the hood, DynamoDB partitions data by the primary key value so that load is balanced and no single partition is overloaded; it also tries to spread primary key values across partitions "reasonably".
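For reference, a minimal boto3 sketch of creating a table with a composite primary key; the table name, attribute names, and capacity numbers are hypothetical:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Composite primary key: "campaign_id" as the partition key, "click_time" as the sort key.
dynamodb.create_table(
    TableName="click_samples",
    AttributeDefinitions=[
        {"AttributeName": "campaign_id", "AttributeType": "S"},
        {"AttributeName": "click_time", "AttributeType": "N"},
    ],
    KeySchema=[
        {"AttributeName": "campaign_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "click_time", "KeyType": "RANGE"},   # sort key
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
```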

At first we did nothing special with the primary key value, since DynamoDB feeds the partition key into an internal hash function whose output decides which partition stores the item. But as the system ran we noticed severe write skew, which degraded the table's read and write performance (the exact reason is discussed in detail later). Once we discovered the problem, we considered our options.

In the end we adjusted the business code: hash the primary key value when writing, and apply the same hash to the key condition when querying.
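A minimal sketch of that idea, assuming boto3 and hypothetical table, key, and attribute names:

```python
import hashlib
import boto3

# Hypothetical table whose partition key is a single string attribute "pk".
table = boto3.resource("dynamodb", region_name="us-east-1").Table("samples_by_key")

def hashed_key(raw_key: str) -> str:
    # Hashing the raw value spreads hot key ranges evenly across partitions.
    return hashlib.md5(raw_key.encode("utf-8")).hexdigest()

# Write: store the hashed value as the partition key, keep the raw value as an attribute.
table.put_item(Item={
    "pk": hashed_key("campaign-42|device-7f3a"),
    "raw_id": "campaign-42|device-7f3a",
    "clicked_at": 1622419200,
})

# Query: apply the same hash to the lookup key so the item can be found again.
item = table.get_item(Key={"pk": hashed_key("campaign-42|device-7f3a")}).get("Item")
```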

Hidden rules of automatic expansion

After we fixed the data skew, read and write performance recovered, but after running for a while it dropped again. We checked and the writes were not skewed. At the time we raised the provisioned write capacity to more than 60,000 per second, but to no avail: the actual write rate stayed around 20,000 per second. Eventually we found that we simply had too many partitions; the number of partitions DynamoDB maintained automatically in the background had reached 200, which severely hurt the table's read and write performance.

DynamoDB expands capacity automatically and supports arbitrary throughput, and it does so according to two rules that govern partition splitting: a single-partition size limit and a per-partition read/write performance limit.

Single partition size limit

DynamoDB maintains data partitions automatically, but each partition can hold at most 10 GB. When a partition exceeds this limit, DynamoDB splits it. This is also how data skew hurts you: when the data is badly skewed, DynamoDB will quietly keep splitting the skewed partition. We can calculate the number of partitions with the following formula:

Total data size / 10 GB, rounded up = total number of partitions

For example, if the table holds 15 GB in total: 15 / 10 = 1.5, rounded up to 2, so there are 2 partitions. If the data is not skewed and is evenly distributed, each of the two partitions stores 7.5 GB.

Read and write performance limit

Why does DynamoDB split partitions at all? To guarantee the user's provisioned read/write performance. How? Partly by keeping each partition's data within 10 GB, and partly by splitting a partition when it cannot deliver the provisioned throughput. DynamoDB defines each partition's read and write capacity as follows:

"write capacity unit: write capacity unit (WCU:write capacity units), calculated as the maximum 1KB per piece of data, up to 1000 writes per second."

Read capacity unit: read capacity unit (RCU:read capacity units), calculated as the maximum 4KB of each piece of data, with a maximum of 3000 reads per second.

In other words, each partition's maximum write and read capacity units are fixed, and a partition that needs more than its maximum is split. So we can also calculate the number of partitions with the following formula:

(provisioned read capacity / 3000) + (provisioned write capacity / 1000), rounded up = total number of partitions

For example, with a provisioned read capacity of 500 and a write capacity of 5,000: (500 / 3000) + (5000 / 1000) ≈ 5.2, rounded up to 6, so there are 6 partitions.

Note that partitions created by splitting a partition that has grown past 10 GB share the read and write capacity of the original partition; they do not each receive a fresh allocation from the table.

This is because the provisioned read and write capacity already determines the number of partitions; when a single partition hits the size ceiling, it is simply split into two new partitions within that existing budget.

That is why, when data is badly skewed, read and write performance drops sharply.
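As a quick sanity check, here is a small Python sketch that combines the two formulas above. Taking the larger of the two counts is our reading of the behaviour, not an official formula:

```python
import math

def estimated_partitions(table_size_gb: float, read_capacity: int, write_capacity: int) -> int:
    """Back-of-the-envelope partition estimate from the two rules described above."""
    by_size = math.ceil(table_size_gb / 10)                          # 10 GB per partition
    by_throughput = math.ceil(read_capacity / 3000 + write_capacity / 1000)
    return max(by_size, by_throughput)                               # assumed: the larger count wins

print(estimated_partitions(15, 0, 0))       # size rule alone: 15 GB            -> 2 partitions
print(estimated_partitions(0, 500, 5000))   # throughput rule: 500 RCU, 5000 WCU -> 6 partitions
```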

Hot and cold data

The problem above arose because we started out with a single table. Even without skew, the amount of data grows over time, and more and more partitions are naturally split off.

So we split our tables sensibly according to our business, separating hot and cold data into different tables. This has two major benefits:

Better performance: by the rules above, the amount of data in the hot table no longer grows without bound, so the number of partitions stabilises at a certain order of magnitude and read/write performance is preserved.

Lower cost: blindly raising the read and write capacity of a single table not only has little effect, it also drives the cost up sharply, and that increase is unacceptable to everyone. DynamoDB storage also costs money, so cold-table data can be moved to S3 or another persistence service and the DynamoDB table deleted, which is another way to cut costs.

Table limit

There is no limit on a table's size or item count; you can write to a table without restriction. However, a single AWS account is limited to 256 DynamoDB tables per region. For a company sharing one account, there is a risk of hitting the table-creation limit. So if the hot/cold-table strategy is in use, deleting cold tables not only reduces cost but also helps stay under the 256-table limit.

Attribute name length

The limits mentioned above are at most 1 KB of data per write unit and at most 4 KB per read unit. An item's size includes not only the bytes of its attribute values but also its attribute names, so attribute names should be kept as short as possible while remaining readable.

Using DynamoDB also costs money, mainly for writes and reads. We built a strategy that adjusts the read and write limits in real time according to actual traffic. As DynamoDB has evolved, Auto Scaling has also been introduced: it applies custom policies to adjust the write and read limits dynamically, which saves developers a great deal of work. Some of our services now use Auto Scaling, but given its limitations, the responsiveness of dynamic adjustment in practice is slightly lacking.
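For reference, a minimal sketch of wiring up Auto Scaling for a table's write capacity through the Application Auto Scaling API in boto3; the table name, capacity bounds, and 70% utilisation target are illustrative assumptions:

```python
import boto3

autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# Register the table's write capacity as a scalable target with lower and upper bounds.
autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/click_samples",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    MinCapacity=1000,
    MaxCapacity=60000,
)

# Attach a target-tracking policy that tries to keep consumed/provisioned
# write capacity around 70%.
autoscaling.put_scaling_policy(
    PolicyName="click-samples-write-scaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/click_samples",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
        },
    },
)
```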

We hope the content above has been helpful. If you would like to learn more or read more related articles, please keep following the database channel; thank you for your support.
