Which solution did 12306 adopt to handle high concurrency at peak traffic?

Updated 2025-02-24, from SLTechnology News&Howtos (shulou)

Today I would like to talk about which solution 12306 adopted to handle high concurrency at peak traffic. Many people may not know much about it, so I have summarized the following for you. I hope you get something out of this article.

News that 12306 has crashed again has been all over the Internet in recent days, and some people keep bashing 12306 online as garbage. I used to feel the same way before I knew better. Only after a senior colleague set me straight did I realize how capable the system actually is.

So what solution did 12306 adopt to solve the concurrency problem at peak traffic?

The 12306 website chose to rebuild on the Pivotal GemFire distributed in-memory computing platform. According to the system's operating records, just over 10 x86 servers now deliver the remaining-ticket calculation and query capacity that previously took dozens of minicomputers, and the longest single query time dropped from about 15 seconds to under 0.2 seconds, a speedup of more than 75 times.

12306 is one of the largest real-time transaction systems in the world, comparable to Amazon.com. The website comes under enormous pressure during holiday traffic peaks, especially the Spring Festival travel season.

GemFire is part of Pivotal's enterprise big data PaaS platform, which has three layers: the cloud infrastructure layer (Cloud Fabric), the big data infrastructure layer (Data Fabric), and the application development infrastructure layer (Application Fabric). GemFire belongs to the big data infrastructure layer, as does the Greenplum database. The cloud infrastructure layer is built on Cloud Foundry, and the application development infrastructure layer on technologies such as Spring Framework and RabbitMQ.

Before the rebuild, 12306 ran on a Unix minicomputer architecture. The rebuild moved it to a Linux/x86 server cluster architecture built on GemFire, in effect jumping three generations of computing architecture at once. Moving from minicomputers to a cluster of large-memory x86 servers not only improved performance by an order of magnitude but also cost much less.

In March 2012, the Railway Corporation (formerly the Ministry of Railways) began investigating how to rebuild 12306. In June 2012 it selected the Pivotal GemFire distributed in-memory computing platform. The first phase rebuilt the remaining-ticket query system, the main bottleneck of 12306; the code changes were completed in September and the system went live. During the 2012 National Day holiday, itself a peak period for online booking, users could clearly see the difference: they could log in to 12306, and although tickets were still hard to get, remaining-ticket queries were very fast. In October 2012, the second phase used GemFire to rebuild the order query system (where customers look up their own order records). During the 2013 Spring Festival booking peak, remaining-ticket queries stayed fast, and looking up one's own bookings and placing orders were fast as well.

Technical transformation solved the problem of concurrency at peak traffic, so the 12306 system no longer crashes as easily as it did in its early days. Pivotal GemFire's distributed in-memory data technology played a key role in the whole transformation.

According to statistics, during the Spring Festival travel peak in early 2012, 20 million people visited the 12306 website every day, with page hits peaking at 1.4 billion per day. The flood of simultaneous visitors nearly paralyzed 12306. As the contractor for the 12306 Internet ticketing system, the Institute of Electronic Computing Technology of the Chinese Academy of Railway Sciences urgently needed a solution.

According to the system's operating records, after the rebuild just over 10 x86 servers deliver the remaining-ticket calculation and query capacity of the dozens of minicomputers used before, and the longest single query time dropped from about 15 seconds to under 0.2 seconds, a speedup of more than 75 times. Under the extreme concurrency of the 2012 Spring Festival travel season, the old system was almost paralyzed. After the rebuild, the system supports tens of thousands of concurrent queries per second and reached a throughput of 26,000 queries per second at peak, a significant improvement in overall efficiency, as shown in the figure above.
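The figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check on the article's numbers; the variable names are illustrative, and the server count of exactly 10 is an assumption made to get a rough per-server load (the text only says a little over 10).

```python
# Figures quoted in the article; variable names are illustrative.
worst_case_before_s = 15.0  # longest single query before the rebuild (approximate)
worst_case_after_s = 0.2    # longest single query after the rebuild (upper bound)

speedup = worst_case_before_s / worst_case_after_s
print(f"worst-case speedup: at least {speedup:.0f}x")

peak_qps = 26_000           # peak throughput quoted in the article
assumed_servers = 10        # ASSUMPTION: the article says "more than 10"
per_server_qps = peak_qps / assumed_servers
print(f"average load per server at peak: about {per_server_qps:.0f} queries/s")
```

Because 0.2 seconds is an upper bound on the new worst case, 75 is a lower bound on the speedup, which matches the article's "more than 75 times".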

The order query system, as it operated before the rebuild, could sustain only 300 to 400 queries per second, and high-concurrency queries were possible only by splitting the data across multiple databases. After the rebuild, throughput reaches tens of thousands of queries per second, with query times held at about 20 milliseconds.
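One way to read these throughput and latency figures together is Little's law, which says the average number of requests in flight equals throughput multiplied by time in the system. The sketch below applies it to the quoted 20 ms query time; the sample throughput values are assumptions standing in for "tens of thousands", and the function name is hypothetical.

```python
def in_flight_requests(throughput_qps: float, latency_s: float) -> float:
    """Little's law: average requests in flight = arrival rate x time in system."""
    return throughput_qps * latency_s


latency_s = 0.020  # about 20 ms per query, as quoted in the article

# ASSUMPTION: sample points chosen to illustrate "tens of thousands" of queries/s.
for qps in (10_000, 20_000, 50_000):
    concurrent = in_flight_requests(qps, latency_s)
    print(f"{qps} queries/s at 20 ms -> about {concurrent:.0f} queries in flight")
```

The point of the calculation is that short query times keep the number of simultaneously open queries small even at very high throughput, which is part of why low latency and high concurrency go together.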

The new architecture scales flexibly and dynamically on demand: when concurrency rises, x86 servers can be added dynamically to keep response times in the millisecond range.
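A common way to make this kind of scale-out cheap is to hash keys into a fixed number of buckets and move only whole buckets when a server joins. The sketch below illustrates that general technique, not GemFire's actual implementation; the bucket count, server names, and function names are all assumptions.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 113  # ASSUMPTION: fixed bucket count chosen for this sketch


def bucket_of(key: str) -> int:
    """A key always maps to the same bucket, regardless of cluster size."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def assign_buckets(servers: list[str]) -> dict[int, str]:
    """Initial placement: spread buckets over servers round-robin."""
    return {b: servers[b % len(servers)] for b in range(NUM_BUCKETS)}


def add_server(assignment: dict[int, str], new_server: str) -> int:
    """Hand the new server its fair share of buckets; return how many moved."""
    server_count = len(set(assignment.values()) | {new_server})
    target = NUM_BUCKETS // server_count
    load = Counter(assignment.values())
    moved = 0
    for bucket in sorted(assignment):
        if moved >= target:
            break
        owner = assignment[bucket]
        if load[owner] > target:  # take only from servers above their fair share
            assignment[bucket] = new_server
            load[owner] -= 1
            moved += 1
    return moved


assignment = assign_buckets(["s1", "s2", "s3"])
moved = add_server(assignment, "s4")
load = Counter(assignment.values())
print(f"moved {moved} of {NUM_BUCKETS} buckets; load per server: {dict(load)}")
```

Only the buckets handed to the new server change owner, so most data stays where it is and the cluster can keep serving queries while it grows.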

Results this dramatic cannot come from technical tinkering. They require a new way of thinking, one that gives performance improvement real leverage. 12306 found that the GemFire distributed in-memory data platform is such a technology.

The technical principle of the GemFire distributed in-memory data platform is shown in the figure above. Using the virtualization technology of the cloud computing platform, the memory of many x86 servers is pooled into a memory resource pool of up to dozens of TB, and all data is loaded into memory for in-memory computation. The computation itself does not read from or write to disk; data is written to disk periodically, either synchronously or asynchronously. GemFire keeps multiple copies of the data across the distributed cluster, so if any machine fails, backup copies exist on other machines. Data loss is therefore usually not a concern, and the on-disk data serves as a further backup. GemFire supports persisting in-memory data to a variety of traditional relational databases, Hadoop stores, and other file systems.
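The combination described here (data served from memory, several in-memory copies, disk written later) can be sketched in a few lines. This is a toy model under stated assumptions, not GemFire's API: the class names, the ring-order choice of backup node, and the key format are all hypothetical.

```python
import zlib
from collections import deque


class Node:
    """One server's share of the pooled memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.memory: dict[str, int] = {}
        self.alive = True


class ReplicatedStore:
    """Keeps every entry in memory on several nodes; disk is written later."""

    def __init__(self, node_names: list[str], copies: int = 2) -> None:
        self.nodes = [Node(name) for name in node_names]
        self.copies = copies
        self.pending: deque = deque()   # write-behind queue
        self.disk: dict[str, int] = {}  # stands in for a disk store or database

    def _owners(self, key: str) -> list[Node]:
        # ASSUMPTION: primary chosen by hash, backups are the next nodes in order.
        first = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return [self.nodes[(first + i) % len(self.nodes)] for i in range(self.copies)]

    def put(self, key: str, value: int) -> None:
        for node in self._owners(key):
            if node.alive:
                node.memory[key] = value
        self.pending.append((key, value))  # the disk is not touched on this path

    def get(self, key: str) -> int:
        for node in self._owners(key):
            if node.alive and key in node.memory:
                return node.memory[key]
        raise KeyError(key)

    def flush_to_disk(self) -> None:
        """Run periodically: drain queued writes to the backing store."""
        while self.pending:
            key, value = self.pending.popleft()
            self.disk[key] = value


store = ReplicatedStore(["node-a", "node-b", "node-c"])
key = "G1|2013-02-08"                 # hypothetical key: train number and date
store.put(key, 42)                    # 42 remaining seats, invented for the demo
store._owners(key)[0].alive = False   # simulate failure of the primary copy
print(store.get(key))                 # still answered from the backup copy
store.flush_to_disk()
```

Reads and writes never wait on the disk in this model; the disk copy only matters if every in-memory copy of an entry is lost.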

As is well known, the bottleneck of current computing architectures is storage. Processor speed has grown by repeated doubling, in line with Moore's Law, while disk storage speed has improved very slowly, leaving a gap of roughly 100,000 times between the two. This makes it easy to understand why GemFire, by keeping computation in memory and off the disk, can improve system performance so much.
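A gap of that order is easy to reproduce with round-number latencies. The sketch below compares a random disk access with a main-memory access; both latency figures are assumed textbook orders of magnitude, not measurements from 12306, and the article itself does not say how its 100,000 figure was derived.

```python
# ASSUMPTION: rough order-of-magnitude latencies, not measured values.
memory_access_s = 100e-9  # about 100 nanoseconds for a main-memory access
disk_seek_s = 10e-3       # about 10 milliseconds for a random seek on a spinning disk

gap = disk_seek_s / memory_access_s
print(f"disk is roughly {gap:,.0f}x slower per random access")

# Upper bound on dependent random lookups per second for a single device.
disk_lookups_per_s = 1 / disk_seek_s
memory_lookups_per_s = 1 / memory_access_s
print(f"disk: ~{disk_lookups_per_s:,.0f}/s, memory: ~{memory_lookups_per_s:,.0f}/s")
```

Under these assumptions a query that needs even a handful of random disk reads spends tens of milliseconds waiting, while the same lookups in memory finish in well under a microsecond each.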

Based on the relationship between computing and storage, computing architectures can be divided into four generations:

The first generation is the single disk-based system: computation requires reading data from disk. Minicomputers and mainframes are the best examples, pushing the performance of a single system to its limit.

The second generation is the disk-based distributed cluster: computation still reads data from disk, but the data is spread across the disks of different servers to raise the processing capacity of the system as a whole. Many large Internet and e-commerce companies currently use distributed clusters of x86 servers, relying on large-scale x86 deployments to handle high-concurrency traffic.

The third generation is the single memory-based system: the entire database sits in memory, and computation does not read from disk. Overall performance depends on the performance of one machine. Traditional in-memory databases are systems of this kind. They solve the access-speed problem for enterprise applications, but they cannot scale to massive data volumes or massive concurrent access.

The fourth generation is the memory-based distributed cluster. GemFire is such a system, and parallel computing is one of its key technologies, so on top of in-memory computing it can scale performance linearly by adding servers.
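The parallel computing mentioned for the fourth generation usually takes the form of scatter-gather: each partition computes a partial answer over its own in-memory data, and the partial answers are combined. The sketch below shows that pattern with a thread pool standing in for separate servers; the partition contents, train numbers, and function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each partition holds remaining-seat counts for some trains.
partitions = [
    {"G1": 12, "G3": 0},
    {"D301": 45},
    {"K105": 7, "G7": 3},
]


def count_available(partition: dict[str, int]) -> int:
    """Runs next to the data: count trains in this partition with seats left."""
    return sum(1 for seats in partition.values() if seats > 0)


def trains_with_seats(parts: list[dict[str, int]]) -> int:
    """Scatter the query to every partition in parallel, then gather the counts."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return sum(pool.map(count_available, parts))


available = trains_with_seats(partitions)
print(available)  # prints 4
```

Each partition scans only its own share of the data, so adding partitions on new servers shrinks the work per partition, which is the intuition behind the near-linear scaling the article describes. In this sketch the threads only model the fan-out; in a real cluster each partial computation would run on a different machine.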

I hope the above gives you a clearer picture of which solution 12306 adopted to handle high concurrency at peak traffic.
