Exclusive | Decoding Ant Financial Services Group's TRaaS technical risk prevention and control platform

2025-03-28 Update From: SLTechnology News&Howtos

Ant says:

In the financial industry, the importance of risk prevention and control goes without saying. Ant Financial Services Group's geo-distributed active-active disaster recovery, with availability of up to 99.999%, and its ability to reconcile "accounts, vouchers, and actual funds" in real time across funds at the 100-billion scale are well recognized in the industry.

At this year's ATEC Technology Conference in Hangzhou, Ant Financial Services Group officially launched its technical risk prevention and control platform, TRaaS (Technological Risk-defense as a Service). TRaaS, which has been through numerous tests, is an immune system that combines Ant Financial Services Group's entire distributed architecture with the corresponding technical risk capabilities. It combines high availability and fund security capabilities with AIOps, giving the system the ability to heal itself and to build immunity.

This article gives a comprehensive interpretation of Ant Financial Services Group's technical risk prevention and control platform, TRaaS.

TRaaS belongs to the same "aaS" family as IaaS, PaaS and SaaS, and it certainly sounds high-tech. The meaning of this headline term from this year's ATEC Technology Conference is intriguing: Technological Risk-defense as a Service, that is, technical risk prevention and control delivered as a service. According to Ant Financial Services Group, the organizer of the ATEC conference, TRaaS is a "technical risk prevention and control platform" that has been long in the making. It combines high availability and fund security capabilities with AIOps, enabling the system to discover risks proactively and recover on its own, and thus forms a more intelligent and fine-grained technical risk prevention and control system.

In fact, Ant Financial Services Group had been developing TRaaS for several years. Only after four years of careful polishing and countless severe tests did the company announce it publicly at this ATEC conference, which shows how important TRaaS is to Ant Financial Services Group. The author therefore feels that an in-depth analysis and interpretation of TRaaS is warranted.

1. The Past and Present of TRaaS

"Ant has a team that quietly guards our systems and gives 120 percent to ensure business continuity. They are Ant's technical risk team. As everyone knows, risk prevention and control comes first for a financial system. How do we ensure the high availability of the financial system? How do we guarantee zero capital loss for financial business? These are the two major problems the technical risk team has to solve. With the rapid development of the financial industry, these two problems gradually grew into the broader field of technical risk prevention and control, which includes elastic capacity management, change risk prediction, capital risk identification and intelligent fault decision-making, and finally into a more intelligent and fine-grained technical risk architecture. At present, we have opened up our most mature technical risk prevention and control products, which are widely used inside Ant, on the Ant financial cloud. We will continue to explore AIOps and to support 24/7 intelligent operations through accumulated data, algorithms and experience. We therefore keep abstracting our technical risk capabilities into TRaaS and exporting it to financial institutions undergoing digital transformation, so that they can improve their technical risk prevention and control while upgrading to a distributed architecture, and truly make uncertain things certain." This is how Ant Financial Services Group tells the story of TRaaS.

As mentioned above, TRaaS was born out of practical experience with the Alipay system. It is a technical risk prevention and control platform that has been through many tests, such as "Singles Day", and has matured step by step.

In 2015, after the painful "527" incident at Alipay, Ant Financial Services Group learned its lesson and set up a technical risk SRE team responsible for risk prevention and control across its entire financial system. In the same year, Ant completed the construction of a fund security prevention and control system, implemented a geo-distributed active-active disaster recovery architecture, and established a disaster recovery drill mechanism.

In 2016, Ant Financial Services Group established the High Availability & Capital Security Architecture Group, the technical team that later stood behind TRaaS's ability to provide users with high availability and fund security. In the same year, the company began conducting surprise network-disconnection drills to build an adaptive disaster recovery architecture, which laid a solid foundation for the business continuity and high availability of TRaaS. Also in that year, Alipay's fund reconciliation was upgraded from T+H (hour-level) to real time.

In 2017, Alipay achieved fine-grained fault location, a prerequisite for later fault self-healing, while support for grayscale simulation further improved the robustness of the system. In the same year, Ant Financial Services Group also introduced red-blue attack and defense drills, which gave TRaaS its risk prediction capability.

In 2018, building on fine-grained fault location, the Alipay system achieved fault self-healing. Disaster recovery simulation and regression testing strengthened the system's disaster recovery capability, while AIOps brought artificial intelligence into risk prevention and control.

It was on this foundation that Ant Financial Services Group formally launched the TRaaS technical risk prevention and control platform at this year's Yunqi ATEC conference.

2. What Are the Strengths of TRaaS?

"The TRaaS architecture, which we internally call the 'immune system', works like the human immune system. Just as the immune system helps people recover quickly when they are sick, we combine Ant Financial Services Group's entire distributed architecture with the corresponding technical risk capabilities to provide our immune system, TRaaS. Through TRaaS, we can guarantee 99.999% availability, which relies on our three-location, five-center architecture. In addition, for fund security, which is the most critical concern, TRaaS can reconcile accounts, vouchers and actual funds internally in real time, within seconds. Most important of all, TRaaS has strong 'immunity', which lets us discover a risk within 5 minutes and heal it within 5 minutes." With these words, Hu Xi, Deputy CTO, Vice President and Chief Architect of Ant Financial Services Group, pointed out the three strengths of the TRaaS system: high availability, fund security, and immunity.

Strength 1: high availability of up to 99.999%

The importance of high availability to a financial system goes without saying. Alipay can process up to 256,000 transactions per second and runs on hundreds of millions of lines of code and tens of thousands of servers. It is a huge, constantly changing system that may grow to tens of billions of lines of code and millions of servers in the future. Structuring it sensibly, managing its complexity, and keeping it robust, agile and highly available is a great challenge for Ant Financial Services Group.

To meet this challenge, Ant Financial Services Group provides all-round business continuity and high availability guarantees through its self-developed financial-grade distributed architecture, SOFAStack, and its financial-grade distributed database, OceanBase. SOFAStack provides full-stack distributed architecture capabilities and, together with OceanBase, supports agile iteration on business requirements while keeping risk under control and meeting the needs of remote disaster recovery, low cost and rapid scaling.

For the high availability of a financial system, however, disaster recovery is even more critical. At this ATEC conference, Ant Financial Services Group presented its three-location, five-center architecture, in which five data centers are deployed across three cities. If one or two of the data centers fail, Alipay's underlying technical system switches all traffic from the failed city to the healthy data centers while keeping the data consistent with zero loss. According to Ant Financial Services Group, the three-location, five-center architecture achieves low transaction cost, unlimited scalability, a recovery point objective (RPO) close to 0, and a recovery time objective (RTO) of less than 30 seconds.
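
The traffic switch described above can be illustrated with a minimal sketch. It is not Ant Financial Services Group's implementation: the `DataCenter` class, the `rebalance` function, the data center and city names, and the even-split policy are all assumptions made for illustration, and the sketch ignores data replication entirely.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    city: str
    healthy: bool = True
    weight: float = 0.0  # share of traffic currently routed to this data center


def rebalance(centers):
    """Split traffic evenly across healthy data centers; failed ones get zero."""
    healthy = [dc for dc in centers if dc.healthy]
    if not healthy:
        raise RuntimeError("no healthy data center left")
    share = 1.0 / len(healthy)
    for dc in centers:
        dc.weight = share if dc.healthy else 0.0
    return {dc.name: dc.weight for dc in centers}


# Hypothetical layout: five data centers in three cities (placeholder names).
centers = [
    DataCenter("dc1", "city-a"), DataCenter("dc2", "city-a"),
    DataCenter("dc3", "city-b"), DataCenter("dc4", "city-b"),
    DataCenter("dc5", "city-c"),
]
before = rebalance(centers)

# Simulate city-a losing both of its data centers, then re-route.
centers[0].healthy = False
centers[1].healthy = False
after = rebalance(centers)
```

After the simulated failure, `dc1` and `dc2` carry no traffic and the three remaining data centers share it equally. A production system would also weigh capacity, latency and data locality, which this sketch leaves out.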

In addition, full-link stress testing loads the whole system with heavy traffic, at the level of a "Singles Day" peak, to test its availability at the limit of its capacity. The test results are then used to keep adjusting and optimizing the system.

Relying on the three-location, five-center active-active disaster recovery architecture and on full-link stress testing, TRaaS ultimately achieves availability of up to 99.999%, which means the system's annual downtime is only about 5 minutes.
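
The link between an availability percentage and annual downtime is simple arithmetic, sketched below. The function name `annual_downtime_minutes` is illustrative, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def annual_downtime_minutes(availability):
    """Maximum downtime per year, in minutes, allowed by an availability ratio."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for availability in (0.999, 0.9999, 0.99999):
    minutes = annual_downtime_minutes(availability)
    print(f"{availability:.3%} availability -> {minutes:.2f} minutes of downtime per year")
```

For 99.999% this works out to roughly 5.26 minutes per year, which is where the "about 5 minutes" figure comes from.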

Strength 2: real-time reconciliation of hundreds of billions in funds within seconds

The importance of fund security to a financial system goes without saying. For Alipay in particular, which can handle tens or even hundreds of billions in funds per second, fund security is a matter of life and death. The essence of fund security is ensuring that no error in the amount of funds can occur at any point in a business transaction. This involves three kinds of objects (people, applications and data) and five capabilities (fault emergency response, data support, risk measurement, grayscale drills and risk identification).

These capabilities can only be improved through continuous attack and defense drills. Ant Financial Services Group therefore began conducting surprise network-disconnection drills in 2016 and introduced red-blue attack and defense in 2017. The frequency of drills has gradually increased from once every month or two to once a day.

Through such continuous drills, Ant Financial Services Group's reconciliation capability has evolved from the initial T+1 (next-day) checks to today's real-time business reconciliation. The overall fund security prevention and control system also includes change management and control, automated regression, traffic simulation, fund security monitoring, emergency plans, and so on.

Chen Liang (nickname: Junyi), a researcher and head of the TRaaS platform, recalled how the fund prevention and control system evolved. At first, as at many banks, staff manually reconciled the transaction journal against the full-day ledger. Later, the process was automated: full database tables were exported and computed for checking. When business volume grew, T+H was introduced, which shortened the check cycle from days to hours and added exception management to the process. Finally, with the move to real-time business reconciliation, circuit-breaker decisions, fund immunity and intelligent monitoring were added, giving TRaaS its powerful ability to reconcile hundreds of billions in funds within seconds.
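
To make the idea of reconciling accounts, vouchers and actual funds more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the TRaaS implementation: the three dictionaries keyed by transaction ID, the function names `check_transaction` and `reconcile`, and the sample amounts are all hypothetical.

```python
from decimal import Decimal


def check_transaction(txn_id, ledger, vouchers, actuals):
    """Return mismatch descriptions for one transaction (empty list = consistent)."""
    amounts = {
        "ledger": ledger.get(txn_id),
        "voucher": vouchers.get(txn_id),
        "actual": actuals.get(txn_id),
    }
    problems = []
    missing = [source for source, amount in amounts.items() if amount is None]
    if missing:
        problems.append(f"{txn_id}: missing in {', '.join(missing)}")
    distinct = {amount for amount in amounts.values() if amount is not None}
    if len(distinct) > 1:
        problems.append(f"{txn_id}: amounts disagree")
    return problems


def reconcile(ledger, vouchers, actuals):
    """Check every transaction ID seen in any of the three sources."""
    report = []
    for txn_id in sorted(set(ledger) | set(vouchers) | set(actuals)):
        report.extend(check_transaction(txn_id, ledger, vouchers, actuals))
    return report


# Hypothetical sample data: t1 is consistent, t2 is not.
ledger = {"t1": Decimal("10.00"), "t2": Decimal("5.00")}
vouchers = {"t1": Decimal("10.00"), "t2": Decimal("5.01")}
actuals = {"t1": Decimal("10.00")}
report = reconcile(ledger, vouchers, actuals)
```

A batch job in the T+1 or T+H style would call something like `reconcile` over a full day or hour of records, whereas a real-time check would call something like `check_transaction` as each transaction completes and could feed any mismatch into a circuit-breaker decision.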

Strength 3: immunity that detects problems within 5 minutes and self-heals within 5 minutes

For a critical system, and a financial system in particular, it is almost impossible to know where the weaknesses are unless problems are deliberately provoked. Ant Financial Services Group's answer is to inject all kinds of faults into the system every day, covering more than a thousand of its application scenarios. This fault injection resembles a confrontation between red and blue forces in a military exercise, and so it is called "red-blue attack and defense". Through constant confrontation, the TRaaS system keeps getting more robust and develops what is called its "immunity".
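
The fault injection idea can be sketched in a few lines. This is a toy harness under assumptions, not the red-blue tooling used at Ant: `InjectedFault`, `with_fault_injection`, `resilient_call`, `run_drill` and the `query_balance` stub are hypothetical names introduced only for this example.

```python
import random


class InjectedFault(Exception):
    """Raised only by the drill harness, never by real business code."""


def with_fault_injection(func, failure_rate, rng):
    """Wrap func so that a fraction of calls fail with an injected fault."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault(f"injected fault in {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def query_balance(user_id):
    """Hypothetical business call used as the drill target."""
    return 100


def resilient_call(primary, fallback, *args):
    """Try the primary path; fall back when the drill injects a fault."""
    try:
        return primary(*args), "primary"
    except InjectedFault:
        return fallback(*args), "fallback"


def run_drill(rounds, failure_rate, seed=0):
    """Run a drill and count how often each path served the request."""
    rng = random.Random(seed)  # seeded so a drill can be replayed
    flaky = with_fault_injection(query_balance, failure_rate, rng)
    outcomes = {"primary": 0, "fallback": 0}
    for _ in range(rounds):
        _, path = resilient_call(flaky, query_balance, "user-1")
        outcomes[path] += 1
    return outcomes
```

Calling `run_drill(1000, 0.1)` injects faults into roughly one call in ten and reports how many requests were served by the fallback path. A drill "passes" in this toy setting if every request still gets an answer.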

The introduction of AIOps, or intelligent operations, not only makes TRaaS smarter at detection, location and self-healing, but also greatly reduces the workload of operations staff. Chen Liang gave several examples. In high-availability monitoring, the sheer number of monitoring points and data items increases the noise in the data. AI can filter out this noise fairly easily after a certain amount of training and pattern recognition, but doing so is very hard for people. Inside Alipay alone there are as many as 50,000 to 60,000 monitoring points, each of which can be configured with its own data. Drawing all of this data into visual charts by hand is an almost impossible task, while AI can do it easily. Chen Liang also said that using AI to identify correlations in monitoring data yields twice the result with half the effort, whether for fund monitoring, fault detection and precise location, or for risk prediction, rapid damage control and automated decision-making.
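
The article does not say which models are used to separate signal from noise. As a stand-in, the sketch below uses a simple statistical rule (a modified z-score based on the median absolute deviation) rather than a trained AI model, only to show what flagging unusual monitoring points means in practice. The function name, the threshold and the sample data are assumptions.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        # More than half the values equal the median; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical response-time samples (ms) from one monitoring point.
samples = [100, 102, 99, 101, 100, 98, 250, 101, 100, 99]
anomalies = flag_anomalies(samples)
```

With these samples only the spike at index 6 is flagged, while the small fluctuations around 100 are treated as noise. A learned model would replace the fixed rule but play the same role in the monitoring pipeline.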

3. What Does the Future Hold for TRaaS?

TRaaS, the distributed financial core suite and the big data + artificial intelligence platform are the most important releases in the four years since Ant Financial Services Group was established. They are the output of the technical and service capabilities the company has accumulated in financial systems over many years. They represent the open mindset of Ant Financial Services Group's 3.0 era and mark the highest level of its technology in the field of financial systems. They have in effect established Ant Financial Services Group's position as a technology leader in finance, and they are an important part of the new finance within the "five new" advocated by Jack Ma, chairman of Alibaba's board of directors. TRaaS is very important to Alipay, to Ant Financial Services Group and even to the whole Alibaba Group. We therefore have reason to believe that TRaaS will become Ant Financial Services Group's trump card in core financial systems, and that the journey ahead of it leads to the stars and the sea!
