Ant Financial Services Group's Data Quality Governance: Architecture and Practice

2025-02-23 Update From: SLTechnology News&Howtos

Abstract: Under the theme "The New Force of Digital Finance", the Ant Financial Services Group ATEC City Summit was held in Shanghai on January 4, 2019, as scheduled. At the financial intelligence sub-forum, Li Junhua, a senior data technology expert in Ant Financial Services Group's data platform department, gave a talk titled "Ant Financial Services Group's practice of data quality governance".

In the talk, Li Junhua introduced the data quality governance system, which acts as the immune system of Ant Financial Services Group's data architecture. He also covered how data quality governance is implemented, Ant Financial's governance practice, and the practical challenges it faces.

Li Junhua, senior data technology expert, data platform department, Ant Financial Services Group

This article will focus on the following three parts:

Overview of data governance

Data quality governance challenges

Practice of data quality governance

I. Overview of data governance

In recent years, Ant Financial Services Group has kept upgrading its data architecture to solve the problem of physical data islands. Today, Ant Financial and the rest of Alibaba Group share the same underlying platform, so the upgrade to the fifth-generation data architecture lowered the overall threshold for one-stop R&D and made it easy for every Ant Financial engineer to work with data on the platform. The architecture now handles physical data islands well; what the data governance system still has to address is logical data islands.

Before discussing data governance, consider the value of data. Previously, a dedicated team processed the data, deleted worthless data, and was responsible for bringing data online and taking it offline. Judging the value of data is hard, however, so most data was only ever brought online and never taken offline, and worthless data piled up. Today, Ant Financial is concerned not only with taking worthless data offline but also with maximizing the value of its data assets. It has a complete set of data asset levels and easy-to-use data asset models, so that teams are motivated to make full use of data assets and create more value. But if data is used while its quality is poor, the value of the data asset drops sharply.

Analysis of how data quality problems arise

Next, we focus on Ant Financial Services Group's ideas and solutions for data quality governance, and share two cases. The following figure shows an abstract view of the full data flow. Data quality problems can start at the source: when a business colleague makes a mistake while entering data, for example by filling in the wrong industry for a customer or mistyping a word, the error becomes a data quality problem, and such mistakes happen easily.

Data applications built on traditional database assets basically take data from the data source, process and analyze it, and then send the result out, that is, "come from the business, and finally return to the business". The current approach is very different. In the past, data collected from production was handed off once it had been processed; now many of Ant Financial's data applications feed their output back into the data system. The calculation of the Sesame Credit score, for example, involves many less visible scenarios whose data returns to the system after processing, and data quality problems can arise at every link in that loop.
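The entry errors described above (a wrong industry, a mistyped word) are the kind of problem a check at the source can catch before the data flows downstream. The following is a minimal sketch of such a check, not Ant Financial's implementation; the record fields, the `ALLOWED_INDUSTRIES` set, and the function name are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical list of accepted industry values. A real system would
# presumably load this from a governed dictionary table.
ALLOWED_INDUSTRIES = {"retail", "catering", "logistics", "manufacturing"}


@dataclass
class CustomerRecord:
    customer_id: str
    industry: str


def validate_record(record: CustomerRecord) -> list[str]:
    """Return a list of quality issues; an empty list means the record passed."""
    issues = []
    if not record.customer_id.strip():
        issues.append("customer_id is empty")
    # Normalize case and whitespace before checking against the dictionary.
    if record.industry.strip().lower() not in ALLOWED_INDUSTRIES:
        issues.append(f"unknown industry: {record.industry!r}")
    return issues
```

Calling `validate_record(CustomerRecord("c2", "retial"))` reports the mistyped industry instead of letting it reach downstream jobs.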

II. Challenges of data quality governance

Ant Financial's business form is shown on the left side of the following figure. Its business scenarios are no longer limited to statistical analysis: Sesame Credit scores, Huabei, Jiebei, and "310" loans are all supported and driven by data. The business has become an integration of "technology + data + algorithms" that aims to maximize value. At the same time, data quality governance faces many challenges, which come from the business, the data, and the users.

III. Practice of data quality governance

Thoughts on data quality management

Colleagues who work in financial services often feel this keenly: in the Internet finance era the life cycle of a business is much shorter and changes are very frequent, a pace far faster than that of a traditional bank. In addition, both Ant Financial Services Group and Alibaba now talk about "turning business into data and data into business": data and business develop together, and both have entered the difficult deep-water stage of development. In past years Ant Financial's business mostly ran on "T+1" data, but that architecture is no longer enough to support continued growth and the timeliness the business will need. Ant Financial also holds a very large amount of data, and the data business is driving an upgrade of its entire talent system. Besides the data and algorithm engineers, other technical staff now use data on the platform too, and they may understand the data differently, so ensuring data quality in a data-driven business is very important.

So how is data quality governance achieved? First, there must be a clear organization behind it. That organization is the soil in which a corporate culture keeps growing, and building a culture of data quality governance has to be a deliberate, organized, long-term effort. On the basis of organizational support and a quality culture, Ant Financial focuses on two flows: the R&D flow and the data flow.

In the financial sector, R&D flows are tightly and strictly regulated. Internet finance needs strong control as well, because its business form makes R&D cycles very short. Ant Financial now applies strong control to the R&D flow, using hierarchical control on its one-stop data R&D platform: once a requirement is raised it is graded and labeled, and then enters the process that matches its level. Levels are defined against a single set of standards, so that different R&D flows are graded consistently.

As for the data flow, once an application is released to production, most of the effort goes there. Every day the data must be collected from production into the processing platform, run through the algorithms, and returned to production. Ant Financial has built many capabilities along this data flow. If the source is contaminated and the problem is not contained, the further downstream it is repaired, the higher the cost.
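The grade-then-route idea in the R&D flow can be sketched as a small lookup. This is an illustration only: the level names, the grading criteria (`touches_funds`, `downstream_consumers`), and the step names are assumptions, since the talk does not describe the actual grading standard.

```python
from enum import Enum


class Level(Enum):
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical mapping from requirement level to the release steps it must pass.
PROCESS_BY_LEVEL = {
    Level.HIGH: ["code_review", "data_test", "gray_release", "post_release_check"],
    Level.MEDIUM: ["code_review", "data_test"],
    Level.LOW: ["data_test"],
}


def grade_requirement(touches_funds: bool, downstream_consumers: int) -> Level:
    """Grade a requirement against one shared standard (criteria are assumed)."""
    if touches_funds:
        return Level.HIGH
    if downstream_consumers >= 10:
        return Level.MEDIUM
    return Level.LOW


def required_steps(touches_funds: bool, downstream_consumers: int) -> list[str]:
    """Route a requirement to the process that matches its level."""
    return PROCESS_BY_LEVEL[grade_requirement(touches_funds, downstream_consumers)]
```

Keeping the grading function separate from the process table means every R&D flow is graded by the same rule, while the steps attached to each level can change independently.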

Based on these ideas, Ant Financial Services Group has done a lot of interesting work. The whole system is monitored while the data platform runs, so that a data quality failure can be repaired in time. Ant Financial has also worked on every stage from R&D to production: because many colleagues develop data applications on the platform, the threshold for using it has to be as low as possible.

For all data flows, four capabilities are being built: perception, identification, intelligent self-healing, and operations. The platform needs to perceive failed release tasks and data quality problems. It also needs to identify potential risks, because corrupted data has to be known about very quickly. Once a risk is identified, intelligent self-healing is required. The word "intelligent" is used because data processing tasks are often offline jobs whose production peak runs from the early hours of the morning until around 8 a.m., and during that window people have had to take part in quality assurance. Intelligent self-healing aims to have AI algorithms work alongside data processing, layering algorithmic capability on top of perception so that contaminated data can heal itself.

Finally, there is the operations capability. Data quality is not something displayed in the foreground. If quality is good enough, users do not notice it at all and no longer have to worry about whether the data can be used or whether they dare to use it. Data quality therefore matters a great deal for operations as well. In fact, data quality does not belong to R&D alone or to the business alone; solving it requires everyone's participation. This is the idea behind data governance.
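A very reduced sketch of the identification and self-healing steps is shown below. It assumes the perceived signal is a daily row count and uses a plain z-score in place of the AI algorithms the talk mentions; the fallback to the last known-good partition is also an assumption about what "self-healing" could mean, not a description of the platform.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[float], today: float, z_threshold: float = 3.0) -> bool:
    """Identification: flag today's value if it deviates strongly from recent history."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # With no variation in history, any different value is suspicious.
        return today != mu
    return abs(today - mu) / sigma > z_threshold


def choose_partition(history: list[float], today_count: float,
                     today_partition: str, last_good_partition: str) -> str:
    """Self-healing (simplified): serve the last good partition if today's looks corrupted."""
    if is_anomalous(history, today_count):
        return last_good_partition
    return today_partition
```

With a history of roughly 1,000 rows per day, a day that delivers only 100 rows is flagged and downstream readers are pointed at the previous partition, without anyone being paged during the early-morning peak.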

Ant data quality governance architecture

The following figure shows Ant Financial Services Group's data quality governance architecture. At the system layer, following the ideas above, the R&D phase concentrates on data testing, release control, and change management, with the emphasis on change. Data change management covers not only changes at the system layer but also the links to online systems. A change to an online data source now also changes the data jobs that depend on it, which makes data quality problems in those jobs more likely. The online R&D side provides interfaces to the data job system that tell users which online changes will affect data jobs. Ant Financial has invested heavily in release control. It currently has no full-time data testers; almost everyone is a full-stack engineer, so control during development itself may not be very strong, but release management and control is. All tests related to experience, specification, performance, and quality are carried out at this point.

The production stage concentrates on three system capabilities: quality monitoring, emergency drills, and quality management. Quality monitoring and alerting should exist in most data system architectures. Its role is like the brakes on a car, so it must be there. Ant Financial has also done something very interesting: data attack and defense drills, in which engineers deliberately create faults and then test whether the system can detect and repair them within a short time. This is a capability Ant Financial is currently focused on building. For quality management, applications are inspected regularly after release to production, at a frequency that depends on their level, to analyze whether data quality is affected. In short, at the system layer of the data quality architecture the original data remains very important, and it is now combined with machine learning to configure some of the related strategies automatically.

Data quality governance scheme

The following figure shows Ant Financial Services Group's data quality plan before, during, and after an event. The "before" part covers three stages: requirements, R&D, and pre-release. Here Ant Financial can now apply control, simulation, and gray release in advance. The "during" part concentrates on monitoring: problems are not to be feared, but they must be discovered by the system itself. To strengthen its defenses, Ant Financial introduced active attack exercises, and the attack and defense drills have exposed many of its own weaknesses. It also provides strong emergency response, with certain events triggering emergency plans. Ensuring data quality at this stage really means turning uncertain data risks into certainties. Data quality matters just as much after the event: it needs to be audited and measured with effective indicators and controls, so that weak points along the whole link are found and continuously improved.

Cases of data quality governance

Finally, here are two cases from Ant Financial Services Group's data quality governance:

Case 1: Under Ant Financial's data governance architecture, the release process is strongly controlled. Every script must be tested when it is submitted, then released online, and then tested again.
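The test, release, test-again sequence in Case 1 can be written as a small gate. This is a sketch under assumptions: the callables are placeholders for whatever the platform actually runs, and the rollback on a failed post-release test is an added assumption that the talk does not mention.

```python
from typing import Callable


def controlled_release(
    script: str,
    pre_release_test: Callable[[str], bool],
    deploy: Callable[[str], None],
    post_release_test: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> str:
    """Run a script through a strongly controlled release and return the outcome."""
    # Step 1: test at submission time; a failing script never reaches production.
    if not pre_release_test(script):
        return "rejected"
    # Step 2: release online.
    deploy(script)
    # Step 3: test again after release. Rolling back on failure is an assumption.
    if not post_release_test(script):
        rollback(script)
        return "rolled_back"
    return "released"
```

Passing the checks in as functions keeps the gate itself independent of how a given level of application is tested.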

Case 2: Data governance covers the whole link, and different links call for different drills. Data acquisition mainly moves data from one end to the other without processing it, so faults can be injected artificially at this point to check whether the data quality governance system detects the problem and responds. This is where both "attack" and "defense" come from. Data processing uses a different architecture that involves processing logic, so more thought has to go into what kind of fault to inject and what the system will face. In putting its data quality management system into practice, Ant Financial has invested a great deal of effort in attack and defense drills.
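A drill on the acquisition link can be sketched as two halves: an "attack" that corrupts a copy of the transported rows and a "defense" that must notice. The fault type (nulling a field), the null-rate monitor, and the threshold below are illustrative assumptions, not the faults or monitors Ant Financial uses.

```python
import copy


def inject_null_fault(rows: list[dict], field: str, every_nth: int = 2) -> list[dict]:
    """Attack: return a copy of rows in which every n-th row has `field` set to None."""
    corrupted = copy.deepcopy(rows)  # never touch the real data
    for i, row in enumerate(corrupted):
        if i % every_nth == 0:
            row[field] = None
    return corrupted


def null_rate(rows: list[dict], field: str) -> float:
    """Defense signal: share of rows in which `field` is missing or None."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row.get(field) is None) / len(rows)


def run_drill(rows: list[dict], field: str, max_null_rate: float = 0.1) -> bool:
    """Return True if the monitor detects the injected fault."""
    corrupted = inject_null_fault(rows, field)
    return null_rate(corrupted, field) > max_null_rate
```

A drill that returns False is the useful result: it means the monitor's threshold or coverage has a gap that a real fault could slip through.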
