How Nebula Graph Supports a Risk Control Business

2025-03-26 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

Many readers are not familiar with how Nebula Graph is used in a risk control business, so the editor has summarized the topic below. The content is detailed, the steps are clear, and it should serve as a useful reference. Let's take a look.

Business background

Unlike the credit business of traditional banks, where reviewing an application takes ten days to half a month, Internet financial lending is first of all characterized by very fast review. A user may submit a credit application on a mobile phone one second and receive the result the next. Its second characteristic is that the authenticity of the submitted data is hard to guarantee: the annual income, family relationships, and contacts a user fills in may all be false. These two characteristics have given rise to an industry of their own, the online underground industry, which in plain terms is the practice of exploiting lending platforms for fraudulent gain. Because online lending is highly anonymous, once an underground-industry account has committed fraud it is difficult to trace it back to a specific person. And because loans are approved so quickly, such accounts can easily scale up their applications and extract more money. Risk control for Internet finance therefore requires systematic screening of fraud scenarios.

So how can the underground industry be identified? By modeling the relationships between users and entities such as devices, GPS locations, and mobile phone numbers, running community detection, checking whether individuals in a community carry fraud risk, and investigating anti-fraud cases, loan risk can be controlled well. The risk control of Zhongan Insurance is currently built on Nebula Graph.

Why choose Nebula Graph?

When Zhongan Insurance began its technology selection, team members surveyed the graph databases on the market and first shortlisted JanusGraph and OrientDB.

Let's start with JanusGraph. Within the Zhongan technical team, JanusGraph had one major advantage: team members were familiar with it. Many engineers had used JanusGraph before, which lowered the cost of developing on and getting started with a graph database. Developers who have used JanusGraph know that it is a distributed graph database whose storage and indexing rely on open source components such as HBase (storage) and Elasticsearch (index). One of the company's business lines had already used JanusGraph on top of an online HBase storage service, but that business was relatively independent and had no strong dependencies on other core businesses. What works in one setting does not necessarily transplant well to another. The base data of the Zhongan Insurance risk control business is currently stored in HBase. If the risk control system used JanusGraph, importing tens of billions of graph records into HBase would put load on the HBase cluster, increase query latency spikes, and affect other business lines. JanusGraph is also slow at large-scale writes. In short, although JanusGraph has a low start-up cost, its strong dependence on other components and its poor import performance ruled it out.

During the graph database survey, we found that OrientDB ranked high on DB-Engines and was feature-complete. Performance testing showed that OrientDB works well on small datasets, but once the mock data exceeded 100 million records, the server side reported errors frequently. After consulting the official OrientDB documentation to no avail, Zhongan Insurance submitted an issue to the official OrientDB GitHub repository, but the response was slow. While submitting the issue, we also found that other community users had reported the same frequent server-side errors on large datasets two years earlier, and that issue was still open. As for large-scale write performance, the speed of writing vertices was acceptable, but writing edges reached only 1-2k QPS, and at that rate the initial graph build would take an unacceptably long time. In short, although OrientDB ranks high and is feature-complete, we did not choose it because of frequent server errors on large datasets, slow responses to community issues, and poor large-scale write speed.

Nebula Graph entered the selection when Zhongan Insurance, at the start of its database selection, consulted graph database users at other companies (JD.com, Ctrip), who unanimously recommended Nebula Graph. Nebula Graph therefore became one of the candidates for the Zhongan Insurance graph database. In actual testing, we found that Nebula Graph's large-scale write speed is very fast: test data in the production environment reached 100,000+ QPS. In addition, Nebula Graph's storage and indexing rely on the local RocksDB library rather than on other big data components, which matches our business needs. In terms of big data ecosystem support, Nebula Graph supports the mainstream Spark (nebula-spark-connector) and Flink (nebula-flink-connector). Nebula Graph also gave us a good experience in terms of community response and timely feedback.
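To make the gap between the two write rates concrete, here is a rough back-of-the-envelope sketch. The record count of 10 billion is an assumption chosen only for illustration (the article speaks of "tens of billions" of graph records without giving an exact figure), and the helper `load_time_hours` is hypothetical.

```python
def load_time_hours(num_records: int, qps: float) -> float:
    """Hours needed to write num_records at a sustained rate of qps records per second."""
    return num_records / qps / 3600


# Assumed for illustration only: 10 billion records to load.
RECORDS = 10_000_000_000

# Upper end of the 1-2k QPS figure reported for OrientDB.
slow_hours = load_time_hours(RECORDS, 2_000)
# The 100,000+ QPS figure reported for Nebula Graph.
fast_hours = load_time_hours(RECORDS, 100_000)

print(f"{slow_hours / 24:.1f} days vs {fast_hours:.1f} hours")
# prints "57.9 days vs 27.8 hours"
```

Under these assumptions the initial load drops from roughly two months to a little over a day, which is the kind of difference the selection above is weighing.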

A few more words on community support. Throughout the graph database survey, we found that graph databases have had far less time to mature than SQL databases such as MySQL and Oracle. As a result, when we run into a product problem with a graph database, search engines turn up little information. As with OrientDB's frequent errors above, if the community cannot give timely technical feedback, users may have to spend a lot of time reading and debugging the source code, which drives labor costs up sharply and makes the product poor value for money.

Nebula Graph, by contrast, gave Zhongan Insurance a very good experience in terms of community support and feedback. As a user since the earliest 1.0 release, Zhongan Insurance submitted many usage questions and bugs to Nebula Graph, and the Nebula developers replied and fixed the bugs promptly. When we ran into production deployment problems with 2.0, they again provided timely technical support. Compared with other graph database vendors, this deserves high praise, and it is the fundamental reason why we chose Nebula Graph as the graph database to support the Zhongan Insurance business.

Business practice of financial risk control

The figure below shows the architecture of Zhongan Insurance's Nebula Graph-based risk control system. It covers the data sources, data processing and cleaning, storage and computation, and graph service applications.

As the figure shows, the bottom layer consists of the business databases. Different kinds of business relationship data are stored in different business databases, including user accessories, devices, GPS, IP, and so on.

The layer above it is the data processing and cleaning layer for the graph database, which consists of an offline data warehouse and a real-time data warehouse. The offline data warehouse syncs data from the business databases daily through DataX and stores it in ODPS; Spark then reads the data and writes it into Nebula Graph. On the real-time side, data is written to Kafka through BLCS, Zhongan Insurance's internal monitoring component, cleaned and processed in the real-time data warehouse built with FlinkSQL, and finally written to Nebula Graph in real time through Flink. To ensure data consistency, the real-time data is verified every day. If it is inconsistent, the offline data is used to fill in what is missing.
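The daily consistency check described above can be pictured with a minimal sketch. The article does not say how the verification is implemented, so everything here is an assumption: records are modeled as a dictionary from a record key to a content checksum, and `reconcile` is a hypothetical helper, not part of any Zhongan or Nebula Graph API.

```python
def reconcile(offline: dict[str, str], realtime: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Compare the offline snapshot with the real-time copy.

    Returns (missing, mismatched), both taken from the offline side:
    - missing: keys present offline but absent in the real-time copy
    - mismatched: keys present in both whose checksums differ
    """
    missing = {k: v for k, v in offline.items() if k not in realtime}
    mismatched = {
        k: v for k, v in offline.items() if k in realtime and realtime[k] != v
    }
    return missing, mismatched


# Hypothetical data: edge keys mapped to content checksums.
offline_snapshot = {"e1": "x", "e2": "y", "e3": "z"}
realtime_copy = {"e1": "x", "e3": "zz"}

missing, mismatched = reconcile(offline_snapshot, realtime_copy)
# Records to rewrite from the offline data, treating offline as the source of truth.
patch = {**missing, **mismatched}
```

Treating the offline warehouse as the source of truth follows the article's statement that offline data is used to fill the gaps; how conflicts are actually resolved in production is not described.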

Above the data cleaning and processing layer is the storage and computing layer. Storage is, of course, Nebula Graph. For computation, the Spark Connector provided by Nebula Graph reads data from the graph database into Spark, where prediction models run on GraphX, and the results are finally written back to Nebula Graph.

Finally, Zhongan Insurance's microservice system connects the graph database to the graph applications on top, which provide graph exploration, risk control features, case investigation, prediction models, and other graph services.

Relation graph

Here is a brief explanation of the relationship graph used for community exploration within Zhongan Insurance. The relationship graph above is used to show how Zhongan identifies fraud scenarios with the graph database and how it derives risk control features from it.

There are two types of nodes in the figure above:

Person (blue node)

Mobile phone (green node)

There are three types of relationships:

Person -[application]-> Mobile phone

Mobile phone -[contact]-> Person

Person -[card binding]-> Mobile phone

At first glance the figure above shows two dense hotspots: each hotspot mobile phone number was listed by 50 or 60 people as the mobile number of their family contact. Common sense says this is implausible: most families in contemporary China have only one child, and even counting collateral relatives it is very unlikely that 50 or 60 people would all list the same number as their family contact. The people associated with such a number may therefore be members of a fraud ring. The ring may know that part of the loan scoring system scores the family contact's mobile number, and hopes to raise credit scores by linking to a number with a high credit score.
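The hotspot pattern described above amounts to counting how many distinct people point at the same phone number. The following is a minimal in-memory sketch, not the query Zhongan actually runs: the edge list, the `hotspot_phones` helper, and the threshold of 50 are all assumptions made for illustration.

```python
from collections import defaultdict


def hotspot_phones(contact_edges, threshold):
    """Return {phone: number of distinct people} for phones listed as a
    family contact by at least `threshold` distinct people.

    contact_edges is an iterable of (phone, person) pairs, one per
    "contact" relationship.
    """
    people_by_phone = defaultdict(set)
    for phone, person in contact_edges:
        people_by_phone[phone].add(person)
    return {
        phone: len(people)
        for phone, people in people_by_phone.items()
        if len(people) >= threshold
    }


# Hypothetical data: one phone listed by 50 people, another by 2.
edges = [("phone_hot", f"user_{i}") for i in range(50)]
edges += [("phone_ok", "user_a"), ("phone_ok", "user_b")]

hotspots = hotspot_phones(edges, threshold=50)
# hotspots == {"phone_hot": 50}
```

In a graph database the same count would be expressed as the in-degree or out-degree of the phone vertex over the contact edge type; the threshold itself is a risk control decision that the article does not specify.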

Based on these characteristics, we can query the size of a user's community and whether the user is in a suspected fraudulent community, and make a preliminary risk control judgment from that. Note that being in an abnormal relationship network does not by itself make someone a fraudulent user: being in an abnormal community is not a sufficient condition for judging a user to be a fraudster. The users themselves may not be fraudsters while their immediate relatives take part in intermediary agencies or gang fraud, in which case normal and abnormal users share the same relationship network.

Next, we need to look more closely at how "close" the user is to the center of the anomaly, that is, the path distance between them. Using Nebula Graph's built-in path search, we analyze this dispersion (whether the user sits near the anomalous point or at the edge of the community) to determine whether the user is suspected of fraud.
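The idea of path distance can be illustrated with a plain breadth-first search. This sketch stands in for the database's path search and is not Nebula Graph code; the edge list and the `hop_distance` helper are hypothetical, and edges are treated as undirected for simplicity.

```python
from collections import defaultdict, deque


def hop_distance(edges, source, target):
    """Return the number of hops on the shortest undirected path from
    source to target, or None if target is unreachable."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical community around one hotspot phone number.
edges = [
    ("user_a", "phone_hot"),
    ("user_b", "phone_1"),
    ("phone_1", "user_c"),
    ("user_c", "phone_hot"),
]

near = hop_distance(edges, "user_a", "phone_hot")  # 1: directly attached to the hotspot
far = hop_distance(edges, "user_b", "phone_hot")   # 3: at the edge of the community
```

A user one hop from the hotspot is much closer to the center of the anomaly than one three hops away, which is the distinction the dispersion analysis relies on. How distance is turned into a fraud judgment is a business rule the article does not detail.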

The mobile phone number is used here only as an example of how Zhongan uses Nebula to identify fraud scenarios. Zhongan Insurance also maintains relationship graphs for devices, IP addresses, and other entities, which are not discussed here.

Graph model prediction

This part introduces the following graph prediction models:

Connected Component (pre-loan)

Label Propagation (mid-loan)

Degree Statistics

The relationship graph introduced above is computed with the Connected Component algorithm, which is mainly used before the loan, when the user applies for credit.
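For readers unfamiliar with connected components, the following is a small union-find sketch of the idea. It is an in-memory illustration only; the article says the production computation runs on GraphX, and the edge data and function name here are hypothetical.

```python
def connected_components(edges):
    """Group vertices into connected components using union-find.

    edges is an iterable of (a, b) pairs treated as undirected.
    Returns a list of sets, one per component.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Hypothetical data: two applicants share phone_1, a third is unrelated.
edges = [("user_1", "phone_1"), ("user_2", "phone_1"), ("user_3", "phone_2")]
components = connected_components(edges)
sizes = sorted(len(c) for c in components)  # [2, 3]
```

The size of the component an applicant falls into is the "community size" used for the preliminary pre-loan check.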

Next is Label Propagation, which, unlike Connected Component, is used mostly during the loan. Label propagation starts from a given point Y and spreads outward to its related points. For example, a user in the loan book who is seriously overdue is marked with the overdue label Y. Combined with the established risk control rules, we then check which of the points associated with Y show similar overdue behavior, in order to determine whether they belong to a seriously overdue community. This is how the label propagation algorithm is applied during the loan.
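The spreading step described above can be sketched as follows. Note that this is a simplified seed-spreading variant written for illustration, not the full community-detection Label Propagation algorithm and not the code Zhongan runs; the `propagate_label` helper, the hop limit, and the sample data are all assumptions.

```python
from collections import defaultdict, deque


def propagate_label(edges, seed, max_hops):
    """Spread a label from `seed` to every vertex within `max_hops` hops.

    Returns {vertex: hops from seed} for all labeled vertices, so that
    downstream risk rules can weigh closer vertices more heavily.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not spread beyond the hop limit
        for nxt in adjacency[node]:
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


# Hypothetical chain: Y - phone_1 - user_b - phone_2 - user_c
edges = [("Y", "phone_1"), ("phone_1", "user_b"), ("user_b", "phone_2"), ("phone_2", "user_c")]
labeled = propagate_label(edges, seed="Y", max_hops=2)
# labeled == {"Y": 0, "phone_1": 1, "user_b": 2}
```

The labeled vertices are only candidates. As the text says, they still have to be checked against the established risk control rules for similar overdue behavior.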

The last algorithm is Degree Statistics, the degree computed over the whole graph, which is mainly used by risk control staff. When building risk control features, they may propose dozens or hundreds of graph features, which must then be validated against historical data to see which ones really identify fraudulent or seriously overdue users. Doing this validation with deep queries in a traditional data warehouse through ODPS is very inefficient, both in execution time and in the effort of writing the SQL. Instead, the vertex data is read from Nebula Graph into GraphX to compute the degree relationships over the full graph, and the 7-degree or 10-degree relationships are written back to ODPS as rows for the risk control staff, which helps them define risk control rules and complete risk control tasks more quickly.
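As an illustration of what a k-degree statistic written out "in the form of rows" might look like, here is a small sketch that counts, for every vertex, how many other vertices lie within k hops. The row layout `(vertex, k, count)` and the function name are assumptions; the article does not describe the actual output schema.

```python
from collections import defaultdict


def k_degree_rows(edges, k):
    """For each vertex, count the other vertices reachable within k hops.

    Returns a list of (vertex, k, count) rows sorted by vertex name.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    rows = []
    for start in sorted(adjacency):
        seen = {start}
        frontier = {start}
        for _ in range(k):
            # Expand one hop, keeping only vertices not visited before.
            frontier = {n for node in frontier for n in adjacency[node]} - seen
            if not frontier:
                break
            seen |= frontier
        rows.append((start, k, len(seen) - 1))  # exclude the start vertex itself
    return rows


# Hypothetical chain graph: a - b - c - d
edges = [("a", "b"), ("b", "c"), ("c", "d")]
rows = k_degree_rows(edges, k=2)
# rows == [("a", 2, 2), ("b", 2, 3), ("c", 2, 3), ("d", 2, 2)]
```

Running a separate traversal per vertex, as this sketch does, would not scale to a full production graph, which is why the article delegates the computation to GraphX.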

Future outlook

Version planning

At the time this topic was shared, Zhongan Insurance was running Nebula 2.0.1. Nebula v2.5.0 adds a watermark feature that prevents queries from taking up too much memory and dragging down the storage process. Zhongan Insurance will deploy version 2.5.0 in the test environment for verification. Once verification passes, the business lines will be switched to version 2.5.0 gradually.

More application scenarios

In the future, Nebula may also be used for the data lineage of tables and fields in the data warehouse and for managing task dependencies on the scheduling platform. Colleagues from the basic platform department of Zhongan Insurance are replacing the existing traditional implementation with Nebula Graph.

That concludes this article on how Nebula Graph supports a risk control business. I hope what the editor has shared is helpful to you. To learn more, please follow the industry information channel.
