By | Zheng Linfeng, Big Data Manager, Caitong Securities
WeChat for exchange | datapipeline2018
Caitong Securities Co., Ltd. is a full-service securities company established with the approval of the China Securities Regulatory Commission. Its predecessor, Zhejiang Financial Securities Company, was founded in 1993, and the firm is now an enterprise directly under the Zhejiang provincial government, mainly engaged in securities brokerage, securities investment consulting, proprietary securities trading, securities underwriting and sponsorship, margin financing and securities lending, consignment sale of securities investment funds and other financial products, and related businesses.
As an indispensable part of the company, the Caitong Securities data team manages 60 million to 100 million data records every day, providing stable and reliable data for services of different levels and types across the company.
In the era of artificial intelligence, in order to achieve batch data integration, the Caitong team abandoned its old integration tools and chose DataPipeline, completing task configuration that used to take 50 hours in 5 minutes. In addition, DataPipeline's distinctive springboard machine (jump server) setup reduces the data team's potential management burden.
Pain points of data teams at small and medium-sized securities firms
Of the roughly 120 brokerages across the country, about 40 medium and large securities companies have established independent foundational data departments, while at the nearly 80 small and medium-sized firms the data teams are still being set up or exist only as secondary departments.
For small and medium-sized securities firms, a major pain point is data integration. The data team's headcount is very limited, yet data integration demands high performance and stability, the development is tedious, requirements change frequently, and the work cannot be outsourced. For data integration, the older ETL tools used on most brokerage platforms work at single-table granularity, which makes extraction task development, scheduling management and testing inefficient. A defining characteristic of brokerage data flows is that tasks are driven by the clearing state: when the upstream production system finishes clearing, the data task starts pulling data into an intermediate database, and when that extraction completes, downstream consumption of the data is triggered.
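To make this clearing-driven dependency concrete, the sketch below expresses the chain as a simple poll-extract-notify loop in Python. It is only an illustration of the workflow described above, not Caitong's actual scheduler or a DataPipeline API; the function names, stub bodies and polling interval are hypothetical.

```python
import time

# Minimal sketch of the clearing-driven dependency chain described above.
# All names (check_clearing_status, extract_to_staging, notify_downstream)
# are hypothetical stand-ins, not DataPipeline or Caitong APIs.

def check_clearing_status(trade_date: str) -> bool:
    """Ask the upstream production system whether clearing for trade_date is done.
    In practice this would query a status table or an API; stubbed here."""
    return True  # stub: pretend clearing has finished

def extract_to_staging(trade_date: str) -> None:
    """Pull the cleared data from the production library into the intermediate database."""
    print(f"extracting data for {trade_date} into the staging database")

def notify_downstream(trade_date: str) -> None:
    """Signal downstream systems that staging data for trade_date is ready to consume."""
    print(f"downstream consumption triggered for {trade_date}")

def run_daily_batch(trade_date: str, poll_seconds: int = 60) -> None:
    # 1. Wait for the upstream system to finish clearing.
    while not check_clearing_status(trade_date):
        time.sleep(poll_seconds)
    # 2. Extract into the intermediate database.
    extract_to_staging(trade_date)
    # 3. Only after extraction completes is downstream consumption triggered.
    notify_downstream(trade_date)

if __name__ == "__main__":
    run_daily_batch("2019-06-03")
```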
For an enterprise-level brokerage platform, the initial data collection does not require complicated cleaning and transformation; it only needs to deliver source data to downstream partners for further processing.
Secondly, the commonly used extraction tools cannot control resources with sufficient granularity. Because the upstream systems are production-critical, the brokerage places strict limits on the resources data collection may consume, and the firm's early-warning mechanism typically starts alerting once system traffic exceeds roughly 30%.
The data consumers have no verification rules or redundancy mechanisms of their own, so all of the pressure falls on the source-side data layer. As the volume of managed data grows, the risk of problems with source data grows with it, forcing the data team to file incident tickets.
In addition, data security is the top priority for a financial enterprise, so core-system data is isolated behind network isolation gateways. With the old data integration tools, their architecture forced the data team's entire service onto the internal network; whenever a task failed, the team had to go on site to an intranet machine to intervene, making operation and maintenance very difficult.
Solution
We (Caitong Securities) chose to work with DataPipeline, a leader in real-time data pipeline technology, breaking the constraints that traditional tools place on ETL. On top of DataPipeline's open underlying platform, Caitong Securities developed monitoring and alerting, data verification, personalized scheduling and other functions. Combining the product with its open API, we built a data integration scheme suited to the securities industry's application scenarios.
Batch accelerated extraction
In the big data era the data processing flow has changed: instead of the earlier single-table collect-clean-transform (ETL) approach, data is now simply collected and loaded into the warehouse without transformation, and all cleaning and transformation is done with big data technology after the data lands (an ELT approach).
At present, most collection tools on the market still work at single-table granularity and require visual transformation and cleaning steps, which wastes unnecessary time.
DataPipeline adapts to this need by adopting batch collection, gathering dozens or even hundreds of tables from the same system at the same time, which greatly improves our (Caitong Securities') data collection efficiency.
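As a rough illustration of what task-level, multi-table collection means compared with configuring each table by hand, the sketch below enumerates all tables of one source system in a single task definition. The connection strings, table names and the copy_table helper are assumptions made for the example, not DataPipeline's real configuration model.

```python
# Minimal sketch of batch (task-level) collection: one task definition covers
# every table in the source system, instead of one hand-built job per table.
from typing import Iterable

def copy_table(source_dsn: str, target_dsn: str, table: str) -> None:
    """Copy one table from the source library to the destination as-is (no transform)."""
    print(f"copying {table}: {source_dsn} -> {target_dsn}")

def run_batch_task(source_dsn: str, target_dsn: str, tables: Iterable[str]) -> None:
    # A single task enumerates all tables of the same source system;
    # adding a table means appending a name, not building a new job.
    for table in tables:
        copy_table(source_dsn, target_dsn, table)

if __name__ == "__main__":
    run_batch_task(
        source_dsn="oracle://prod-clearing",                     # hypothetical source
        target_dsn="hive://ods",                                 # hypothetical destination
        tables=["trade_detail", "position", "client_account"],   # hypothetical tables
    )
```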
Resource monitoring
The old data integration tools and other extraction tools fully unleash the extraction process and deliver good extraction speed, but because there is no way to apply unified task-level control, they put heavy pressure on the upstream systems' databases.
Using traditional integration tools, we could consume up to 50% of the production repository's capacity, with per-second traffic on a single library approaching 100,000 records, but this triggered the upstream systems' early warnings. To keep the production systems safe and stable, the collection system must perform peak shaving and rate limiting.
DataPipeline defines a double threshold of record count and traffic for collection, and because the limit applies to the total across all tables under a task, this granularity is better suited to an enterprise-level unified collection tool and safeguards the security of enterprise applications.
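The effect of such a task-level double threshold can be sketched roughly as follows: a single limiter tracks both rows per second and bytes per second for all tables in the task, and extraction pauses when either limit would be exceeded. This is an illustrative assumption about the mechanism, not DataPipeline's implementation, and the threshold values are hypothetical.

```python
import time

class TaskRateLimiter:
    """Illustrative task-level limiter enforcing two thresholds at once:
    rows per second and bytes per second, shared by every table in one task.
    (Individual batches must be smaller than the thresholds themselves.)"""

    def __init__(self, max_rows_per_sec: int, max_bytes_per_sec: int):
        self.max_rows = max_rows_per_sec
        self.max_bytes = max_bytes_per_sec
        self.window_start = time.monotonic()
        self.rows = 0
        self.bytes = 0

    def acquire(self, batch_rows: int, batch_bytes: int) -> None:
        """Block until sending this batch keeps the task under both limits."""
        while True:
            now = time.monotonic()
            if now - self.window_start >= 1.0:
                # A new one-second window: reset the shared counters.
                self.window_start, self.rows, self.bytes = now, 0, 0
            if (self.rows + batch_rows <= self.max_rows
                    and self.bytes + batch_bytes <= self.max_bytes):
                self.rows += batch_rows
                self.bytes += batch_bytes
                return
            # Either threshold would be exceeded: wait for the window to roll over.
            time.sleep(0.05)

# Usage: every extraction thread of the task shares one limiter, so the
# combined load placed on the upstream production library stays bounded.
limiter = TaskRateLimiter(max_rows_per_sec=20_000, max_bytes_per_sec=8_000_000)
limiter.acquire(batch_rows=500, batch_bytes=200_000)  # hypothetical batch size
```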
Springboard machine implementation
As a financial enterprise, data security is the top priority, so core-system data is isolated behind network isolation gateways. Extracting data quickly across these different network environments requires a springboard (jump server) mode.
With the springboard approach, DataPipeline has the springboard machine handle the data relay, while the overall collection control console is deployed outside the intranet, so that problems can be managed and investigated directly from the external environment.
It is worth mentioning that DataPipeline is the only company in the market that can do this.
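As a rough sketch of the springboard idea, the example below routes a connection to the isolated core-system database through a jump host using SSH port forwarding (the sshtunnel Python package), so the collector and its console can stay on the external network. This is not DataPipeline's actual relay mechanism, and every hostname, port and credential is hypothetical.

```python
# Minimal sketch of the springboard idea using SSH port forwarding
# (the sshtunnel package), not DataPipeline's actual relay mechanism.
# All hostnames, ports and credentials below are hypothetical.
from sshtunnel import SSHTunnelForwarder

def extract_through_jump_host() -> None:
    with SSHTunnelForwarder(
        ("jump-host.dmz.example", 22),                     # springboard machine in the DMZ
        ssh_username="etl",
        ssh_pkey="/home/etl/.ssh/id_rsa",
        remote_bind_address=("core-db.intranet", 1521),    # isolated core-system database
    ) as tunnel:
        # The collector and its control console stay on the external network;
        # they only ever connect to the local end of the tunnel.
        local_port = tunnel.local_bind_port
        print(f"connect the extraction task to 127.0.0.1:{local_port}")
        # ... run the extraction against 127.0.0.1:local_port here ...

if __name__ == "__main__":
    extract_through_jump_host()
```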
Considerations in the era of artificial intelligence
What brokers used to pursue was high-quality, readily usable data (structured data), such as quoted stock prices and economic indicators. In the era of artificial intelligence, basic data with more dimensions and in larger volumes (structured or unstructured) matters more, so more tables need to be collected and the data is spread across more business systems. Each system also uses a different database type, so heterogeneous databases need to be extracted into a designated target database.
Message middleware is widely used in the industry; DataPipeline builds a middleware layer between the upstream data sources and the downstream databases and uses a general middleware architecture to unify unstructured and structured data.
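One way to picture such a unified middleware layer is the sketch below: records from heterogeneous sources, structured rows and unstructured text alike, are wrapped in one common envelope and published to a message queue, so downstream writers see a single format. It uses the kafka-python client for illustration only; the broker address, topic name and envelope fields are assumptions, not DataPipeline's internal schema.

```python
# Minimal sketch of a unified middleware layer: records from heterogeneous
# sources are wrapped in one common envelope and published to a message queue.
import json
import time

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka.example:9092",  # hypothetical broker address
    value_serializer=lambda record: json.dumps(record).encode("utf-8"),
)

def publish(source_system, object_name, payload):
    """Wrap structured rows and unstructured blobs in the same envelope."""
    envelope = {
        "source": source_system,     # e.g. "oracle-counter", "file-research"
        "object": object_name,       # table name or file path
        "captured_at": time.time(),  # capture timestamp
        "payload": payload,          # a row dict, or raw text for unstructured data
    }
    producer.send("unified-ingest", envelope)

# A structured row from a relational source and an unstructured document
# travel through the same channel in the same shape.
publish("oracle-counter", "trade_detail", {"trade_id": 1, "price": 10.2})
publish("file-research", "/reports/2019-06-03.txt", "free-text research note ...")
producer.flush()
```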
Efficient service and visible results
The work of DataPipeline's R&D team does not stop at product delivery. After delivering the product, the DataPipeline team responded quickly to the particular characteristics of our industry and, in line with its customer-first principle, provided Caitong with high-quality, timely service.
We discussed requirement optimizations with DataPipeline in March, and the revised version came out soon afterwards; it meets the specific needs of the brokerage industry well.
The results of this efficiency are no surprise: on structured data extraction alone, DataPipeline leaves the older integration tools far behind. With the old tools, configuring extraction for hundreds of tables took 50 hours; with DataPipeline's batch collection it can be completed in 5 minutes.
Conclusion
Because data in the financial industry must be synchronized and consolidated, ETL faces very high requirements for performance and stability; at the same time ETL development is tedious, changes frequently and cannot be outsourced, which has become a pain point for every securities firm. As a typical brokerage, Caitong Securities has used DataPipeline to achieve more agile, more efficient and simpler integration services, such as real-time data fusion and data management from complex heterogeneous sources to destinations, effectively resolving the pain points of small and medium-sized securities firms and preparing fully for the arrival of a new era.
-end-