
An Introduction to DataSphere Studio, the One-Stop Data Application Development and Management Portal

Updated: 2025-02-22. Source: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31.

This article introduces DataSphere Studio, the one-stop data application development and management portal: what it is, why it is needed, its core concepts, and the data application components it integrates.

"DataSphere Studio (DSS for short) is a one-stop data application development and management portal developed by WeBank. Built on a plug-in integration framework and the computing middleware Linkis, it can be connected easily to various upper-layer web systems, making data development simple and easy to use."

01

-

What is DSS?

DataSphere Studio (DSS for short) is positioned as a data application development portal that closes the loop over the whole data application development process. Under a unified UI, its graphical drag-and-drop workflow development experience covers the entire process: data import, desensitization and cleansing, analysis and mining, quality inspection, visualization, scheduled execution, and data output to applications.

Through its plug-in integration framework, DSS lets users customize and extend it easily and integrate various web systems simply and quickly, so that all of a user's business needs can be met on one unified page.

As needed, users can simply and quickly replace the functional components already integrated by DSS, or add new ones.

Building on the connection, reuse and simplification capabilities of Linkis computing middleware, DSS natively offers financial-grade execution and scheduling with high concurrency, high availability, multi-tenant isolation and resource control.

02

-

Why do I need DSS?

With the wide adoption of big data technology, data application development is no longer just a matter of producing a handful of reports. How to make business and data interact quickly, how to generate reports quickly and efficiently, and how to support business decisions are core demands of almost every enterprise. In reality, though, business users facing a large number of data application systems are often at a loss and do not know which to choose.

The following six pain points trouble almost every enterprise:

1. There are many data application systems but no unified user entry point, so the user experience feels fragmented.

2. A business process involves multiple cooperating systems, and users must switch between systems frequently to get their work done.

3. The boundaries between data application systems are unclear. Overlapping functions waste a great deal of manpower and make cooperation between systems difficult, and users must spend time investigating and comparing repeatedly before settling on a solution.

4. Cross-departmental and cross-business data dependence relies entirely on verbal agreements about when data will be ready. If upstream data is delayed, the delay cascades downstream and can lead to a data disaster.

5. Sharing data and information between systems requires pairwise development and adaptation, with complex invocation and high coupling.

6. Without a unified integration framework, integration between systems requires many kinds of custom development and adaptation.

03

-

Core concepts of DSS

The five core concepts put forward by DSS focus on solving the six pain points mentioned above.

1. One-stop. One-stop is the first step by which DSS increases business users' active participation in data development. DSS provides a one-stop data application development and management interface, so users no longer need to ask around and hold discussions just to find out whether a tool exists that meets their needs. All data development can be done by finding the right component in DSS.

DSS is highly integrated. The systems integrated in the latest open source version are:

Data development and exploration: Scriptis

Data visualization: Visualis (a secondary development based on Davinci, the open source project from CreditEase)

Data quality: Qualitis

Scheduling system: Azkaban

The DSS plug-in framework design pattern allows users to quickly replace each Web system that DSS has integrated. For example, replace Scriptis with Zeppelin and Azkaban with DolphinScheduler.

With DSS as the one-stop data application development portal, users develop the habit of turning to DSS first when they have a need, and of exploring its other functional components when they do not.

2. Fully connected. On the DSS workflow drag-and-drop editing page, every data application integrated by DSS appears as a workflow node. Each node corresponds to one system function, so functional boundaries are clear and users no longer have to choose among overlapping tools.

DSS workflow nodes can embed the front-end interface of the integrated data application system, so users can edit and modify all business functions within a single workflow page.

DSS workflows let users connect multiple business functions from a business perspective and organize them into workflows that support both real-time execution and scheduled execution. The whole data application development process can be completed by dragging and dropping.

At WeBank, DSS workflows have shortened the iteration cycle of business data applications from 1 week to 1 day, a 600% increase in efficiency.

DSS workflows let users implement business requirements simply and quickly, and also help them understand the business better.
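To make the idea of workflow nodes concrete, here is a minimal Python sketch of a workflow as a directed acyclic graph whose nodes run in dependency order. It is only an illustration under stated assumptions: the `Workflow` class, the node names and the callables are hypothetical and are not part of the DSS API.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Sequence


class Workflow:
    """Toy workflow: each node wraps one system function (hypothetical, not the DSS API)."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], None]] = {}
        self._upstream: Dict[str, List[str]] = {}

    def add_node(self, name: str, action: Callable[[], None],
                 after: Sequence[str] = ()) -> None:
        """Register a node and the names of the nodes it depends on."""
        self._actions[name] = action
        self._upstream[name] = list(after)

    def run(self) -> List[str]:
        """Run every node after all of its upstream nodes; return the execution order."""
        order = list(TopologicalSorter(self._upstream).static_order())
        for name in order:
            self._actions[name]()
        return order


# Each lambda stands in for a call to an integrated system.
log: List[str] = []
wf = Workflow()
wf.add_node("import_data", lambda: log.append("import"))
wf.add_node("clean", lambda: log.append("clean"), after=["import_data"])
wf.add_node("quality_check", lambda: log.append("check"), after=["clean"])
wf.add_node("dashboard", lambda: log.append("display"), after=["quality_check"])
order = wf.run()
```

Because every node declares only its upstream nodes, the same structure can be executed immediately or handed to a scheduler.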

3. Plug-and-play. Plug-and-play is the most important feature of DSS as a data application integration framework. DSS works like a socket: with its plug-and-play design, an external system can be integrated quickly with only a simple adaptation and almost no intrusion into the original system.

Through its plug-in integration framework, DSS lets users customize and extend it easily and integrate various web systems simply and quickly, so that all of a user's business needs can be met on one unified page.

Through plug-and-play, the functional components of WeDataSphere remain independent of each other with clear system boundaries, yet can be combined organically to form WeDataSphere's one-stop, fully connected big data experience.
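The socket analogy can be sketched in a few lines of Python. This is a hypothetical illustration rather than DSS code: the `Portal` class, the `scheduler` slot name and both adapter classes are assumptions, and the adapters only return strings instead of calling a real scheduling system.

```python
from typing import Dict, Protocol


class Scheduler(Protocol):
    """What the portal expects from any scheduling component (hypothetical)."""

    def publish(self, workflow_name: str) -> str: ...


class AzkabanAdapter:
    def publish(self, workflow_name: str) -> str:
        # A real adapter would call the scheduler here; this one only labels the result.
        return f"azkaban:{workflow_name}"


class DolphinSchedulerAdapter:
    def publish(self, workflow_name: str) -> str:
        return f"dolphinscheduler:{workflow_name}"


class Portal:
    """Holds one component per slot; replacing a component means plugging in another."""

    def __init__(self) -> None:
        self._slots: Dict[str, Scheduler] = {}

    def plug(self, slot: str, component: Scheduler) -> None:
        self._slots[slot] = component

    def publish(self, slot: str, workflow_name: str) -> str:
        return self._slots[slot].publish(workflow_name)


portal = Portal()
portal.plug("scheduler", AzkabanAdapter())
first = portal.publish("scheduler", "daily_report")

# Swapping the component does not touch the portal or the workflow.
portal.plug("scheduler", DolphinSchedulerAdapter())
second = portal.publish("scheduler", "daily_report")
```

The portal depends only on the small `Scheduler` contract, which is why the component behind a slot can be replaced without changing the caller.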

4. Context. What is context? It is all the information necessary to keep an operation going. If you are reading three books at the same time, the page you have reached in each book is the context you need to continue reading it.

DSS context solves the problem of sharing data and information across the nodes of a DSS workflow that belong to different systems. For example, suppose system B needs a piece of data generated by system A. The usual approaches are either that system B calls a data access interface developed by system A, or that system B reads data that system A has written to shared storage.

DSS context is built on the WorkflowContext implemented by Linkis computing middleware. It allows each connected external system to share node information and node data with the nodes of other external systems, acting either as a sharing node or as a reading node. External systems no longer need pairwise adaptation, which reduces call complexity and coupling between systems.

With the help of DSS context, the components of WeBank's WeDataSphere have been completely decoupled, and the complexity of each functional component has been reduced by at least 30%.
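The sharing-node and reading-node idea can be sketched as follows. This is a minimal in-memory stand-in written for illustration: the `WorkflowContext` class below, its `share` and `read` methods, and the node and table names are all assumptions and do not reproduce the Linkis implementation.

```python
from typing import Any, Dict, Tuple


class WorkflowContext:
    """In-memory stand-in for a context store shared by all nodes of one workflow."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}

    def share(self, node: str, key: str, value: Any) -> None:
        """Called by a sharing node to publish a piece of its output."""
        self._store[(node, key)] = value

    def read(self, node: str, key: str) -> Any:
        """Called by a reading node to fetch what another node published."""
        return self._store[(node, key)]


def system_a_node(ctx: WorkflowContext) -> None:
    # System A publishes the name of the table it produced.
    ctx.share("a_export", "result_table", "dwd.orders_clean")


def system_b_node(ctx: WorkflowContext) -> str:
    # System B never calls system A; it only asks the context.
    table = ctx.read("a_export", "result_table")
    return f"SELECT COUNT(*) FROM {table}"


ctx = WorkflowContext()
system_a_node(ctx)
query = system_b_node(ctx)
```

Neither system knows about the other's interfaces. Both depend only on the context, which is what removes the pairwise adaptation.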

5. Signalling. Cross-departmental and cross-business data dependence has long been recognized as a major problem in the industry. For example, the data mart of Department B depends on part of Department A's DWD (Data Warehouse Detail) layer. How can Department B be sure to start its data processing only after Department A's processing has finished?

The usual practice is for both parties to agree on a time window, with Department A guaranteeing that the data will be ready by then. The buffer built into that window greatly reduces the timeliness of data processing, and if Department A's processing is delayed, the downstream jobs fail in a chain reaction.

As a data application development portal, DSS proposes a signal-based solution to data dependence. A data application system connected to DSS only needs to add a signal node in front of its own nodes to achieve data-dependent, coordinated execution across businesses and systems.

Through DSS signalling, WeBank has made cross-system data dependence between businesses simple, clear and efficient, increasing business data output by 30% on average and reducing the data delay rate by 90%.
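The difference between an agreed time window and a signal can be shown with a small Python sketch. It is an assumption-laden illustration, not DSS code: the `SignalBoard` class and the topic name are hypothetical, and threads stand in for two departments' workflows.

```python
import threading
from typing import Dict, List


class SignalBoard:
    """Toy signal exchange: senders set a topic, receivers block until it is set."""

    def __init__(self) -> None:
        self._events: Dict[str, threading.Event] = {}
        self._lock = threading.Lock()

    def _event(self, topic: str) -> threading.Event:
        with self._lock:
            return self._events.setdefault(topic, threading.Event())

    def send(self, topic: str) -> None:
        self._event(topic).set()

    def wait(self, topic: str, timeout: float) -> bool:
        """Return True once the signal arrives, or False if the timeout expires."""
        return self._event(topic).wait(timeout)


board = SignalBoard()
results: List[str] = []


def department_b() -> None:
    # Department B starts only when Department A says its data is ready.
    if board.wait("dept_a.dwd.ready", timeout=5.0):
        results.append("B processed")
    else:
        results.append("B timed out")


worker = threading.Thread(target=department_b)
worker.start()
results.append("A finished")    # Department A completes its processing...
board.send("dept_a.dwd.ready")  # ...and then signals downstream.
worker.join()
```

Department B starts as soon as the signal arrives, with no idle buffer, and a delay in Department A simply delays the signal instead of letting B run on incomplete data.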

04

-

Core design concept of DSS

AppJoint, the socket of DSS's plug-and-play architecture, is the cornerstone on which DSS builds its one-stop, fully connected, plug-and-play and context capabilities. AppJoint is the core concept that lets DSS integrate all kinds of upper-layer web systems simply and quickly.

What is AppJoint?

AppJoint (application joint) is based on Linkis computing middleware. It defines a unified, standardized set of front-end and back-end access specifications so that external data applications can connect to DSS simply and quickly. The four AppJoint specifications make it clear and convenient for a data application system to connect to DSS.

The Security specification and the Project specification are the core abstractions behind the one-stop experience.

The Security specification solves the cross-domain login problem between DSS and the front end and back end of the external system.

The Project specification connects the organizational structure and permission system of DSS with those of the external system, and is the general standard for collaborative development in DSS.

The NodeService specification and the NodeExecution specification are the cornerstones of full connectivity.

The NodeService specification connects DSS workflow nodes with external systems.

The NodeExecution specification handles the interactive execution of tasks between DSS workflow nodes and external systems.

AppJoint also brings in Linkis computing middleware, so that connected external data applications quickly gain Linkis's concurrency rate limiting and user resource control capabilities. The Linkis-based WorkflowContext allows context information to be shared across system nodes, putting an end to application silos.
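One way to picture the four specifications is as four small interfaces that an external system implements. The Python sketch below is a hypothetical rendering for illustration only: the class names, method names and signatures are assumptions and do not reproduce the actual AppJoint interfaces, and the external system is faked in memory so the example runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Dict


class SecuritySpec(ABC):
    @abstractmethod
    def login(self, user: str) -> str:
        """Log the DSS user in to the external system and return a session token."""


class ProjectSpec(ABC):
    @abstractmethod
    def create_project(self, token: str, name: str) -> str:
        """Create the matching project in the external system and return its id."""


class NodeServiceSpec(ABC):
    @abstractmethod
    def create_node(self, token: str, project_id: str, name: str) -> str:
        """Create the external job behind a workflow node and return its id."""


class NodeExecutionSpec(ABC):
    @abstractmethod
    def execute(self, token: str, node_id: str) -> str:
        """Run the external job behind a workflow node and return its status."""


class InMemoryAppJoint(SecuritySpec, ProjectSpec, NodeServiceSpec, NodeExecutionSpec):
    """Fake external system that satisfies all four hypothetical specifications."""

    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}
        self._nodes: Dict[str, str] = {}

    def _check(self, token: str) -> None:
        if token not in self._tokens:
            raise PermissionError("unknown session token")

    def login(self, user: str) -> str:
        token = f"token-{user}"
        self._tokens[token] = user
        return token

    def create_project(self, token: str, name: str) -> str:
        self._check(token)
        return f"project-{name}"

    def create_node(self, token: str, project_id: str, name: str) -> str:
        self._check(token)
        node_id = f"{project_id}/{name}"
        self._nodes[node_id] = "created"
        return node_id

    def execute(self, token: str, node_id: str) -> str:
        self._check(token)
        if node_id not in self._nodes:
            raise KeyError(node_id)
        self._nodes[node_id] = "succeeded"
        return self._nodes[node_id]


joint = InMemoryAppJoint()
token = joint.login("alice")
project_id = joint.create_project(token, "sales")
node_id = joint.create_node(token, project_id, "quality_check")
status = joint.execute(token, node_id)
```

The first two interfaces cover identity and project mapping, and the last two cover node creation and execution, mirroring the split between the one-stop and fully connected concepts described above.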

05

-

DSS integrated data application components

By implementing multiple AppJoints, DSS has integrated a variety of upper-layer web application systems that largely meet users' data development needs.

If necessary, users can also easily integrate new Web application systems to replace or enrich DSS's data application development process.

1. Data development: Scriptis

What is Scriptis? Scriptis is a web-based data analysis tool. It supports writing scripts such as SQL, PySpark and HiveQL online and submitting them to Linkis for execution, and it supports enterprise-level features such as UDFs, functions, resource control and intelligent diagnosis.

The Scriptis AppJoint integrates the data development capabilities of Scriptis into DSS and allows the various Scriptis script types to take part in the application development process as DSS workflow nodes. Supported script node types include HiveSQL, SparkSQL, PySpark and Scala.

2. Data visualization: Visualis

What is Visualis? Visualis is a data visualization BI tool, a customized secondary development based on Davinci, the open source component from CreditEase.

The Visualis AppJoint integrates the data visualization capabilities of Visualis into DSS and allows large-screen displays and dashboards to be linked to upstream data marts as DSS workflow nodes.

3. Scheduling: Azkaban

Users' data applications usually need periodic scheduling. The open source scheduling systems currently available integrate poorly with other upper-layer data application systems and are hard to integrate.

Through the Azkaban AppJoint, DSS lets users publish an orchestrated workflow to Azkaban for scheduling with one click. DSS also defines a standard, general-purpose set of Linkis workflow parsing and publishing specifications for scheduling systems, so that other scheduling systems can connect to DSS easily and at low cost.

4. Data quality: Qualitis

The Qualitis AppJoint integrates data quality verification into DSS. It brings the data quality system into DSS workflow development and verifies the integrity and correctness of data.

5. Data sending: Sender

The Sender AppJoint integrates data sending capability into DSS. It currently supports the SendEmail node type, and the result sets of all other nodes can be sent by email. For example, a SendEmail node can send a Display large-screen dashboard directly as an email.

6. Data signal: Signal

The Signal AppJoint strengthens both the decoupling of and the connection between businesses and processes. It provides the following nodes:

DataChecker: checks whether a database table partition exists.

EventSender: sends messages across workflows and projects.

EventReceiver: receives messages across workflows and projects.

7. Function nodes

Empty nodes and sub-workflow nodes.
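As an illustration of what a DataChecker-style node does, the following Python sketch polls until a table partition exists or a retry limit is reached. Everything here is hypothetical: the `wait_for_partition` function, the in-memory catalog and the table and partition names are assumptions, and a real check would query the actual metadata store.

```python
import time
from typing import Callable, Dict, Set, Tuple


def wait_for_partition(exists: Callable[[], bool], max_checks: int,
                       interval_seconds: float) -> bool:
    """Poll `exists` up to `max_checks` times; return True as soon as it succeeds."""
    for attempt in range(max_checks):
        if exists():
            return True
        if attempt < max_checks - 1:
            time.sleep(interval_seconds)
    return False


# In-memory stand-in for a metadata catalog of (table, partition) pairs.
catalog: Set[Tuple[str, str]] = set()
calls: Dict[str, int] = {"count": 0}


def partition_exists() -> bool:
    calls["count"] += 1
    if calls["count"] == 3:
        # Simulate the upstream job finishing just before the third check.
        catalog.add(("dwd.orders", "ds=2025-02-22"))
    return ("dwd.orders", "ds=2025-02-22") in catalog


ready = wait_for_partition(partition_exists, max_checks=5, interval_seconds=0.01)
```

Downstream nodes would run only when `ready` is true, so a late upstream partition delays the workflow instead of corrupting its output.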

8. Node expansion

As needed, users can simply and quickly replace the functional components already integrated by DSS, or add new ones.
