
How to understand Amundsen, the data governance framework with 1.9K stars on GitHub

2025-01-16 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article explains how to understand Amundsen, a data governance framework with 1.9K stars on GitHub. It covers the project's purpose, architecture, main features, and integrations.

Amundsen's mission is to organize all information about data and make it universally actionable.

This sentence comes from the official Amundsen website. Managing metadata is complex and tedious. Many tools are available, but each has its own strengths: Apache Atlas is one of the better choices for data lineage, and Apache Superset for data visualization. The industry has long needed something that integrates these functions to make data governance easier, and that is the mission Amundsen has set for itself.

Amundsen is similar to Apache Atlas and LinkedIn's DataHub. Its main goal is to improve the productivity of data analysts, data scientists, and data engineers. It indexes data resources and offers ranked search over them on a web page, with results ordered by usage so that frequently queried tables appear earlier. You can think of it as a search engine whose subject is metadata. The project is named after the Norwegian explorer Roald Amundsen, the first person to reach the South Pole.

Amundsen is hosted by the LF AI & Data Foundation, the Linux Foundation's umbrella foundation that supports open source innovation in artificial intelligence, machine learning, deep learning, and data.

At the time of writing, Amundsen has 1.9K stars on GitHub but has not yet published a release, and the project is growing rapidly.

Architecture

The following figure shows the overall architecture of Amundsen.

The Databuilder ingestion framework extracts metadata from data sources such as Hive and Presto and writes it to Elasticsearch and Neo4j. The search service and the metadata service then serve that metadata to the front end.
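The ingestion flow described above can be sketched in a few lines. This is a minimal illustration only, not Amundsen's Databuilder API: `TableMetadata`, `extract`, and `publish` are hypothetical names, and plain dictionaries stand in for Neo4j and Elasticsearch.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """One extracted record: a table and its columns (hypothetical shape)."""
    database: str
    name: str
    description: str
    columns: list = field(default_factory=list)

    @property
    def key(self) -> str:
        # Hypothetical key format used to identify a table in both stores.
        return f"{self.database}://{self.name}"


def extract(source_catalog: dict) -> list:
    """Stand-in for an extractor that reads a source such as Hive or Presto."""
    return [
        TableMetadata(db, name, info["description"], list(info["columns"]))
        for db, tables in source_catalog.items()
        for name, info in tables.items()
    ]


def publish(records, graph_store: dict, search_index: dict) -> None:
    """Write every record to both stores, mirroring the dual write above."""
    for rec in records:
        # Graph side (Neo4j in Amundsen): the table and its columns.
        graph_store[rec.key] = {
            "description": rec.description,
            "columns": rec.columns,
        }
        # Search side (Elasticsearch in Amundsen): one flat text document.
        text = f"{rec.name} {rec.description} {' '.join(rec.columns)}"
        search_index[rec.key] = text.lower()


# Assumed sample catalog; real extractors would query the source system.
catalog = {
    "hive": {
        "orders": {"description": "Customer orders",
                   "columns": ["order_id", "amount"]},
    },
    "presto": {
        "users": {"description": "Registered users",
                  "columns": ["user_id", "email"]},
    },
}
graph_store, search_index = {}, {}
publish(extract(catalog), graph_store, search_index)
```

Writing the same record to two stores is what lets the metadata service answer structural questions while the search service answers free-text queries.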

The main modules are as follows:

Front-end service

The front-end service is the web page that users interact with. It is a Flask-based web application whose pages are built with React.

Search service

The search service is backed by Elasticsearch (or Apache Atlas) and exposes a RESTful API.
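To make the idea of ranked metadata search concrete, here is a small in-memory sketch. It does not use Elasticsearch or Amundsen's actual API. The `search_tables` function, the document fields, and the scoring rule (number of matched terms first, then a usage count) are all assumptions chosen for illustration.

```python
def search_tables(index: dict, query: str, limit: int = 10) -> list:
    """Return table keys matching the query, best match first."""
    terms = query.lower().split()
    scored = []
    for key, doc in index.items():
        text = f"{doc['name']} {doc['description']}".lower()
        matches = sum(1 for term in terms if term in text)
        if matches:
            # Assumed ranking: more matched terms wins, then higher usage.
            scored.append((matches, doc["usage"], key))
    scored.sort(reverse=True)
    return [key for _, _, key in scored[:limit]]


# Hypothetical indexed documents with a usage counter per table.
index = {
    "hive://orders": {
        "name": "orders", "description": "Customer orders", "usage": 120},
    "hive://orders_archive": {
        "name": "orders_archive", "description": "Old customer orders", "usage": 3},
    "presto://users": {
        "name": "users", "description": "Registered users", "usage": 50},
}

results = search_tables(index, "orders")
```

In this sketch the heavily used `orders` table is listed before its rarely used archive, which is the behavior a usage-based ranking is meant to produce.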

Metadata service

The metadata service currently uses the Neo4j graph database as its backing store.
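A graph database suits metadata because tables, columns, and owners are naturally nodes connected by relationships. The sketch below models that idea with plain Python structures. The `MetadataGraph` class, the node keys, and the relation names are hypothetical and do not reproduce Amundsen's actual Neo4j schema.

```python
from collections import defaultdict


class MetadataGraph:
    """Tiny in-memory property graph: nodes plus typed, directed edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)  # (source key, relation) -> targets

    def add_node(self, key: str, **props) -> None:
        self.nodes[key] = props

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        self.edges[(src, relation)].append(dst)

    def neighbors(self, src: str, relation: str) -> list:
        # Use .get so a lookup never creates an empty entry as a side effect.
        return list(self.edges.get((src, relation), []))


graph = MetadataGraph()
graph.add_node("hive://orders", kind="table", description="Customer orders")
graph.add_node("hive://orders/order_id", kind="column", type="bigint")
graph.add_node("hive://orders/amount", kind="column", type="double")
graph.add_node("alice@example.com", kind="user")

# Assumed relation names for the illustration.
graph.add_edge("hive://orders", "COLUMN", "hive://orders/order_id")
graph.add_edge("hive://orders", "COLUMN", "hive://orders/amount")
graph.add_edge("hive://orders", "OWNER", "alice@example.com")

columns = graph.neighbors("hive://orders", "COLUMN")
owners = graph.neighbors("hive://orders", "OWNER")
```

A question such as "which columns does this table have, and who owns it" becomes a single hop along typed edges.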

Features

Amundsen provides features such as search, recommendation, table descriptions, and data preview. Data lineage is under development.

Some of these features are shown below:

Landing page: Amundsen landing page

Search preview: viewing search results

Table detail page: details of tables from sources such as Hive

Column details: mainly statistics for some columns

Data Preview Page: visualization of table data previews, which can be integrated with Apache Superset or other data visualization tools.

Integration

Amundsen supports a large number of data sources.

Apache Druid, Apache Hive, CSV, Oracle, MySQL, Delta Lake, and others.

Amundsen can also connect to any database that provides a Python DB-API or SQLAlchemy interface.
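The reason a DB-API interface is enough is that every DB-API driver exposes the same connect, cursor, execute, and fetch pattern. The sketch below uses the standard library's `sqlite3` module, which follows DB-API 2.0, to pull table and column metadata. It is not Amundsen code: `extract_table_metadata` is a hypothetical helper, and the catalog queries are specific to SQLite, so another database would need its own catalog query.

```python
import sqlite3


def extract_table_metadata(conn) -> dict:
    """Map each table name to a list of (column name, declared type)."""
    cur = conn.cursor()
    # SQLite-specific catalog query; other databases expose catalogs differently.
    cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    tables = [row[0] for row in cur.fetchall()]

    metadata = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cur.execute(f'PRAGMA table_info("{table}")')
        metadata[table] = [(row[1], row[2]) for row in cur.fetchall()]
    return metadata


# An in-memory database stands in for a real source system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT)")
metadata = extract_table_metadata(conn)
conn.close()
```

Only the two SQL strings depend on the database; the surrounding cursor calls would look the same with any other DB-API driver.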

Amundsen also integrates with dashboard tools such as Redash and Tableau.

ETL tool integration: Apache Airflow.

BI visualization tool integration: Apache Superset.

That is how to understand Amundsen, the data governance framework with 1.9K stars on GitHub. If you have had similar questions, the analysis above may help answer them.
