
How Uber uses Apache Hudi to analyze global networks in near real time

2025-01-16 Update From: SLTechnology News&Howtos (shulou)


This article explains how Uber uses Apache Hudi to analyze the reliability of mobile networks around the world in near real time.

Uber's business is growing rapidly: it covers 600 cities in 60 countries and has handled a total of 10 billion orders.

Almost all customers use Uber through the mobile app, which depends entirely on the mobile network, so network reliability has to be monitored in near real time.

Wireless signal strength varies from place to place.

It also changes over time, as do cellular network quality and the type of network in use.

Analyzing network performance poses several challenges, including a large number of dimensions and a very large volume of data.

A less efficient solution is to recompute all data with batch processing. The overhead is too high because the same data is read and processed again on every run, and the results are updated too slowly.

With incremental processing, only the updates to the data source are processed and the results are updated incrementally, so results are available much sooner.
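The difference between the two approaches can be sketched in plain Python. This is only an illustration, not Uber's code: the per-city mean latency metric, the record layout, and the names `batch_recompute` and `IncrementalMean` are assumptions made for the example.

```python
from collections import defaultdict


def batch_recompute(all_records):
    """Batch style: rescan every record ever seen to rebuild the metric."""
    totals = defaultdict(lambda: [0.0, 0])  # city -> [sum, count]
    for city, latency_ms in all_records:
        totals[city][0] += latency_ms
        totals[city][1] += 1
    return {city: s / n for city, (s, n) in totals.items()}


class IncrementalMean:
    """Incremental style: keep a small state and fold in only new records."""

    def __init__(self):
        self._state = defaultdict(lambda: [0.0, 0])  # city -> [sum, count]

    def update(self, new_records):
        for city, latency_ms in new_records:
            self._state[city][0] += latency_ms
            self._state[city][1] += 1
        return {city: s / n for city, (s, n) in self._state.items()}


# Hypothetical data: two arrivals of (city, latency in ms) records.
first_arrival = [("sf", 100.0), ("nyc", 200.0)]
second_arrival = [("sf", 300.0)]

inc = IncrementalMean()
inc.update(first_arrival)
incremental_result = inc.update(second_arrival)  # touches 1 new record
batch_result = batch_recompute(first_arrival + second_arrival)  # rescans all 3
```

Both paths produce the same answer; the incremental one only reads the records that arrived since the last run, which is where the savings come from as history grows.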

Apache Hudi can be used for the incremental pull.

Hudi brings stream-style processing to big data: only changed data is processed incrementally, which reduces latency and scales better.

Hudi's architecture: Hudi manages files using statistics and exposes different views to different upstream applications, which makes it more general. Database changes are written to Kafka, consumed by Hudi (DeltaStreamer) every few minutes, and then written to a Hudi dataset. The dataset offers three views to upstream applications: the read optimized view, the real-time view, and the incremental view.
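How the three views differ can be illustrated with a toy in-memory model. This is a deliberately simplified sketch, not Hudi's file layout or API: `ToyTable`, the integer commit times, and the signal-strength values are all assumptions for the example. It models compacted data plus a log of not-yet-compacted changes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    # Compacted data: key -> (value, commit_time)
    base: dict = field(default_factory=dict)
    # Changes not yet compacted: list of (commit_time, key, value)
    log: list = field(default_factory=list)

    def upsert(self, commit_time, rows):
        """Record one commit of inserted or updated rows in the log."""
        for key, value in rows.items():
            self.log.append((commit_time, key, value))

    def compact(self):
        """Fold the logged changes into the compacted data."""
        for commit_time, key, value in self.log:
            self.base[key] = (value, commit_time)
        self.log.clear()

    def read_optimized_view(self):
        """Compacted data only: cheap to read, may lag recent commits."""
        return {key: value for key, (value, _) in self.base.items()}

    def real_time_view(self):
        """Compacted data merged with the log: the freshest state."""
        merged = self.read_optimized_view()
        for _, key, value in self.log:
            merged[key] = value
        return merged

    def incremental_view(self, since):
        """Only the rows changed by commits after `since`."""
        changed = {k: v for k, (v, t) in self.base.items() if t > since}
        for t, key, value in self.log:
            if t > since:
                changed[key] = value
        return changed


table = ToyTable()
table.upsert(1, {"cell_a": -70, "cell_b": -90})  # hypothetical signal values
table.compact()
table.upsert(2, {"cell_b": -85, "cell_c": -60})  # not yet compacted

read_optimized = table.read_optimized_view()
real_time = table.real_time_view()
incremental = table.incremental_view(since=1)
```

In this model the read optimized view still shows the state as of commit 1, the real-time view includes commit 2, and the incremental view returns only the two rows that commit 2 touched.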

At Uber, Hudi underpins a data lake of more than 10 PB with 1,000 pipelines/tables, and processes 100 TB of data every day.

Hudi's incremental model runs micro-batch jobs at minute-level intervals, supports upserts (insert or update) into the result set, and supports incremental pulls from changing data sources.
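Upsert semantics across micro-batches can be sketched as follows. This is an assumption-laden illustration rather than Hudi's implementation: the `cell_id` key, the `ts` ordering field, and the "newest record wins" rule are chosen for the example.

```python
def upsert(table, batch, key="cell_id", ordering="ts"):
    """Apply one micro-batch: insert unseen keys, update existing keys
    only when the incoming record is at least as new as the stored one."""
    for record in batch:
        current = table.get(record[key])
        if current is None or record[ordering] >= current[ordering]:
            table[record[key]] = record
    return table


# Hypothetical micro-batches arriving a few minutes apart.
batch_1 = [
    {"cell_id": "a", "ts": 10, "rssi": -80},
    {"cell_id": "b", "ts": 11, "rssi": -95},
]
batch_2 = [
    {"cell_id": "a", "ts": 20, "rssi": -75},  # update to an existing key
    {"cell_id": "b", "ts": 5, "rssi": -99},   # late record, older than stored
]

table = {}
upsert(table, batch_1)
upsert(table, batch_2)
upsert(table, batch_2)  # re-delivering the same batch changes nothing
```

Because each key holds a single row, replaying a batch leaves the table unchanged, which is what makes frequent small runs safe to retry.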

Incremental pipeline and dashboard based on Hudi

The Spark DataSource API or DeltaStreamer can be used to read from the data source and write to the Hudi dataset.
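With the Spark DataSource API, an upsert write and an incremental read are driven mostly by options. The sketch below shows what that might look like from PySpark. The table name, field names, and paths are hypothetical, the option keys are those documented for recent Hudi releases and may differ in older versions, and the two helper functions assume a Spark session started with the Hudi bundle on its classpath.

```python
# Hypothetical table and column names, for illustration only.
HUDI_WRITE_OPTIONS = {
    "hoodie.table.name": "network_metrics",
    "hoodie.datasource.write.operation": "upsert",
    "hoodie.datasource.write.recordkey.field": "cell_id",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.partitionpath.field": "city",
}


def incremental_read_options(begin_instant):
    """Options for pulling only records committed after `begin_instant`."""
    return {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }


def write_batch(df, base_path):
    """Upsert a pyspark.sql.DataFrame into the Hudi dataset at base_path."""
    (df.write.format("hudi")
        .options(**HUDI_WRITE_OPTIONS)
        .mode("append")
        .save(base_path))


def read_changes(spark, base_path, begin_instant):
    """Return a DataFrame of records changed since begin_instant."""
    return (spark.read.format("hudi")
            .options(**incremental_read_options(begin_instant))
            .load(base_path))
```

A downstream job would typically remember the last commit time it processed and pass it as `begin_instant` on its next run.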

Building an incremental pipeline to update network metrics incrementally

After each Hudi incremental pull is processed, the new results are merged with the results of previous runs.

Incrementally updating metrics

The overall pipeline uses two-phase incremental updates. The first phase writes its results to the Sketch table (an intermediate table), and the second phase merges them into the Summary table (the final results table). Both phases involve merging results.

Both the delta sketch and delta summary jobs are implemented with DeltaStreamer, which Hudi provides.
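The two-phase pattern can be sketched in plain Python. The source does not say what the Sketch table actually stores, so this example assumes a mergeable latency histogram per key and a summary made of a count and a median bucket; the function names and data are likewise hypothetical.

```python
from collections import Counter


def build_delta_sketch(records, bucket_ms=50):
    """Phase 1a: fold new raw records into per-key latency histograms."""
    delta = {}
    for key, latency_ms in records:
        bucket = int(latency_ms // bucket_ms) * bucket_ms
        delta.setdefault(key, Counter())[bucket] += 1
    return delta


def merge_sketches(sketch_table, delta):
    """Phase 1b: merge the delta into the Sketch table; return changed keys."""
    for key, hist in delta.items():
        sketch_table.setdefault(key, Counter()).update(hist)  # adds counts
    return set(delta)


def update_summary(summary_table, sketch_table, changed_keys):
    """Phase 2: recompute Summary rows only for keys whose sketch changed."""
    for key in changed_keys:
        hist = sketch_table[key]
        total = sum(hist.values())
        cumulative = 0
        for bucket in sorted(hist):
            cumulative += hist[bucket]
            if 2 * cumulative >= total:  # first bucket reaching the median
                summary_table[key] = {"count": total, "p50_bucket_ms": bucket}
                break


sketch_table, summary_table = {}, {}
batch_1 = [("sf", 120.0), ("sf", 40.0), ("nyc", 210.0)]
batch_2 = [("sf", 130.0)]

for batch in (batch_1, batch_2):
    changed = merge_sketches(sketch_table, build_delta_sketch(batch))
    update_summary(summary_table, sketch_table, changed)
```

The second batch only touches the `sf` key, so only that Summary row is recomputed; the `nyc` row is left as it was after the first batch.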

Hudi in practice at Uber: testing, operations, and monitoring

Incremental pipeline settings for the production environment

Pipeline runtime: with about 100 GB of data per day, the batch update pipeline uses 1,200 cores, while the incremental pipeline uses 150 cores, one eighth as many.

That concludes this overview of how Uber uses Apache Hudi for near real-time analysis of its global network.
