
Some Technical Practices of Druid at Xiaomi

Updated 2025-04-27. From: SLTechnology News&Howtos


Introduction: As open-source software for real-time big data analysis, Druid has steadily gained popularity and a good reputation in the technical community since its release. It has become a key part of many technical teams' solutions and earned a place in the technology stacks of many companies.

This article presents practical cases and experience from Xiaomi's technical team, to give readers a more comprehensive and in-depth understanding of Druid and to help them learn this young technology more efficiently.

This article is excerpted from "Principles and Practice of Real-Time Big Data Analysis with Druid".

Founded in April 2010, Xiaomi is an innovative technology company focused on high-end smartphones, Internet TVs, and building a smart-home ecosystem.

"Let everyone enjoy the fun of technology" is Xiaomi's vision. Xiaomi develops products with an Internet-driven approach, builds them with a craftsman's attention to quality, uses the Internet model to cut out intermediaries, and is committed to letting everyone in the world enjoy high-quality technology products from China.

Schematic diagram of technical architecture of Xiaomi Dayun platform

In the data analysis layer, Druid collects massive volumes of event data in real time and supports fast business analysis, and it is used in many scenarios. This article introduces some of the technical practices of Druid in the Xiaomi Statistics product and the Xiaomi advertising platform.

Scenario 1: Xiaomi statistics service

Xiaomi Statistics is a mobile application analytics service that Xiaomi provides for app developers. It uses data to help developers understand how their application is growing, how effective each promotion channel is, and how engaged users are, so that they can optimize the user experience and operations and keep improving their products. The entry point for Xiaomi Statistics is tongji.xiaomi.com, and the service interface is as follows.

Xiaomi Statistics Service Interface

The key requirement of real-time data analysis has gone through several technical stages as the product developed. These stages are not mutually exclusive; each applies to different scenarios and periods.

The first stage: data is stored in Hadoop and analyzed and processed with MapReduce scripts. Some complex jobs run daily, and the results are finally written to an RDBMS such as MySQL.

The second stage: as the business grew, MySQL quickly became a bottleneck, for two reasons. First, changing the database schema is costly: new business requirements constantly need new columns and tables, and the process is cumbersome and requires schema design. Second, under heavy write load, the increased database load degrades read performance and occasionally causes deadlocks. To solve these problems, HBase was introduced as the main storage database, and its column families make it easy to add data columns. HBase also offers higher availability than MySQL.

The third stage: to make the data more real-time, Storm was later added for distributed stream computation, and it makes many kinds of complex data processing easy to carry out. However, every aggregation and processing step has to be implemented in code, and adding a single data dimension is a relatively large change that requires modifications all the way from upstream to downstream. The advantages of this approach are good reliability, strong data processing capability, and room for optimization from many angles.

The fourth stage: many Xiaomi Statistics queries select a few indicators and filter conditions, and many scenarios resemble a traditional data warehouse, so Druid was introduced to handle the real-time query scenarios of standard reports. The data stream passes through Kafka and then Tranquility before entering the Druid cluster. The Druid cluster ultimately provides query capability over the most recent day of data and lets users access it directly.
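To make the "indicators plus filter conditions" pattern concrete, here is a minimal sketch of a Druid native timeseries query built as a Python dictionary. The data source, dimension, and metric names (app_events, channel, event_count) are hypothetical and do not come from Xiaomi's deployment; only the JSON field names follow Druid's native query format.

```python
import json


def build_timeseries_query(data_source, interval, channel):
    """Build a Druid native timeseries query as a plain dict.

    All names passed in here are illustrative placeholders.
    """
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "hour",
        "intervals": [interval],
        # Filter condition: keep only rows for one (hypothetical) channel.
        "filter": {"type": "selector", "dimension": "channel", "value": channel},
        # Indicator: sum a (hypothetical) pre-aggregated event counter.
        "aggregations": [
            {"type": "longSum", "name": "events", "fieldName": "event_count"}
        ],
    }


query = build_timeseries_query(
    "app_events", "2016-01-01T00:00:00Z/2016-01-02T00:00:00Z", "store_a"
)
print(json.dumps(query, indent=2))
```

A JSON body like this is typically sent in an HTTP POST to a Druid broker's /druid/v2 endpoint.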

Xiaomi Statistics data flow

As a real-time analytical database, Druid strengthens the real-time data analysis capability of Xiaomi's big data platform and its commercial product department.

Scenario 2: real-time data analysis of advertising platform

Druid itself originated in the advertising business, and the Xiaomi advertising platform also uses it for real-time data analysis, helping to analyze changes across various dimensions online as they happen. Uses include real-time monitoring and analysis of online deployments, querying A/B test results, and some fine-grained data analysis.

Advertising data is processed in two ways: one is a real-time data stream handled by Druid, mainly for internal real-time analysis needs; the other is mini-batch processing.

The data sources (DataSource) include:

Xiaomi Ad Exchange (MAX): a scheduling and management platform for advertising traffic.

The billing analysis module of the advertising platform: advertiser billing data across various dimensions.

Advertising media analysis data: requests, impressions, and other data for each advertising medium.

Take the advertising billing analysis module as an example: Druid holds real-time advertiser billing information, which is used for internal data analysis only and not for the advertiser-facing delivery platform. That platform uses mini-batch processing to update and aggregate results in a replayable way.

Some problems also came up while using Druid.

1. About the query interface

Druid's query language is not particularly friendly. After the first-phase deployment, we developed a set of Druid query interfaces that mainly served business needs, and the initial results were good. However, as the number of data sources grew, each new data source required developing additional interfaces, adding dimensions, and modifying front-end projects, so efficiency was low. Later we tried the Pivot tool, which is easy to use, and it gradually replaced the custom query interfaces.
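One way to reduce the per-data-source work described above is to drive query construction from parameters instead of writing a dedicated interface each time. The sketch below is an assumption about how such a wrapper might look, not the interface the Xiaomi team built; the function name and the example data source, dimension, and metric names are hypothetical, while the JSON fields follow Druid's native groupBy query format.

```python
def build_group_by_query(data_source, dimensions, metrics, interval,
                         granularity="day"):
    """Build a Druid native groupBy query from simple parameters.

    `metrics` maps an output name to the stored metric column to sum.
    Adding a data source then means passing new arguments, not new code.
    """
    if not metrics:
        raise ValueError("at least one metric is required")
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": granularity,
        "intervals": [interval],
        "dimensions": list(dimensions),
        "aggregations": [
            {"type": "doubleSum", "name": name, "fieldName": column}
            for name, column in metrics.items()
        ],
    }


# Hypothetical usage for a billing data source.
billing_query = build_group_by_query(
    "ad_billing", ["advertiser"], {"total_cost": "cost"},
    "2016-01-01/2016-01-02",
)
```

A general-purpose tool such as Pivot takes this idea further by discovering dimensions and metrics from the data source itself.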

2. About query efficiency

Druid performs well most of the time, but queries over a long time range make the system very slow. To solve this, frequently queried data sources can be split into two: one aggregated at the minute level, with data kept for 10 days, and one aggregated at the hour level, with data kept for 2 years. The hour-level aggregation runs every night, avoiding the cluster's high-load periods. The relationship between aggregation granularity and query efficiency is as follows.

The relationship between aggregation granularity and query efficiency
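With the two-tier layout described above, something has to decide which data source a given query should hit. The following is a minimal sketch of such routing. The retention periods come from the text (2 years is approximated as 730 days); the "_minute" and "_hour" suffixes and the one-day threshold for "long-range" queries are assumptions made for illustration. The benefit comes from rollup: one hour-level row can replace up to 60 minute-level rows for the same dimension values.

```python
from datetime import datetime, timedelta

MINUTE_RETENTION = timedelta(days=10)   # minute-level data kept 10 days
HOUR_RETENTION = timedelta(days=730)    # hour-level data kept about 2 years


def choose_data_source(base_name, start, end, now,
                       max_minute_span=timedelta(days=1)):
    """Pick the minute- or hour-level data source for a query range.

    `max_minute_span` is an assumed threshold: longer ranges go to the
    coarser hour-level data even if minute-level data is still retained.
    """
    if end <= start:
        raise ValueError("end must be after start")
    if start < now - HOUR_RETENTION:
        raise ValueError("range starts before the retained data")
    if start >= now - MINUTE_RETENTION and end - start <= max_minute_span:
        return base_name + "_minute"
    return base_name + "_hour"


now = datetime(2016, 6, 1)
recent = choose_data_source("ads", now - timedelta(hours=2), now, now)
long_range = choose_data_source("ads", now - timedelta(days=90), now, now)
print(recent, long_range)  # ads_minute ads_hour
```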

3. Deployment situation

The Druid cluster handles nearly 10 billion events every day on nearly 10 machines. The number of indexing services equals the number of historical nodes, and machines are added as the event volume grows. When the incoming data surges at a particular moment, building index files consumes a great deal of CPU, which sometimes affects normal query performance.
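For a rough sense of scale, the figures above can be turned into an average ingestion rate. This is back-of-the-envelope arithmetic only: both inputs are rounded up from "nearly", and it assumes load is spread evenly over the day and over machines, which real traffic is not, so peaks will be considerably higher.

```python
EVENTS_PER_DAY = 10_000_000_000  # "nearly 10 billion" from the text, rounded up
MACHINES = 10                    # "nearly 10 machines" from the text, rounded up
SECONDS_PER_DAY = 24 * 60 * 60

avg_events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
avg_per_machine = avg_events_per_second / MACHINES

print(f"{avg_events_per_second:,.0f} events/s cluster-wide")  # 115,741 events/s cluster-wide
print(f"{avg_per_machine:,.0f} events/s per machine")         # 11,574 events/s per machine
```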

In the first phase we tried flow control at the service layer, but later abandoned it. The reason is that data expires after one hour, so any data that cannot enter the system within that time may be lost. We therefore still try to let all data into Druid, even though this occasionally puts peak pressure on the system.
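The interaction between flow control and the one-hour expiration can be sketched as follows. This is a simplified model for illustration, not the actual acceptance logic of Tranquility or Druid: it assumes an event is accepted only if it arrives within a fixed window after its own timestamp. The function and variable names are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)  # the one-hour expiration mentioned in the text


def is_accepted(event_time, arrival_time, window=WINDOW):
    """Return True if the event arrives within `window` of its timestamp."""
    return arrival_time - event_time <= window


event_time = datetime(2016, 6, 1, 12, 0)
on_time = event_time + timedelta(minutes=5)     # delivered without throttling
throttled = event_time + timedelta(minutes=90)  # held back by flow control

print(is_accepted(event_time, on_time))    # True
print(is_accepted(event_time, throttled))  # False
```

Under this model, any backlog that flow control holds for more than the window is dropped rather than merely delayed, which is why letting the data in and absorbing the peak is the safer choice.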

The Druid-based architecture and data flow are as follows.

Druid-based architecture and data flow

As with other technologies, the best way to master Druid is to practice. Start practicing as soon as you have a basic understanding of Druid, apply it to real work as early as possible, and let the problems you meet in practice drive your study and understanding of it.

