What Does a ClickHouse-Based User Behavior Big Data Architecture Look Like?

2025-01-16 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article describes a ClickHouse-based big data architecture for user behavior analytics, walking through the pipeline in some detail. Readers interested in the topic may find it a useful reference.

Behavior data is collected through event-tracking SDKs embedded in each terminal: iOS, Android, Web, H5, and WeChat Mini Programs. Each terminal uses an SDK built for its platform in that platform's mainstream language, and the collected events are submitted to the server-side API as JSON via HTTP POST.
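To make the collection step concrete, here is a minimal Python sketch of how a terminal might assemble and submit an event. The field names (`event`, `distinct_id`, `properties`) and the endpoint are assumptions for illustration, not the actual SDK wire format:

```python
import json
import time
import urllib.request

# Hypothetical event schema: "event", "distinct_id", "time", "properties"
# are illustrative field names, not the actual SDK wire format.
def build_event(event_name, distinct_id, properties):
    """Assemble one tracking event as a plain dict."""
    return {
        "event": event_name,
        "distinct_id": distinct_id,
        "time": int(time.time() * 1000),  # client timestamp in ms
        "properties": properties,
    }

def post_event(api_url, event):
    """Submit the event to the server-side API as a JSON HTTP POST body."""
    body = json.dumps(event).encode("utf-8")
    req = urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)

event = build_event("page_view", "user-42", {"page": "/home", "platform": "Web"})
payload = json.dumps(event)
# post_event("http://example.com/track", event)  # network call, not run here
```

In production, real SDKs typically also batch events and retry on failure; this sketch shows only the JSON-over-POST contract the article describes.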

The server-side API is a data access layer built on Nginx, which receives the data sent to the API and writes it to log files. Using Nginx here provides high reliability and horizontal scalability.
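One way this access layer might be wired is sketched below as a hedged Nginx config fragment; the log path, port, and log format name are all assumptions, since the article does not give the actual configuration. Note the subtlety that `$request_body` is only populated after Nginx has read the body, e.g. when the request is proxied:

```nginx
# Illustrative only: accept POSTed events on /track and append the raw JSON
# body to a log file for Flume to tail. All names and paths are assumptions.
log_format event_json escape=json '$request_body';

server {
    listen 80;

    location /track {
        access_log /data/logs/track.log event_json;
        # $request_body is filled in only once the body has been read,
        # e.g. by proxying to a trivial internal backend
        proxy_pass http://127.0.0.1:8081/collect;
    }
}
```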

Flume then picks up the logs that Nginx writes to disk: its Source module reads the Nginx log in real time, its Channel module buffers and processes the data, and its Sink module publishes the result to Kafka.
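A Flume agent matching this Source → Channel → Sink flow could look roughly like the fragment below; the file path, broker address, and topic name are assumptions:

```properties
# Sketch of a Flume agent for the flow above; paths and names are assumptions.
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# Source: tail the Nginx event log in real time
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /data/logs/track.log
a1.sources.r1.channels = c1

# Channel: in-memory buffer between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Sink: publish each log line to Kafka
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = kafka01:9092
a1.sinks.k1.kafka.topic = events_raw
a1.sinks.k1.channel = c1
```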

Kafka is a widely used, highly available distributed message queue. Here it acts both as a buffer between data access and data processing and as a backup of recent data.

During Flume processing, test traffic is identified by its version number and routed to the test branch (topic) of Kafka. From this branch, the JSON behavior logs are written to MySQL so that developers can verify their events during development and debugging, with no impact on the online business.
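The routing decision can be sketched in a few lines of Python. The article only says that test data is identified by its version number; the `-test` suffix convention and topic names below are assumptions for illustration:

```python
# Hypothetical routing rule: test builds are assumed to carry a version
# string ending in "-test"; this convention is illustrative, not from the
# article, which only says routing is based on the version number.
def choose_topic(event, prod_topic="events_prod", test_topic="events_test"):
    """Route one event to the test or production Kafka topic."""
    version = str(event.get("properties", {}).get("app_version", ""))
    return test_topic if version.endswith("-test") else prod_topic
```

Events with a test version land on the test topic; everything else, including events without a version field, defaults to production.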

Production data identified in Flume is written to the production branch of Kafka. Downstream, Flink performs the necessary ETL and real-time dimension joins on the Kafka data to produce standard detail data, which is written back to Kafka for other downstream consumers. Flink then writes this detail data as a wide table into both ClickHouse and Hive: ClickHouse serves as the core of query and analysis, while Hive provides backup and data quality assurance.
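The dimension-join step can be illustrated with a plain-Python sketch of the transformation Flink applies per event; the dimension table and its fields are hypothetical, and real Flink jobs would use its DataStream or SQL APIs rather than a dict lookup:

```python
# Plain-Python sketch of the ETL + real-time dimension join that Flink
# performs; the dimension table and its fields are illustrative assumptions.
USER_DIM = {
    "user-42": {"city": "Beijing", "channel": "organic"},
}

def enrich(event, user_dim=USER_DIM):
    """Join a raw event with its user dimension row into a wide detail record."""
    detail = dict(event)  # keep all original event fields
    detail.update(user_dim.get(event.get("distinct_id"), {}))
    return detail
```

The resulting wide records are what get written into the ClickHouse and Hive wide tables; flattening dimensions into the fact row at write time is what lets ClickHouse answer analytical queries without joins.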

That concludes this overview of the ClickHouse-based user behavior big data architecture. Hopefully the material above has been helpful; if you found the article useful, feel free to share it with others.
