2025-01-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article covers how the Kafka cluster was optimized on the Mafengwo big data platform. The editor found it very practical and is sharing it here; hopefully you will take something away from it. Without further ado, let's take a look.
Kafka is a popular message queue middleware that processes massive amounts of data in real time. It offers high throughput, low latency, and a reliable asynchronous message delivery mechanism, and it handles data exchange and transfer between different systems well.
Kafka is also widely used at Mafengwo, where it supports many core services. The following focuses on how Kafka is applied on the Mafengwo big data platform: the relevant business scenarios, the problems we encountered at different stages of Kafka adoption, how we solved them, and our plans for the future.
Application scenarios
Kafka's application scenarios on the big data platform fall into three main categories:
The first category uses Kafka as a database, providing real-time data storage for the big data platform. By source and purpose, real-time data can be divided into business-side DB data, monitoring logs, client logs collected through event tracking (H5, Web, app, Mini Program), and server logs.
The second category provides data sources for data analysis. Each event-tracking log serves as a data source that feeds the company's offline data, real-time data warehouse, and analysis systems, including multi-dimensional query, real-time Druid OLAP, and log details.
The third category provides data subscriptions for business teams. Beyond the applications inside the big data platform, we also use Kafka to provide data subscription services for core businesses such as recommendation and search, transportation, hotels, and the content center. Examples include real-time user feature computation, real-time user profile training and real-time recommendation, anti-cheating, and business monitoring and alerting.
Four stages
The early big data platform introduced Kafka as its business log collection and processing system, mainly for its high throughput, low latency, support for multiple subscribers, and data replay, which suit big data scenarios well. However, as business volume grew rapidly, problems surfaced in both business use and system maintenance. The registration and monitoring mechanisms were incomplete, so problems could not be located quickly, and failures of some online real-time tasks led to message backlogs. These issues challenged the stability and availability of the Kafka clusters, which went through several serious failures.
Solving these problems was both urgent and difficult. To address the pain points of using Kafka on the big data platform, we carried out a series of practices ranging from cluster usage to application-layer extension, in four stages overall:
The first stage: version upgrade. Focusing on bottlenecks and problems in the production and consumption of platform data, we evaluated the Kafka versions available at the time and finally settled on version 1.1.1.
The second stage: resource isolation. To support rapid business growth, we built out multiple clusters and improved resource isolation between Topics within a cluster.
The third stage: access control, and monitoring and alerting.
First, in terms of security, the early Kafka clusters ran with no access control at all. Because multiple product lines share Kafka, one business could easily read another business's Topic by mistake, causing data security problems. We therefore added authentication and authorization based on SASL/SCRAM + ACL.
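As a rough illustration of what SASL/SCRAM means on the client side, the sketch below renders the standard Kafka client settings (security.protocol, sasl.mechanism, sasl.jaas.config) for a given user. The helper names, the SCRAM-SHA-256 default, and the SASL_PLAINTEXT default are assumptions made for this example; the article does not say which mechanism or listener protocol was actually used, and the broker-side ACL grants are not shown.

```python
# Hypothetical helpers for illustration only; not taken from the article.
ALLOWED_MECHANISMS = ("SCRAM-SHA-256", "SCRAM-SHA-512")
ALLOWED_PROTOCOLS = ("SASL_PLAINTEXT", "SASL_SSL")


def scram_client_properties(username, password,
                            mechanism="SCRAM-SHA-256",
                            protocol="SASL_PLAINTEXT"):
    """Build Kafka client properties for SASL/SCRAM authentication.

    The defaults are assumptions; pick whatever the cluster is configured for.
    """
    if mechanism not in ALLOWED_MECHANISMS:
        raise ValueError(f"unsupported SCRAM mechanism: {mechanism}")
    if protocol not in ALLOWED_PROTOCOLS:
        raise ValueError(f"unsupported security protocol: {protocol}")
    # Keep the sketch simple: refuse values that would need JAAS escaping.
    if '"' in username or '"' in password:
        raise ValueError("double quotes are not supported in this sketch")

    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "security.protocol": protocol,
        "sasl.mechanism": mechanism,
        "sasl.jaas.config": jaas,
    }


def to_properties_text(props):
    """Render the settings as key=value lines, one per property."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # "team_a_reader" and "change-me" are placeholder credentials.
    print(to_properties_text(scram_client_properties("team_a_reader", "change-me")))
```

With per-user credentials like these, each product line authenticates as its own principal, which is what lets ACLs restrict a principal to its own Topics.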
In terms of monitoring and alerting, Kafka has become the standard input data source for real-time computing, so consumer lag (the message backlog) and throughput have become important indicators of the health of real-time tasks. The big data platform therefore built a unified Kafka monitoring and alerting platform, named "Radar", which monitors Kafka clusters and their users along multiple dimensions.
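To make the lag indicator concrete: for each partition, lag is the log end offset minus the consumer group's committed offset. The sketch below computes it from two plain dictionaries and flags partitions above a threshold. It is a minimal illustration, not a description of how the monitoring platform is implemented; the function names, the threshold, and the choice to treat a missing committed offset as 0 are all assumptions.

```python
from typing import Dict, List, Tuple

# (topic, partition) pair, e.g. ("clicks", 0)
TopicPartition = Tuple[str, int]


def partition_lag(end_offsets: Dict[TopicPartition, int],
                  committed: Dict[TopicPartition, int]) -> Dict[TopicPartition, int]:
    """Return lag per partition: log end offset minus committed offset.

    Assumption for this sketch: a partition with no committed offset is
    treated as committed at 0, which overstates lag if old data has expired.
    """
    lag = {}
    for tp, end in end_offsets.items():
        done = committed.get(tp, 0)
        # Clamp at zero in case the two snapshots were taken at different times.
        lag[tp] = max(end - done, 0)
    return lag


def lag_alerts(lag: Dict[TopicPartition, int], threshold: int) -> List[TopicPartition]:
    """Return the partitions whose lag exceeds the (hypothetical) threshold."""
    return sorted(tp for tp, value in lag.items() if value > threshold)


if __name__ == "__main__":
    # Sample numbers invented for the example.
    end = {("clicks", 0): 120, ("clicks", 1): 80}
    done = {("clicks", 0): 100, ("clicks", 1): 80}
    current = partition_lag(end, done)
    print(current)
    print(lag_alerts(current, threshold=10))
```

In practice the two offset maps would come from the cluster (for example, from a Kafka client or admin tool); they are passed in as plain data here so the calculation stays easy to follow.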
The fourth stage: application extension. When Kafka was first opened up to the company's business lines, there were no unified usage standards, so some business teams used it incorrectly. To solve this pain point, we built a real-time subscription platform. Delivered as an application service, it automates steps such as applying for data production and consumption, user authorization, and user monitoring and alerting, creating a closed loop from demand-side usage to comprehensive management and control of resources.
That is how the Kafka cluster was optimized on the Mafengwo big data platform. The editor believes some of these points may come up in daily work, and hopes you learned something from this article. For more details, please follow the industry information channel.