
What are the common usage scenarios of message queuing in big data

2025-04-04 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

Today I will talk about the common usage scenarios of message queues in big data systems. Many people are not familiar with this topic, so to help you understand it better, I have summarized the key points below. I hope you get something out of this article.

I. Brief Introduction

Message queue middleware is an important component of distributed systems. It mainly solves problems such as application coupling, asynchronous messaging, and traffic peak shaving, helping to achieve a high-performance, highly available, scalable, and eventually consistent architecture. The most frequently used message queues are ActiveMQ, RabbitMQ, ZeroMQ, Kafka, MetaMQ, and RocketMQ.

II. Message Queue Application Scenarios

The following describes the most common scenarios for message queues in practice: asynchronous processing, application decoupling, traffic peak shaving, log processing, and message communication.

1. Asynchronous processing

Scenario description: after registering, a user needs to be sent a registration email and a registration SMS. There are two traditional approaches: serial and parallel.

Serial mode: after the registration information is successfully written to the database, send the registration email, then send the registration SMS. Only after all three tasks are complete is a response returned to the client.

Parallel mode: after the registration information is successfully written to the database, the registration email and the registration SMS are sent at the same time. After the tasks complete, a response is returned to the client. The difference from serial mode is that the parallel approach reduces the total processing time.

Assuming each of the three steps takes 50 milliseconds and ignoring other overhead such as the network, the serial approach takes 150 milliseconds in total, while the parallel approach takes about 100 milliseconds.

Because a CPU can process only a certain number of requests per unit time, suppose its throughput is 100 operations per second. Then in serial mode it can handle about 7 requests per second (1000 / 150 ≈ 6.7), while in parallel mode it can handle 10 requests per second (1000 / 100).

Summary: as the case above shows, traditional system performance (concurrency, throughput, response time) eventually hits a bottleneck. How do we solve this problem?

Introduce a message queue and make the non-essential business logic asynchronous. The modified architecture is as follows:

With this arrangement, the user's response time equals the time to write the registration information to the database, that is, about 50 milliseconds. The email and SMS tasks are written to the message queue and the call returns directly; writing to the message queue is very fast and can be ignored, so the user's response time may be just 50 milliseconds. After the architecture change, the system throughput rises to 20 QPS: about three times the serial mode and twice the parallel mode.
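The idea above can be sketched in-process with Python's `queue.Queue` standing in for a real broker; the function names and the 50 ms sleeps are illustrative stand-ins for the database write and the notification calls, not a real implementation:

```python
import queue
import threading
import time

def send_email(user):
    time.sleep(0.05)  # simulate a 50 ms email call

def send_sms(user):
    time.sleep(0.05)  # simulate a 50 ms SMS call

tasks = queue.Queue()  # stands in for the message queue

def worker():
    # Background consumer: drains the queue and runs the slow tasks.
    while True:
        job, user = tasks.get()
        job(user)
        tasks.task_done()

threading.Thread(target=worker, daemon=True).start()

def register(user):
    # 1. Persist the registration (the only synchronous step, ~50 ms).
    time.sleep(0.05)
    # 2. Enqueue the slow notifications and return immediately.
    tasks.put((send_email, user))
    tasks.put((send_sms, user))
    return "ok"

start = time.time()
register("alice")
elapsed = time.time() - start  # ~0.05 s rather than ~0.15 s
tasks.join()                   # wait for the background work to finish
```

The caller only pays for the database write; the email and SMS happen after the response has already been returned.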

2. Application decoupling

Scenario description: after a user places an order, the order system needs to notify the inventory system. Traditionally, the order system calls the inventory system's interface directly, as shown below:

Disadvantages of the traditional model:

If the inventory system is unreachable, the stock deduction fails, which in turn causes the order to fail; the order system and the inventory system are tightly coupled.

How do we solve these problems? The solution after introducing a message queue is shown below:

Order system: after the user places an order, the order system completes its persistence processing, writes a message to the message queue, and immediately returns success for the order to the user.

Inventory system: subscribes to the order messages, obtains order information via pull or push, and performs the inventory operations according to that information.

Even if the inventory system is not working properly when the order is placed, the order can still go through normally: once the order is placed, the order system writes to the message queue and no longer cares about subsequent operations. This decouples the order system from the inventory system.
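The decoupling can be sketched with the same in-process stand-in for a broker; the SKU names and data structures here are hypothetical, and a real system would use durable middleware instead of `queue.Queue`:

```python
import queue
import threading

order_queue = queue.Queue()          # stands in for the broker topic

orders_persisted = []
inventory = {"sku-1": 10}

def place_order(sku, qty):
    # Order system: persist, publish, return at once -- it never
    # calls the inventory system directly.
    orders_persisted.append((sku, qty))
    order_queue.put({"sku": sku, "qty": qty})
    return "order accepted"

def inventory_worker():
    # Inventory system: subscribes to order messages and deducts stock.
    while True:
        msg = order_queue.get()
        inventory[msg["sku"]] -= msg["qty"]
        order_queue.task_done()

threading.Thread(target=inventory_worker, daemon=True).start()

place_order("sku-1", 2)
order_queue.join()                   # let the consumer catch up
print(inventory["sku-1"])            # stock deducted asynchronously
```

If the inventory worker were down, the order would still be accepted; the message would simply wait in the queue until a consumer came back.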

3. Traffic peak shaving

Traffic peak shaving is also a common message-queue scenario, widely used in flash-sale and group-buying activities.

Application scenario: in a flash sale, traffic surges so sharply that the application can crash under the load. To solve this problem, a message queue is generally placed at the front end of the application.

This controls the number of requests admitted to the activity and relieves the crush of high traffic on the application in a short period.

After receiving a user's request, the server first writes it to the message queue. If the queue length exceeds the maximum, the request is discarded directly or the user is redirected to an error page.

The flash-sale service then does its subsequent processing based on the requests in the message queue, at its own pace.
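A bounded queue captures the whole idea; this sketch uses Python's `queue.Queue(maxsize=...)` as a stand-in for the broker, with an assumed capacity of 100 and hypothetical response strings:

```python
import queue

MAX_PENDING = 100                      # assumed flash-sale queue capacity
requests = queue.Queue(maxsize=MAX_PENDING)

def accept_request(user_id):
    """Front end: enqueue if there is room, otherwise reject at once."""
    try:
        requests.put_nowait(user_id)
        return "queued"
    except queue.Full:
        return "sold out"              # or redirect to an error page

# 150 users hit the endpoint; only the first 100 are queued, the
# remaining 50 are turned away instead of overloading the backend.
results = [accept_request(i) for i in range(150)]
print(results.count("queued"), results.count("sold out"))
```

The backend then consumes from `requests` at whatever rate it can sustain; the spike is absorbed by the queue, not the application.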

4. Log processing

Log processing refers to using message queues in a log pipeline, for example with Kafka, to solve the problem of transporting massive volumes of logs. The simplified architecture is as follows:

The log collection client is responsible for collecting log data and periodically writing it to the Kafka queue; the Kafka message queue is responsible for receiving, storing, and forwarding the log data; log processing applications subscribe to and consume the log data in the Kafka queue.

The following is Sina's Kafka log processing pipeline as an example:

Kafka: the message queue that receives the user logs.

Logstash: parses the logs and outputs them uniformly as JSON to Elasticsearch.

Elasticsearch: the core of the real-time log analysis service; a schemaless, real-time data store that organizes data by index and provides powerful search and aggregation functions.

Kibana: a data visualization component built on Elasticsearch; its strong visualization capability is an important reason many companies choose the ELK stack.
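The collect/parse/index flow can be sketched without any external services; here a `queue.Queue` stands in for the Kafka topic, a small parser plays the Logstash role, and a plain list plays the Elasticsearch index. The log line format is an assumption for illustration:

```python
import json
import queue

log_queue = queue.Queue()   # stands in for the Kafka topic

# Collection client: pushes raw log lines into the queue.
log_queue.put("2024-06-01 12:00:00 INFO user=alice action=login")

def parse(line):
    # "Logstash" step: turn a raw line into a uniform JSON document.
    ts_date, ts_time, level, *kvs = line.split()
    doc = {"timestamp": f"{ts_date} {ts_time}", "level": level}
    doc.update(kv.split("=", 1) for kv in kvs)
    return json.dumps(doc)

# "Elasticsearch" step: a list standing in for an index.
index = []
while not log_queue.empty():
    index.append(json.loads(parse(log_queue.get())))

print(index[0])
```

In a real deployment the queue is a durable Kafka topic, so the parser and the index can lag, restart, or be scaled out without losing log lines.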

5. Message communication

Message communication means that message queues generally have efficient communication mechanisms built in, so they can also be used for pure messaging, such as point-to-point message queues or chat rooms.

Point-to-point communication:

Client A and client B use the same queue for message communication.

Chat-room communication:

Clients A, B, ..., N subscribe to the same topic, publishing and receiving messages on it, which achieves a chat-room-like effect.

These are in fact the two messaging models of a message queue: point-to-point and publish-subscribe. The model diagrams are schematic, for reference only.
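A minimal in-process broker makes the publish-subscribe model concrete; the `Broker` class, topic name, and inbox lists here are all illustrative, not any real middleware API:

```python
from collections import defaultdict

class Broker:
    """Tiny in-process broker demonstrating publish-subscribe."""
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Publish-subscribe: every subscriber on the topic gets a copy.
        for cb in self.subscribers[topic]:
            cb(message)

broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("chat-room", inbox_a.append)
broker.subscribe("chat-room", inbox_b.append)
broker.publish("chat-room", "hello everyone")
print(inbox_a, inbox_b)   # both clients receive the same message
```

Point-to-point differs only in delivery semantics: with a shared queue (for example a single `queue.Queue`), each message is consumed by exactly one of the competing clients rather than broadcast to all of them.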

III. Message Middleware Examples

1. E-commerce system

The message queue should use highly available, persistent message middleware, such as ActiveMQ, RabbitMQ, or RocketMQ.

After the application completes its main business logic, it writes to the message queue. To ensure the message is sent successfully, the queue's confirmation mode can be enabled: the application only returns after the message queue acknowledges receipt of the message, thus guaranteeing message integrity.

The extended processes (SMS, delivery handling) subscribe to the queue's messages, obtaining them via push or pull and processing them.

While messages decouple the applications, they also introduce a data consistency problem, which can be solved with eventual consistency: the master data is written to the database, and the extended applications carry out the subsequent processing based on the message queue, combined with the database, in a way that converges to a consistent state.
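Confirmation mode can be sketched as "resend until the broker acknowledges"; the `FlakyBroker` stub below is entirely hypothetical (it deterministically fails the first two sends to stand in for transient network errors), and real middleware exposes this through its own publisher-confirm or ack settings:

```python
class FlakyBroker:
    """Broker stub whose send() fails a few times, to show confirm mode."""
    def __init__(self, fail_first=2):
        self.fail_first = fail_first   # simulated transient failures
        self.messages = []

    def send(self, msg):
        if self.fail_first > 0:
            self.fail_first -= 1
            return False               # no acknowledgement received
        self.messages.append(msg)
        return True                    # broker acknowledged receipt

def publish_with_confirm(broker, msg, retries=5):
    # Keep resending until the broker acknowledges, so the message is
    # not silently lost between the application and the queue.
    for _ in range(retries):
        if broker.send(msg):
            return True
    return False                       # surface the failure to the caller

broker = FlakyBroker()
ok = publish_with_confirm(broker, {"order_id": 1})
print(ok, broker.messages)
```

Note that retries can deliver the same message more than once, which is why consumers in such systems are usually made idempotent.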

2. Log collection system

It consists of four parts: the Zookeeper registry, the log collection client, the Kafka cluster, and the Storm cluster (OtherApp).

Zookeeper registry: provides load balancing and address lookup services.

Log collection client: collects log data from the application systems and pushes it to the Kafka queue.

Kafka cluster: handles receiving, routing, storing, forwarding, and other message processing.

Storm cluster: sits at the same level as OtherApp and consumes the data in the queue via pull.

After reading the above, do you have a better understanding of the common usage scenarios of message queues in big data? If you want to learn more, please follow our industry information channel. Thank you for your support.
