What is the design idea of Storm?

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article introduces the design idea of Storm: what real-time computing is, how Storm compares with Hadoop, and the abstractions (stream, tuple, spout, bolt, and topology) that Storm is built on.

Overview of Real-time Computing

Unlike traditional offline batch processing, which operates on a whole set of data at once, real-time processing operates on a single piece of data (one record) at a time as it arrives, and keeps a running summary of the results (for example, the total of all statistics so far).

Real-time computing compared with offline computing:

Bounded: the data that offline computing operates on is bounded. Whether it is 1 GB, 1 TB, 1 PB, or 1 EB, the data has a boundary, so the computation is necessarily bounded as well.

Unbounded: the data that real-time computing operates on is like a continuous stream of water with no boundary, and unbounded data necessarily leads to unbounded computation.
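To make the contrast concrete, here is a minimal sketch in plain Python (not Storm or Flink code). The names `batch_total`, `running_totals`, and `sensor_readings` are made up for illustration, and the endless counter is only an assumed stand-in for a real data source.

```python
import itertools
from typing import Iterable, Iterator, List


def batch_total(records: List[int]) -> int:
    """Bounded: the whole dataset exists up front, so one pass gives a final answer."""
    return sum(records)


def running_totals(stream: Iterable[int]) -> Iterator[int]:
    """Unbounded: handle one record at a time and emit the summary so far."""
    total = 0
    for record in stream:
        total += record
        yield total


def sensor_readings() -> Iterator[int]:
    """Hypothetical endless source: 1, 2, 3, ... with no last element."""
    return itertools.count(1)


if __name__ == "__main__":
    # The batch job finishes with a single result.
    print(batch_total([1, 2, 3, 4]))
    # The streaming computation never finishes on its own; we only look at a prefix.
    print(list(itertools.islice(running_totals(sensor_readings()), 4)))
```

The batch function returns once, while the streaming function yields a new summary for every record and would keep going for as long as the source keeps producing.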

The description on Flink's official website:

First, 2 types of datasets

Unbounded: Infinite datasets that are appended to continuously

Bounded: Finite, unchanging datasets

Second, 2 types of execution models

Streaming: Processing that executes continuously as long as data is being produced

Batch: Processing that is executed and runs to completeness in a finite amount of time, releasing computing resources when finished

The 6 major problems big data deals with

3 major computing centers: offline batch processing quasi-real-time stream computing

3 major computing engines: user interactive computing engine (SQL/ES), graph computing engine, machine learning computing engine

Storm

Apache Storm is an open source real-time data processing framework that plays a role for real-time data similar to the one Hadoop plays for batch data. It was originally developed by BackType; BackType was later acquired by Twitter, and Storm became Twitter's real-time data analysis system.

Storm can process high-frequency, large-scale data in real time.

According to the official website, a single Storm node can process 1 million 100-byte messages per second (on an Intel E5645 @ 2.4 GHz CPU with 24 GB of memory). In other words, a single node processes roughly 95 MB of data per second.
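The 95 MB figure follows from the two numbers quoted above. A small Python check, assuming that "MB" here means 2**20 bytes (the function name is made up for illustration):

```python
MESSAGES_PER_SECOND = 1_000_000  # figure quoted above
MESSAGE_SIZE_BYTES = 100         # figure quoted above
BYTES_PER_MIB = 1024 * 1024      # assumption: "MB" is read as 2**20 bytes


def throughput_mib_per_second(messages: int, size_bytes: int) -> float:
    """Convert a message rate and a message size into MiB per second."""
    return messages * size_bytes / BYTES_PER_MIB


if __name__ == "__main__":
    rate = throughput_mib_per_second(MESSAGES_PER_SECOND, MESSAGE_SIZE_BYTES)
    print(f"{rate:.1f} MiB/s")
```

With decimal megabytes (10**6 bytes) the same numbers give exactly 100 MB per second, so the quoted 95 MB only matches the binary reading.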

Official website:

Comparison between Storm and Hadoop

Data source

Hadoop processes TB-scale historical data stored on HDFS, while Storm processes each new piece of data in real time as it arrives.

Processing flow

A Hadoop job is divided into a Map stage followed by a Reduce stage. A Storm process is user-defined and can contain multiple steps, each of which is either a data source (Spout) or processing logic (Bolt).

Termination

A Hadoop job eventually ends; a Storm process does not. After the last step it simply waits there, and runs through the steps again when new data comes in.

Processing speed

Hadoop is designed to process TB-scale data on HDFS, so its processing is slow. Storm only has to process one new piece of data at a time, so it can respond quickly.

Applicable scenario

Hadoop is used for batch data where timeliness is not a concern, while Storm is used for newly arriving data where timeliness matters.

The Design idea of Storm

Storm's core abstraction is the stream: an uninterrupted, unbounded, continuous sequence of tuples. Note that when modeling an event flow, Storm represents each event in the stream as a tuple.

Storm represents each element in a stream as a Tuple. A tuple is a list of values in which each value has a name. A value can be a primitive type, a string, a byte array, or any other serializable type.
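As an illustration of "a list of values in which each value has a name", here is a minimal Python sketch. `StreamTuple` and its methods are hypothetical names chosen for this example; they are not Storm's API.

```python
from typing import Any, List, Sequence


class StreamTuple:
    """Illustrative stand-in for a Storm tuple: an ordered list of named values."""

    def __init__(self, fields: Sequence[str], values: Sequence[Any]) -> None:
        if len(fields) != len(values):
            raise ValueError("each value needs exactly one field name")
        self.fields: List[str] = list(fields)
        self.values: List[Any] = list(values)

    def get_by_field(self, name: str) -> Any:
        """Look a value up by its field name."""
        return self.values[self.fields.index(name)]

    def get(self, position: int) -> Any:
        """Look a value up by its position in the list."""
        return self.values[position]


if __name__ == "__main__":
    # Values may be primitives, strings, byte arrays, or other serializable objects.
    event = StreamTuple(["user", "clicks", "payload"], ["alice", 3, b"\x00\x01"])
    print(event.get_by_field("clicks"), event.get(0))
```

Keeping the field names separate from the values lets a consumer ask for a value by name without knowing its position in the list.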

Storm holds that every stream has a source that produces the original tuples, and it calls this source a Spout.

Once there is a source (a spout), there is a stream. So how are the tuples in the stream processed? Storm calls the component that transforms a stream a Bolt. A bolt can consume any number of input streams, as long as those streams are directed to it, and it can also emit new streams for other bolts to consume. So you only need to open a specific spout and direct the tuples flowing out of it to a specific bolt; the bolt processes the incoming stream and then directs its output to other bolts or to a destination.
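The spout-to-bolt flow can be sketched with Python generators. This is a toy model under stated assumptions, not Storm code: the function names are invented, tuples are modeled as plain dicts keyed by field name, and the spout replays a fixed list so that the example terminates, whereas a real spout would keep emitting.

```python
from typing import Dict, Iterable, Iterator, List


def sentence_spout(sentences: List[str]) -> Iterator[Dict[str, str]]:
    """Spout: the source of the stream (here a finite replay of a list)."""
    for sentence in sentences:
        yield {"sentence": sentence}


def split_bolt(stream: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Bolt: consumes the sentence stream and emits a new stream of words."""
    for tup in stream:
        for word in tup["sentence"].split():
            yield {"word": word}


def count_bolt(stream: Iterable[Dict[str, str]]) -> Iterator[Dict[str, object]]:
    """Bolt: consumes the word stream and emits the running count per word."""
    counts: Dict[str, int] = {}
    for tup in stream:
        word = tup["word"]
        counts[word] = counts.get(word, 0) + 1
        yield {"word": word, "count": counts[word]}


if __name__ == "__main__":
    # Direct the spout into one bolt, and that bolt's output into the next.
    pipeline = count_bolt(split_bolt(sentence_spout(["a b a", "b a"])))
    for tup in pipeline:
        print(tup)
```

Each bolt only sees the stream directed to it and emits a new stream, so steps can be chained without any step knowing what comes after it.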

The whole process above is called a Topology. The topology is the highest-level abstraction in Storm, and it is what gets submitted to a Storm cluster for execution. A topology is a graph of stream transformations in which each node is a spout or a bolt, and each edge indicates which stream a bolt subscribes to. When a spout or bolt emits a tuple to a stream, the tuple is sent to every bolt that subscribes to that stream. This means we do not need to lay the pipes by hand; we only declare the subscriptions in advance, and the stream is delivered to the appropriate bolts.

Each node in the topology declares the field names of the tuples it emits, and other nodes only need to subscribe to that node to receive and process those tuples.
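The topology idea (nodes, subscriptions as edges, declared output fields) can be sketched as a small in-process graph. Everything here is hypothetical and single-threaded: the `Topology` class and its `add_spout`, `add_bolt`, and `emit` methods are invented for this example and do not mirror Storm's API or its distributed execution.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A bolt's logic: takes one tuple (a dict of field name -> value) and
# returns the list of tuples it emits in response.
Process = Callable[[dict], List[dict]]


class Topology:
    """Toy stand-in for a topology: nodes plus subscription edges."""

    def __init__(self) -> None:
        self.output_fields: Dict[str, List[str]] = {}
        self.process: Dict[str, Process] = {}
        self.subscribers: Dict[str, List[str]] = defaultdict(list)

    def add_spout(self, name: str, output_fields: List[str]) -> None:
        """Register a source node and the field names it declares."""
        self.output_fields[name] = output_fields

    def add_bolt(self, name: str, process: Process,
                 output_fields: List[str], subscribe_to: List[str]) -> None:
        """Register a bolt and draw an edge from each node it subscribes to."""
        self.output_fields[name] = output_fields
        self.process[name] = process
        for source in subscribe_to:
            self.subscribers[source].append(name)

    def emit(self, source: str, tup: dict) -> None:
        """Deliver a tuple emitted by `source` to every bolt subscribed to it."""
        if sorted(tup) != sorted(self.output_fields[source]):
            raise ValueError(f"{source} emitted undeclared fields: {sorted(tup)}")
        for bolt in self.subscribers[source]:
            for out in self.process[bolt](tup):
                self.emit(bolt, out)


if __name__ == "__main__":
    seen: List[dict] = []

    def sink(tup: dict) -> List[dict]:
        seen.append(tup)  # terminal bolt: records what it receives, emits nothing
        return []

    topo = Topology()
    topo.add_spout("sentences", ["sentence"])
    topo.add_bolt("split", lambda t: [{"word": w} for w in t["sentence"].split()],
                  ["word"], ["sentences"])
    topo.add_bolt("shout", lambda t: [{"word": t["word"].upper()}],
                  ["word"], ["split"])
    topo.add_bolt("sink", sink, [], ["split", "shout"])
    topo.emit("sentences", {"sentence": "hello storm"})
    print(seen)
```

The routing is driven entirely by the subscriptions declared in `add_bolt`: the `split` bolt never names its consumers, yet both `shout` and `sink` receive every word it emits.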

