
How to Use Flink CEP in Big Data Development

Updated 2025-03-29. From: SLTechnology News&Howtos (Shulou)


This article explains how to use Flink CEP in big data development. It covers the basic concepts, the Pattern API, pattern detection, and the principle behind the implementation.

In short, CEP follows an input, rule, output model.

Even correlating a single event stream with itself is, in effect, matching against a time series.

Basic concepts

(1) Definition

Complex event processing (CEP) is a technique for analyzing event streams in a dynamic environment, where an event is usually a meaningful change of state. CEP analyzes the relationships between events with techniques such as filtering, correlation, and aggregation, and uses detection rules based on the temporal and aggregation relationships between events. With these rules it continuously queries the event stream for event sequences that satisfy the requirements, and finally derives more complex, composite events.

(2) Characteristics

Goal: discover higher-level features in an ordered stream of simple events.

Input: an event stream made up of one or more kinds of simple events.

Processing: identify the relationships between simple events; multiple simple events that satisfy a rule form a complex event.

Output: the complex events that satisfy the rule.
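The input, rule, output model above can be illustrated without any Flink dependency. The following is a minimal plain-Java sketch, not Flink code: the class FailedLoginRule, the LoginEvent type, and the rule itself (alert after a number of consecutive failed logins by one user) are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: all names and the rule are illustrative, not part of Flink.
class FailedLoginRule {
    /** Input: a simple event, one login attempt by a user. */
    static final class LoginEvent {
        final String user;
        final boolean success;
        LoginEvent(String user, boolean success) {
            this.user = user;
            this.success = success;
        }
    }

    /**
     * Rule: a user reaching `threshold` consecutive failed logins forms one
     * complex event. A successful login resets that user's counter.
     * Output: the list of users for whom the rule fired, in firing order.
     */
    static List<String> detect(List<LoginEvent> stream, int threshold) {
        Map<String, Integer> failures = new HashMap<>();
        List<String> alerts = new ArrayList<>();
        for (LoginEvent e : stream) {
            if (e.success) {
                failures.remove(e.user);
                continue;
            }
            int count = failures.merge(e.user, 1, Integer::sum);
            if (count == threshold) {
                alerts.add(e.user);
                failures.remove(e.user); // start counting again after an alert
            }
        }
        return alerts;
    }

    public static void main(String[] args) {
        List<LoginEvent> stream = Arrays.asList(
                new LoginEvent("alice", false),
                new LoginEvent("alice", false),
                new LoginEvent("alice", false));
        System.out.println(detect(stream, 3));
    }
}
```

In Flink CEP the same idea would be expressed declaratively as a pattern rather than as a hand-written loop; the loop only makes the three roles explicit.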

(3) Functions

CEP is used to analyze low-latency streams of frequently generated events from different sources. It helps find meaningful patterns and complex relationships in complex, seemingly unrelated event streams, so that notifications can be raised or actions taken in real time or near real time.

CEP supports pattern matching on streams. Depending on the conditions of the pattern, matching is either contiguous or non-contiguous. A pattern's conditions may carry a time limit; if the conditions are not satisfied within that limit, the pattern match times out.

This looks simple, but it covers several capabilities:

① Produce results from streaming input as soon as possible.

② Compute time-based aggregations over two event streams.

③ Provide real-time or near-real-time alerts and notifications.

④ Correlate and analyze patterns across multiple data sources.

⑤ Process events with high throughput and low latency.

There are many solutions on the market that can be used for CEP, such as Spark, Samza, and Beam, but they do not provide a dedicated library for it. Flink does provide a dedicated CEP library.

(4) Main components

Flink provides a dedicated library, Flink CEP, for complex event processing.

It contains the following components: the event stream, the pattern definition, pattern detection, and alert generation. The developer first defines the pattern conditions on a DataStream; the Flink CEP engine then detects the pattern and generates alerts where necessary.

Pattern API in CEP

(1) Individual patterns

Each of the simple rules that make up a complex rule is an individual pattern. For example:

start.times(3).where(_.behavior.startsWith("fav"))

(2) Combining patterns (also known as pattern sequences)

Many individual patterns combined together form a complete pattern sequence. A pattern sequence must start with an initial pattern:

val start = Pattern.begin[Event]("start")

(3) Groups of patterns

A pattern sequence can itself be used as the condition of an individual pattern, which forms a group of patterns.

Individual patterns

An individual pattern is either a singleton pattern or a looping pattern. A singleton pattern accepts exactly one event, while a looping pattern can accept multiple events.

(1) Quantifiers

A quantifier can be appended to an individual pattern to specify how many times it should occur.

// matches exactly 4 occurrences
start.times(4)
// matches 0 or 4 occurrences
start.times(4).optional
// matches 2, 3, or 4 occurrences
start.times(2, 4)
// matches 2, 3, or 4 occurrences, repeating as many times as possible
start.times(2, 4).greedy
// matches 1 or more occurrences
start.oneOrMore
// matches 0, 2, or more occurrences, repeating as many times as possible
start.timesOrMore(2).optional.greedy
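The occurrence counts that these quantifiers accept resemble regular-expression quantifiers. The sketch below is only an analogy in plain Java: it uses java.util.regex (not Flink's Pattern class) on a run of identical "a" characters, and it captures the counting aspect only, not greediness or how Flink matches events inside a larger stream. The class and constant names are hypothetical.

```java
import java.util.regex.Pattern;

// Analogy only: java.util.regex.Pattern, not org.apache.flink.cep.pattern.Pattern.
class QuantifierAnalogy {
    static final String TIMES_4 = "a{4}";                     // like times(4)
    static final String TIMES_4_OPTIONAL = "(a{4})?";         // like times(4).optional
    static final String TIMES_2_TO_4 = "a{2,4}";              // like times(2, 4)
    static final String ONE_OR_MORE = "a+";                   // like oneOrMore
    static final String TIMES_OR_MORE_2_OPTIONAL = "(a{2,})?"; // like timesOrMore(2).optional

    /** True if a run of `count` identical events satisfies the quantifier. */
    static boolean accepts(String regex, int count) {
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < count; i++) {
            run.append('a');
        }
        return Pattern.matches(regex, run); // full match of the whole run
    }

    public static void main(String[] args) {
        System.out.println(accepts(TIMES_4_OPTIONAL, 0));
        System.out.println(accepts(TIMES_2_TO_4, 5));
    }
}
```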

(2) Conditions

Every pattern needs a condition that decides whether an incoming event is accepted into the pattern. Individual patterns in CEP specify conditions mainly by calling .where(), .or(), and .until(). Depending on how they are called, conditions fall into the following categories.

① Simple condition: the .where() method tests fields of the event to decide whether to accept it.

start.where(event => event.getName.startsWith("foo"))

② Combining condition: simple conditions are combined. The .or() method expresses a logical OR; chaining .where() calls directly is equivalent to a logical AND.

pattern.where(event => ... /* some condition */).or(event => ... /* or condition */)

③ Stop condition: if oneOrMore or oneOrMore.optional is used, it is recommended to add .until() as a stop condition so that state can be cleaned up.

④ Iterative condition: the condition can take into account all events previously accepted by the pattern. Call .where((value, ctx) => { ... }); inside it you can call ctx.getEventsForPattern("name").
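The logic behind simple, combining, and iterative conditions can be sketched with plain Java predicates. This is not Flink code: the Event type, the predicate names, and the budget of 10.0 are hypothetical, and the BiPredicate's second argument merely stands in for what an iterative condition would read through its context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Predicate;

// Hypothetical sketch of condition logic; none of these names come from Flink.
class ConditionSketch {
    static final class Event {
        final String name;
        final double price;
        Event(String name, double price) {
            this.name = name;
            this.price = price;
        }
    }

    // Simple conditions: each tests fields of a single event.
    static final Predicate<Event> STARTS_WITH_FOO = e -> e.name.startsWith("foo");
    static final Predicate<Event> CHEAP = e -> e.price < 5.0;
    static final Predicate<Event> IS_BAR = e -> e.name.equals("bar");

    // Combining: two chained where() calls act as AND, or() adds an alternative.
    static final Predicate<Event> COMBINED = STARTS_WITH_FOO.and(CHEAP).or(IS_BAR);

    // Iterative: the decision also depends on the events accepted so far.
    static final BiPredicate<Event, List<Event>> UNDER_BUDGET = (e, accepted) -> {
        double sum = e.price;
        for (Event previous : accepted) {
            sum += previous.price;
        }
        return sum <= 10.0;
    };

    /** Returns the events that pass both the combined and the iterative condition. */
    static List<Event> accept(List<Event> stream) {
        List<Event> accepted = new ArrayList<>();
        for (Event e : stream) {
            if (COMBINED.test(e) && UNDER_BUDGET.test(e, accepted)) {
                accepted.add(e);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Event> stream = Arrays.asList(new Event("foo1", 4.0), new Event("bar", 3.0));
        System.out.println(accept(stream).size());
    }
}
```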

Pattern sequence

(1) Strict contiguity

All events must appear in strict order with no non-matching events in between, specified by .next(). For example, for the pattern "a next b", the event sequence [a, c, b1, b2] produces no match.

(2) Relaxed contiguity

Non-matching events are allowed in between, specified by .followedBy(). For example, for the pattern "a followedBy b", the event sequence [a, c, b1, b2] matches as {a, b1}.

(3) Non-deterministic relaxed contiguity

The condition is relaxed further: an event that has already been matched can be used again, specified by .followedByAny(). For example, for the pattern "a followedByAny b", the event sequence [a, c, b1, b2] matches as {a, b1} and {a, b2}.

In addition to the sequences above, you can also declare that a certain contiguity must not occur:

.notNext(): an event must not directly follow the previous event.

.notFollowedBy(): an event must not occur anywhere between two other events.

Note that:

All pattern sequences must start with .begin ()

Pattern sequence cannot end with .notFollowedBy ()

Patterns of type "not" cannot be modified by optional
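The difference between the three contiguity modes can be checked with a small simulation. The following plain-Java sketch is an assumption-laden simplification, not Flink's matching engine: it handles only a two-step pattern "a then b", treats any event whose name starts with "a" or "b" as matching the respective step, and ignores time limits and skip strategies. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified simulation of contiguity for the two-step pattern "a then b".
class ContiguitySketch {
    static boolean isA(String e) { return e.startsWith("a"); }
    static boolean isB(String e) { return e.startsWith("b"); }

    /** Strict contiguity: b must be the very next event after a. */
    static List<List<String>> next(List<String> events) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i + 1 < events.size(); i++) {
            if (isA(events.get(i)) && isB(events.get(i + 1))) {
                out.add(Arrays.asList(events.get(i), events.get(i + 1)));
            }
        }
        return out;
    }

    /** Relaxed contiguity: skip non-matching events, take the first b after a. */
    static List<List<String>> followedBy(List<String> events) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < events.size(); i++) {
            if (!isA(events.get(i))) continue;
            for (int j = i + 1; j < events.size(); j++) {
                if (isB(events.get(j))) {
                    out.add(Arrays.asList(events.get(i), events.get(j)));
                    break; // only the first matching b
                }
            }
        }
        return out;
    }

    /** Non-deterministic relaxed contiguity: every later b pairs with a. */
    static List<List<String>> followedByAny(List<String> events) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < events.size(); i++) {
            if (!isA(events.get(i))) continue;
            for (int j = i + 1; j < events.size(); j++) {
                if (isB(events.get(j))) {
                    out.add(Arrays.asList(events.get(i), events.get(j)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a", "c", "b1", "b2");
        System.out.println(next(events));          // prints []
        System.out.println(followedBy(events));    // prints [[a, b1]]
        System.out.println(followedByAny(events)); // prints [[a, b1], [a, b2]]
    }
}
```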

You can specify a time constraint for a pattern, which requires a match to complete within the given time:

next.within(Time.seconds(10))

Pattern detection

Once the pattern sequence to look for has been defined, it can be applied to an input stream to detect potential matches. Calling CEP.pattern() with an input stream and a pattern returns a PatternStream.

val input: DataStream[Event] = ...
val pattern: Pattern[Event, _] = ...
val patternStream: PatternStream[Event] = CEP.pattern(input, pattern)

Extraction of matching events

After you create the PatternStream, you can use the select or flatSelect method to extract events from each detected event sequence. The select() method takes a select function as its argument, and that function is called once for every successfully matched event sequence. It receives the match as a Map[String, Iterable[IN]], where each key is the name of a pattern and each value is an Iterable of all events accepted by that pattern.

def selectFn(pattern: Map[String, Iterable[IN]]): OUT = {
  // head takes the first event accepted by each named pattern
  // (Iterable has no next method, so the original .get.next would not compile)
  val startEvent = pattern.get("start").get.head
  val endEvent = pattern.get("end").get.head
  OUT(startEvent, endEvent)
}

flatSelect works like select but is implemented through PatternFlatSelectFunction. The only difference is that flatSelect can return multiple records per match; it passes output downstream through a parameter of type Collector[OUT].
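The one-result versus many-results distinction between select and flatSelect can be shown in plain Java. This sketch does not use Flink's PatternSelectFunction or PatternFlatSelectFunction: a Function stands in for the select function and a Consumer stands in for the Collector, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-ins for select and flatSelect; not the Flink API.
class SelectSketch {
    /** select: exactly one output per match (pattern name -> accepted events). */
    static <OUT> List<OUT> select(
            List<Map<String, List<String>>> matches,
            Function<Map<String, List<String>>, OUT> fn) {
        List<OUT> out = new ArrayList<>();
        for (Map<String, List<String>> match : matches) {
            out.add(fn.apply(match));
        }
        return out;
    }

    /** flatSelect: zero or more outputs per match, emitted through a collector. */
    static <OUT> List<OUT> flatSelect(
            List<Map<String, List<String>>> matches,
            BiConsumer<Map<String, List<String>>, Consumer<OUT>> fn) {
        List<OUT> out = new ArrayList<>();
        for (Map<String, List<String>> match : matches) {
            fn.accept(match, out::add); // out::add plays the role of the Collector
        }
        return out;
    }
}
```

With a match whose "end" pattern accepted two events, a select function has to pick or merge them into a single result, while a flatSelect function can emit one record for each.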

Extraction of timeout events

When a pattern sets a detection window with the within keyword, partially matched event sequences may be discarded because they exceed the window length. To handle these timed-out partial matches, the select and flatSelect API calls let you specify a timeout handler.
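The idea of separating complete matches from timed-out partial matches can be sketched in plain Java. This is a simplified model, not Flink's implementation: it assumes events arrive ordered by timestamp, uses the timestamp of each incoming event as the current time, and leaves orders that are still pending at the end of the input undecided. The "create then pay" scenario, the OrderEvent type, and all other names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of "create followedBy pay within a window"; not Flink code.
class TimeoutSketch {
    static final class OrderEvent {
        final String orderId;
        final String type; // "create" or "pay"
        final long ts;     // event time in milliseconds
        OrderEvent(String orderId, String type, long ts) {
            this.orderId = orderId;
            this.type = type;
            this.ts = ts;
        }
    }

    static final class Result {
        final List<String> matched = new ArrayList<>();  // main output
        final List<String> timedOut = new ArrayList<>(); // plays the side-output role
    }

    /** Assumes `events` is sorted by timestamp. */
    static Result detect(List<OrderEvent> events, long windowMs) {
        Map<String, Long> pending = new LinkedHashMap<>(); // orderId -> create time
        Result result = new Result();
        for (OrderEvent e : events) {
            // Expire partial matches whose window elapsed before this event.
            Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Long> p = it.next();
                if (e.ts - p.getValue() > windowMs) {
                    result.timedOut.add(p.getKey());
                    it.remove();
                }
            }
            if (e.type.equals("create")) {
                pending.put(e.orderId, e.ts);
            } else if (e.type.equals("pay") && pending.remove(e.orderId) != null) {
                result.matched.add(e.orderId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = detect(Arrays.asList(
                new OrderEvent("o1", "create", 0L),
                new OrderEvent("o1", "pay", 5_000L)), 900_000L);
        System.out.println(r.matched);
    }
}
```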

Flink CEP development process:

The data from the DataSource is converted to a DataStream.

A Pattern is defined, and the DataStream and Pattern are combined into a PatternStream.

The PatternStream is converted back to a DataStream by operators such as select and process.

The resulting DataStream is processed further and then written through a sink to the target store.

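Select method:

// Generic parameters were lost in extraction; they are restored here on the
// assumption that the input and both output types are PayEvent.
SingleOutputStreamOperator<PayEvent> result = patternStream.select(
    orderTimeoutOutput,
    new PatternTimeoutFunction<PayEvent, PayEvent>() {
        @Override
        public PayEvent timeout(Map<String, List<PayEvent>> map, long l) throws Exception {
            // Timed-out partial match: return the event accepted by "begin"
            return map.get("begin").get(0);
        }
    },
    new PatternSelectFunction<PayEvent, PayEvent>() {
        @Override
        public PayEvent select(Map<String, List<PayEvent>> map) throws Exception {
            // Complete match: return the event accepted by "pay"
            return map.get("pay").get(0);
        }
    });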

This overload of select does the following, according to its Javadoc:

It applies a select function to each detected pattern sequence. For every pattern sequence, the provided PatternSelectFunction is called. The pattern select function can produce exactly one result element.

It applies a timeout function to each partial pattern sequence that has timed out. For every such partial pattern sequence, the provided PatternTimeoutFunction is called. The pattern timeout function can produce exactly one result element.

The stream of timed-out data can be obtained by calling getSideOutput(OutputTag) on the SingleOutputStreamOperator returned by the select operation, using the same OutputTag that was passed to select.

Parameters:

timedOutPartialMatchesTag: the OutputTag that identifies the side output for timed-out patterns.

patternTimeoutFunction: the pattern timeout function, called for each partial pattern sequence that has timed out.

patternSelectFunction: the pattern select function, called for each detected pattern sequence.

Type parameters: the type of the timeout elements produced, and the type of the result elements.

Returns: a DataStream that contains the result elements, with the timeout elements emitted to a side output.

DataStream<PayEvent> sideOutput = result.getSideOutput(orderTimeoutOutput);

getSideOutput returns a DataStream containing the elements that the operation emitted to the side output identified by the given OutputTag.

Flink CEP development process

The data from the DataSource is converted to a DataStream, with watermarks assigned and keyBy applied.

A Pattern is defined, and the DataStream and Pattern are combined into a PatternStream.

The PatternStream is converted back to a DataStream by operators such as select and process.

The resulting DataStream is processed further and then written through a sink to the target store.

The main principles of CEP implementation

At run time, Flink CEP converts the user's pattern logic into an NFA graph (an nfa object), where NFA stands for nondeterministic finite automaton. A finite state machine works by starting from an initial state and automatically transitioning between states in response to different inputs.

The state machine in the figure, which has two states S1 and S2, detects whether a binary number contains an even number of zeros. The only inputs are 1 and 0. Starting in state S1, only an input of 0 causes a transition to S2; likewise, in state S2 only an input of 0 causes a transition back to S1. Therefore, once the whole binary number has been read, if the machine ends in an accepting state, that is, it stops in S1, the number contains an even number of zeros.
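The even-zeros state machine described above is small enough to write out directly. The following plain-Java sketch is a deterministic machine with the two states S1 and S2 from the text; it only illustrates state transitions and is not how Flink CEP builds or runs its NFA. The class name EvenZeros is hypothetical.

```java
// Two-state machine: S1 = even number of zeros seen so far (accepting), S2 = odd.
class EvenZeros {
    enum State { S1, S2 }

    /** Returns true if `bits` contains an even number of '0' characters. */
    static boolean accepts(String bits) {
        State state = State.S1; // start state
        for (char c : bits.toCharArray()) {
            if (c == '0') {
                // Only a 0 changes the state; it toggles between S1 and S2.
                state = (state == State.S1) ? State.S2 : State.S1;
            } else if (c != '1') {
                throw new IllegalArgumentException("not a binary digit: " + c);
            }
            // A 1 leaves the state unchanged.
        }
        return state == State.S1; // accept only if the machine stops in S1
    }

    public static void main(String[] args) {
        System.out.println(accepts("1001")); // prints true (two zeros)
        System.out.println(accepts("10"));   // prints false (one zero)
    }
}
```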

