In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article shares with you the content of a sample analysis of time in Flink. The editor thinks it is very practical, so share it with you as a reference and follow the editor to have a look.
First, the time supported by fink
The streaming application of Flink supports different views of time.
1, processing time
Processing time refers to the system time of the machine performing the corresponding operation.
When the stream runs with processing time, all time-based operations, such as time windows, use the system clock of the machine running their respective operators. For example, the hourly processing time window will include all records that arrive between specific operations between the time displayed by the system clock for one hour.
Processing time is the simplest concept of time and does not require coordination between the flow and the machine. It provides the best performance and the lowest latency. However, in both distributed and asynchronous environments, processing time does not provide determinism because it is vulnerable to the speed at which records arrive at the system, such as from message queues, and also related to the speed of flow between operators recorded within the system.
2, event time
The event time is the time that each event occurs on its production equipment. This time is usually embedded in the event they enter the fink and the event timestamp can be extracted from the event. The hourly event time window will contain all events that contain the event timestamp to that time, regardless of when the events arrive and the order in which they arrive.
Event time gives the correct results, even in disordered events, lagging events, or playback data from backups or persistent logs. With event time, the progress of time depends on the data, not the clock on the wall. The event time program must specify how the event time Watermarks is generated, which is the mechanism that signals within the event time. The mechanism is described below.
Event time processing usually produces a certain delay because it has the characteristic of waiting for a specific time for later events and unordered events. Therefore, inter-event-based programs are often combined with processing time operations.
3, injection time
The injection time is the time when the event enters the flink. Each event in the Sources operator takes the current time of the Sources as a timestamp, and time-based operations (such as windows) depend on this timestamp.
The injection time is conceptually between the event time and the processing time. It consumes slightly more performance than processing time, but provides predictable results. Because the injection time uses a fixed timestamp (allocated once at Sources), different window operations use the same time, while using processing time each window operation may be assigned to a different time window of the message (based on local system time).
Compared to the event time, the injection time program cannot handle any time-free or lag data, but the program does not need to specify how to generate the watermark.
Internally, the injection time and event time are very similar, but the injection time has the functions of automatic timestamp allocation and automatic watermark generation.
Second, set the time characteristic
The first part of a flink stream program is often to set the basic time characteristics. This setting determines how the Sources header of the stream operates (such as whether to assign a timestamp) while confirming the window operation (such as KeyedStream.timeWindow (Time.seconds (30)).) How to use the concept of time.
The following flink program shows the aggregation of events in an hour window. The behavior of the window adapts to the time characteristics.
Final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment ()
Env.setStreamTimeCharacteristic (TimeCharacteristic.ProcessingTime)
/ / alternatively:
/ / env.setStreamTimeCharacteristic (TimeCharacteristic.IngestionTime)
/ / env.setStreamTimeCharacteristic (TimeCharacteristic.EventTime)
DataStream stream = env.addSource (new FlinkKafkaConsumer09 (topic, schema, props))
Stream
.keyBy ((event)-> event.getUser ())
.timeWindow (Time.hours (1))
Reduce ((a, b)-> a.add (b))
.addSink (...)
Note that in order to run this example using the event time, the program uses Sources to directly define the event time of the data and determine the watermark, or the program must inject a Timestamp Assigner & Watermark Generator after the Sources. These functions mainly describe how to use the event timestamp and the degree of disorder shown by the event flow.
The following sections describe the general mechanisms for timestamping and watermark. To guide the use of timestamp allocation and Flink watermark generation in data flow API, a later article will describe it.
Third, event time and watermark
A stream processor that supports event time needs a way to measure the progress of time. For example, the operation of an hour window windows needs to be notified when the event time has exceeded an hour, so that the operator can close the window in progress.
The event time can advance independently of the processing time. For example, in a program, the current event time of the operator may lag slightly behind the processing time (caused by the delay in receiving the event), and both proceed at the same speed. On the other hand, another stream program may take only a few seconds to process a few weeks of event time, by quickly processing some historical data that has been cached in a kafka topic (or another message queue).
Watermark is used in Flink to measure the progress of event time. The Watermark stream, as part of the data flow, carries a timestamp t. A Watermark (t) declares that the event time has arrived t, which means that there is no event time T1
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.