2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces the basics of how to use Flink. Many people run into difficulties when working through real cases, so this article walks through the core concepts needed to handle them. Read it carefully and you should come away with a working understanding.
What is Flink?
Flink is a framework and distributed processing engine for stateful computations over bounded (finite) and unbounded (infinite) data streams.
Processing framework
Figure 1 shows the Flink software stack. At its core is a distributed dataflow engine that executes dataflow programs. At runtime, a Flink program is a directed acyclic graph (DAG) of data streams connected by stateful operators. On top of this engine, Flink provides the DataSet API for bounded data streams and the DataStream API for unbounded data streams.
As shown in figure 2, a Flink cluster has three roles: the client, the JobManager, and the TaskManager. The client converts the data processing program into a DAG and submits it to the JobManager. The JobManager coordinates the execution of the program and tracks the state of each operator for fault recovery. A TaskManager receives the tasks to be deployed from the JobManager and is responsible for actually executing the data processing program. Each TaskManager executes one or more operators to process the data flow and reports its status to the JobManager.
An operator here is an independent data processing step. Commonly used operators include map, flatMap, keyBy, sum, apply, reduce, and window. The difference between map and flatMap is that map is a one-to-one mapping, with each input producing exactly one output, while flatMap is a one-to-many mapping, with each input producing zero or more outputs.
In essence, then, a Flink program combines multiple operators into a directed acyclic graph. Once this is understood, Flink programs are not difficult to follow. Here is a simple example:
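The one-to-one versus one-to-many distinction can be sketched in plain Python. This is only an illustration of the semantics, not the Flink API; the helper names `map_op` and `flat_map_op` are hypothetical.

```python
def map_op(fn, records):
    """One-to-one: each input record yields exactly one output record."""
    for record in records:
        yield fn(record)


def flat_map_op(fn, records):
    """One-to-many: fn returns an iterable, so each input yields zero or more outputs."""
    for record in records:
        yield from fn(record)


lines = ["to be", "", "or not"]

# map: three inputs, three outputs (the length of each line)
lengths = list(map_op(len, lines))

# flatMap: three inputs, four outputs (the empty line contributes nothing)
words = list(flat_map_op(str.split, lines))

print(lengths)  # [5, 0, 6]
print(words)    # ['to', 'be', 'or', 'not']
```

Note that the empty line still produces an output under map, but produces none under flatMap.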
Simple example
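As a minimal sketch of how operators chain into a dataflow, the plain-Python snippet below mimics a word count built from flatMap, keyBy, and sum steps. It does not use the Flink API, it works on a bounded input only, and all function names are hypothetical.

```python
from collections import defaultdict


def flat_map_words(lines):
    """flatMap step: split each line into (word, 1) pairs."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def key_by(pairs, key_index=0):
    """keyBy step: group records that share the same key."""
    groups = defaultdict(list)
    for pair in pairs:
        groups[pair[key_index]].append(pair)
    return groups


def sum_field(groups, field_index=1):
    """sum step: add up one field within each key group."""
    return {
        key: sum(pair[field_index] for pair in pairs)
        for key, pairs in groups.items()
    }


def word_count(lines):
    """Chain the three steps: source -> flatMap -> keyBy -> sum."""
    return sum_field(key_by(flat_map_words(lines)))


counts = word_count(["to be or not to be", "to flink"])
print(counts)  # {'to': 3, 'be': 2, 'or': 1, 'not': 1, 'flink': 1}
```

Each function plays the role of one node in the graph, and the output of one feeds the next. On an unbounded stream the sum step would instead emit an updated count as each record arrives.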
Time
Flink defines three notions of time: Event Time, Ingestion Time, and Processing Time.
Processing Time, as its name implies, is the system time of the machine that processes a received event. Because it requires no time coordination between data streams and processing machines, it has the lowest latency. However, in distributed and asynchronous environments, Processing Time cannot provide determinism, because it is affected by the speed at which events arrive in the Flink system, the speed at which events flow between operators inside the system, and outages.
Event Time is the time at which an event occurred, which generally means the timestamp carried by the data itself. An Event Time program must specify how to generate Event Time watermarks, which are the mechanism that signals the progress of Event Time. Ideally, Event Time processing produces completely consistent and deterministic results no matter when events arrive or in what order. In practice, unless events are known to arrive in timestamp order, Event Time processing incurs some latency while it waits for out-of-order events. Because a Flink program can only wait for a finite period of time, there is a limit to how deterministic Event Time processing can be.
Ingestion Time is the time at which an event enters the Flink system. An Ingestion Time program cannot handle out-of-order events or late data, but it does not have to specify how to generate watermarks. Inside Flink, Ingestion Time is treated much like Event Time, except that timestamps are assigned and watermarks are generated automatically.
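The trade-off between waiting for out-of-order events and bounding latency can be illustrated with a simple watermark rule: the watermark trails the largest timestamp seen so far by a fixed allowance, and an event whose timestamp is at or below the current watermark is treated as late. The plain-Python sketch below assumes this rule; it is not Flink's implementation, and `track_watermarks` and `max_out_of_orderness` are hypothetical names.

```python
def track_watermarks(event_timestamps, max_out_of_orderness):
    """Return (timestamp, watermark_after_event, is_late) for each event.

    Assumed rule: watermark = largest timestamp seen so far - max_out_of_orderness.
    An event is late if its timestamp is <= the watermark in effect when it arrives.
    """
    max_seen = None
    results = []
    for ts in event_timestamps:
        watermark_before = (
            None if max_seen is None else max_seen - max_out_of_orderness
        )
        is_late = watermark_before is not None and ts <= watermark_before
        max_seen = ts if max_seen is None else max(max_seen, ts)
        results.append((ts, max_seen - max_out_of_orderness, is_late))
    return results


rows = track_watermarks([10, 12, 11, 20, 14, 21], max_out_of_orderness=5)
for row in rows:
    print(row)
```

In this run the event with timestamp 11 arrives out of order but within the allowance, so it is still accepted. The event with timestamp 14 arrives after the watermark has advanced to 15, so it is flagged as late. A larger allowance would accept it at the cost of more latency.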
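A rough sketch of why Ingestion Time needs no user-defined watermark logic: if every record is stamped from a non-decreasing clock when it enters the system, timestamps never go backwards, so the latest stamp can itself serve as the watermark. The plain-Python code below is an illustration under that assumption, not Flink's implementation; `assign_ingestion_time` and the injected `clock` are hypothetical.

```python
import itertools


def assign_ingestion_time(records, clock):
    """Stamp each record with the clock reading taken when it enters the system.

    Assumes the clock never goes backwards, so the stamp doubles as the watermark.
    """
    for record in records:
        ingestion_ts = clock()
        yield {"value": record, "timestamp": ingestion_ts, "watermark": ingestion_ts}


# A fake clock keeps the example deterministic; a real source might read
# the wall clock instead, for example int(time.time() * 1000).
fake_clock = itertools.count(start=1000, step=10).__next__

stamped = list(assign_ingestion_time(["c", "a", "b"], fake_clock))
for item in stamped:
    print(item)
```

The stamps reflect arrival order only. Whatever order the events originally occurred in is lost, which is why this mode cannot deal with out-of-order or late data.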
The relationship between the three notions of time is shown in the following figure:
© 2024 shulou.com SLNews company. All rights reserved.