Apache Flink official documentation-concept

2025-04-02 · From: SLTechnology News & Howtos


Data flow programming model


Blogger's understanding

Level of abstraction

Flink provides different levels of abstraction for developing streaming/batch applications.

The lowest level of abstraction simply offers stateful streaming. It is embedded into the DataStream API via the process function. It allows users to freely process events from one or more streams, and to use consistent, fault-tolerant state. In addition, users can register event-time and processing-time callbacks, allowing programs to realize sophisticated computations.

In practice, most applications do not need the low-level abstraction described above, and instead program against the Core APIs, such as the DataStream API (bounded and unbounded streams) and the DataSet API (bounded data sets). These fluent APIs offer common building blocks for data processing, such as various forms of transformations, joins, aggregations, windows, state, and so on. The data types processed by these APIs are represented as classes in the respective programming languages.

The low-level process function integrates with the DataStream API, making it possible to drop into the lower-level abstraction for certain operations. The DataSet API offers additional primitives on bounded data sets, such as loops/iterations.

The Table API is a declarative DSL centered around tables, which may be dynamically changing (when representing streams). The Table API follows the (extended) relational model: a table has a schema attached (similar to tables in relational databases), and the API offers comparable operations, such as select, project, join, group-by, aggregate, and so on. Table API programs declaratively define what logical operation should be done, rather than specifying exactly how the code for the operation looks. Though the Table API is extensible by various kinds of user-defined functions, it is less expressive than the Core APIs, but more concise to use (less code to write). In addition, Table API programs also go through an optimizer that applies optimization rules before execution.

Tables and DataStream/DataSet can be converted seamlessly, allowing programs to mix the Table API with the DataStream and DataSet APIs. The highest level of abstraction offered by Flink is SQL. This abstraction is similar to the Table API in semantics and expressiveness, but represents programs as SQL query expressions. The SQL abstraction closely interacts with the Table API, and SQL queries can be executed over tables defined in the Table API.

Programs and dataflows

The basic building blocks of a Flink program are streams and transformations. Note that the DataSets used in Flink's DataSet API are also streams internally, as will be explained later. Conceptually, a stream is a (potentially never-ending) flow of data records, and a transformation is an operation that takes one or more streams as input and produces one or more output streams as a result.
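The two building blocks can be illustrated with a minimal, hedged sketch in plain Python (this is not Flink code; the names `source`, `map_transform`, and `sink` are illustrative only): a stream is modeled as an iterator of records, and a transformation is a function from one stream to another.

```python
# Conceptual sketch only (plain Python, not Flink): a stream is an iterator
# of records; a transformation consumes one stream and produces another.
from typing import Iterable, Iterator


def source() -> Iterator[int]:
    """A bounded stand-in for a (potentially unbounded) stream of records."""
    yield from [1, 2, 3, 4]


def map_transform(stream: Iterable[int]) -> Iterator[int]:
    """A transformation: takes one stream as input, produces one output stream."""
    for record in stream:
        yield record * 10


def sink(stream: Iterable[int]) -> list:
    """A sink materializes the stream; here it simply collects the records."""
    return list(stream)


result = sink(map_transform(source()))
```

Chaining the three pieces mirrors the source-to-sink shape of a dataflow described below.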

When executed, Flink programs are mapped to streaming dataflows, consisting of streams and transformation operators. Each dataflow starts with one or more sources and ends in one or more sinks. The dataflows resemble arbitrary directed acyclic graphs (DAGs). Although special forms of cycles are permitted via iteration constructs, for the most part we gloss over this for simplicity.

Often there is a one-to-one correspondence between the transformations in the program and the operators in the dataflow. Sometimes, however, one transformation may consist of multiple transformation operators.

Documentation for sources and sinks is available in the streaming connectors and batch connectors docs. Documentation for transformations is available in the DataStream operators and DataSet transformations docs.

Parallel dataflows

Flink programs are inherently parallel and distributed. During execution, a stream has one or more stream partitions, and each operator has one or more operator subtasks. The operator subtasks are independent of one another, execute in different threads, and may even run on different machines or containers.

The number of operator subtasks is the parallelism of that particular operator. The parallelism of a stream is always that of its producing operator. Different operators of the same program may have different levels of parallelism.

Streams can transport data between two operators in a one-to-one (or forwarding) pattern, or in a redistributing pattern:

One-to-one streams (for example between the Source and the map() operators in the figure above) preserve the partitioning and ordering of the elements. That means that subtask[1] of the map() operator will see the same elements, in the same order, as they were produced by subtask[1] of the Source operator.

Redistributing streams (such as between map() and keyBy/window in the figure above, as well as between keyBy/window and Sink) change the partitioning of the streams. Each operator subtask sends data to different target subtasks, depending on the selected transformation. Examples are keyBy() (which re-partitions by hashing the key), broadcast(), and rebalance() (which re-partitions randomly). In a redistributing exchange, the ordering among the elements is only preserved within each pair of sending and receiving subtasks (for example, subtask[1] of map() and subtask[2] of keyBy/window). So in this example, the ordering within each key is preserved, but the parallelism does introduce non-determinism regarding the order in which the aggregated results for different keys arrive at the sink.
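The keyBy()-style redistribution can be sketched in plain Python (not Flink; the toy character-sum hash is an assumption, used only so the routing is deterministic): each record is routed to a target subtask by hashing its key, so all records with the same key land on the same subtask and keep their relative order, while the interleaving across keys is not fixed.

```python
# Conceptual sketch (plain Python, not Flink): hash-based redistribution
# as performed by a keyBy()-style exchange.
NUM_SUBTASKS = 2

records = [("a", 1), ("b", 1), ("a", 2), ("c", 1), ("b", 2)]

# One list per downstream subtask, standing in for its input channel.
subtask_inputs = [[] for _ in range(NUM_SUBTASKS)]

for key, value in records:
    # A toy deterministic hash (Python's built-in hash() of str varies per
    # process, which would make the routing non-reproducible).
    target = sum(ord(c) for c in key) % NUM_SUBTASKS
    subtask_inputs[target].append((key, value))

# Every record for "a" sits on the same subtask, in its original order;
# the same holds for "b" and "c".
```

Note that within each target subtask the per-key order of the input is preserved, which is exactly the guarantee described above.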

Details about configuring and controlling parallelism can be found in the docs on parallel execution.

Windows

Aggregating events (for example counts or sums) works differently on streams than in batch processing. For example, it is impossible to count all elements in a stream, because streams are in general infinite (unbounded). Instead, aggregates on streams are scoped by windows, such as "count over the last 5 minutes" or "sum of the last 100 elements".

Windows can be time-driven (for example, every 30 seconds) or data-driven (for example, every 100 elements). One typically distinguishes different types of windows, such as tumbling windows (no overlap), sliding windows (with overlap), and session windows (punctuated by gaps of inactivity).
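The difference between tumbling and sliding windows can be made concrete with a small, hedged sketch in plain Python (not Flink): data-driven windows are cut over a bounded sample, with tumbling windows never overlapping and sliding windows overlapping by design.

```python
# Conceptual sketch (plain Python, not Flink): data-driven (count-based)
# windows over a bounded sample of elements.
def tumbling_count_windows(elements, size):
    """Non-overlapping windows: each element belongs to exactly one window."""
    return [elements[i:i + size] for i in range(0, len(elements), size)]


def sliding_count_windows(elements, size, slide):
    """Overlapping windows: a new window starts every `slide` elements."""
    return [elements[i:i + size]
            for i in range(0, len(elements) - size + 1, slide)]


data = [1, 2, 3, 4, 5, 6]
tumbling = tumbling_count_windows(data, 3)
sliding = sliding_count_windows(data, 3, 1)
```

With a window size of 3, the tumbling variant yields two disjoint windows, while the sliding variant (slide 1) yields four overlapping ones over the same data.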

More window examples can be found in this blog post. For more details, see the window docs.

Time

When referring to time in a streaming program (for example to define windows), one can refer to different notions of time:

Event time is the time when an event was created. It is usually described by a timestamp in the event, for example attached by the producing sensor or the producing service. Flink accesses event timestamps via timestamp assigners.

Ingestion time is the time when an event enters the Flink dataflow at the source operator.

Processing time is the local time at each operator that performs a time-based operation.
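The practical difference between event time and processing time shows up in window assignment. A hedged sketch in plain Python (not Flink; the timestamps and the 10-second window size are made up for illustration) assigns the same records to windows twice, once by each notion of time:

```python
# Conceptual sketch (plain Python, not Flink): the same records grouped into
# 10-second windows by event time (the timestamp carried in the record)
# versus by processing time (when the operator sees the record).
records = [
    # (event_time, processing_time, value) -- the event with timestamp 8
    # arrives late, at processing time 21.
    (3, 4, "x"),
    (12, 13, "y"),
    (8, 21, "z"),
]


def window_start(ts, size=10):
    """Align a timestamp to the start of its window."""
    return ts - ts % size


by_event_time = {}
by_processing_time = {}
for event_ts, proc_ts, value in records:
    by_event_time.setdefault(window_start(event_ts), []).append(value)
    by_processing_time.setdefault(window_start(proc_ts), []).append(value)
```

Under event time the late record "z" still joins the window it logically belongs to; under processing time it falls into a much later window.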

See the event time docs for more details on how time is handled.

Stateful operations

While many operations in a dataflow simply look at one individual event at a time (for example an event parser), some operations remember information across multiple events (for example window operators). These operations are called stateful.

The state of stateful operations is maintained in what can be thought of as an embedded key/value store. The state is partitioned and distributed strictly together with the streams that are read by the stateful operators. Hence, access to the key/value state is only possible on keyed streams, after a keyBy() function, and is restricted to the values associated with the current event's key. Aligning the keys of the streams and the state makes sure that all state updates are local operations, guaranteeing consistency without transaction overhead. This alignment also allows Flink to redistribute the state and adjust the stream partitioning transparently.
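A minimal, hedged sketch in plain Python (not Flink; `KeyedCountOperator` is an invented name) shows the access pattern: after a keyBy()-style exchange, an operator subtask holds embedded key/value state and only ever touches the entry for the current event's key, so every update is local.

```python
# Conceptual sketch (plain Python, not Flink): an operator subtask with
# embedded key/value state, updated only for the current event's key.
class KeyedCountOperator:
    def __init__(self):
        self.state = {}  # embedded key/value state: key -> running sum

    def process(self, key, value):
        # Only the state slot for the current event's key is accessible;
        # the update is a purely local operation on this subtask.
        self.state[key] = self.state.get(key, 0) + value
        return self.state[key]


op = KeyedCountOperator()
results = [op.process(k, v) for k, v in [("a", 1), ("b", 5), ("a", 2)]]
```

Because all events for a given key are routed to the same subtask, the running sums per key stay consistent without any cross-subtask coordination.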

For more information, see the documentation on state.

Checkpoints for fault tolerance

Flink implements fault tolerance using a combination of stream replay and checkpointing. A checkpoint is related to a specific point in each of the input streams, along with the corresponding state for each of the operators. A streaming dataflow can be resumed from a checkpoint while maintaining consistency (exactly-once processing semantics) by restoring the state of the operators and replaying the events from the point of the checkpoint.

The checkpoint interval is a means of trading off the overhead of fault tolerance during execution against the recovery time (the number of events that need to be replayed).
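The checkpoint-and-replay idea can be sketched in plain Python (not Flink; the running-sum operator, the offsets, and the failure point are all invented for illustration): a checkpoint records a stream offset together with the operator state, and recovery restores that state and replays only the events after the checkpointed offset.

```python
# Conceptual sketch (plain Python, not Flink): checkpointing a stream offset
# plus operator state, then recovering by restoring state and replaying.
stream = [1, 2, 3, 4, 5]


def run(events, start_offset, start_state, checkpoint_every, fail_at=None):
    """Process events from start_offset; the operator is a running sum."""
    state, checkpoint = start_state, (start_offset, start_state)
    for offset in range(start_offset, len(events)):
        if offset == fail_at:
            return state, checkpoint, True  # crash before processing
        state += events[offset]
        if (offset + 1) % checkpoint_every == 0:
            checkpoint = (offset + 1, state)  # offset and state, together
    return state, checkpoint, False


# First run fails at offset 4; the last completed checkpoint covers 0..3.
_, (ckpt_offset, ckpt_state), failed = run(
    stream, 0, 0, checkpoint_every=2, fail_at=4)

# Recovery: restore the checkpointed state, replay from the saved offset.
recovered, _, _ = run(stream, ckpt_offset, ckpt_state, checkpoint_every=2)
```

A smaller `checkpoint_every` means fewer events to replay on recovery, at the price of more checkpointing overhead during normal execution, which is exactly the trade-off described above.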

The fault tolerance docs describe in more detail how Flink manages checkpoints and related topics. Details about enabling and configuring checkpointing are in the checkpointing API docs.

Batch on streaming

Flink executes batch programs as a special case of streaming programs, where the streams are bounded (a finite number of elements). A DataSet is treated internally as a stream of data. The concepts above thus apply to batch programs in the same way as they do to streaming programs, with a few exceptions:

Fault tolerance for batch programs does not use checkpointing. Recovery happens by fully replaying the streams. That is possible because the inputs are bounded. This pushes the cost more towards recovery, but makes regular processing cheaper, because checkpoints are avoided.

Stateful operations in the DataSet API use simplified in-memory/out-of-core data structures, rather than key/value indexes.

The DataSet API introduces special superstep-based iterations, which are only possible on bounded streams. See the iteration docs for details.

Distributed runtime


Tasks and operator chains

For distributed execution, Flink chains operator subtasks together into tasks. Each task is executed by one thread. Chaining operators together into tasks is a useful optimization: it reduces the overhead of thread-to-thread handover and buffering, increases overall throughput, and decreases latency. The chaining behavior can be configured; see the chaining docs for details.
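The benefit of chaining can be sketched in plain Python (not Flink; `parse`, `double`, and `chained_task` are invented names): two consecutive operators are fused so that a record passes through both in a single function call, with no handover or buffering between threads.

```python
# Conceptual sketch (plain Python, not Flink): two operators fused into one
# task, so each record flows through both without a thread handover.
def parse(record):
    """Operator 1: deserialize the raw record."""
    return int(record)


def double(value):
    """Operator 2: apply a transformation to the parsed value."""
    return value * 2


def chained_task(record):
    # Both operators execute inside the same task (one thread); the
    # intermediate value is passed directly, never buffered or queued.
    return double(parse(record))


outputs = [chained_task(r) for r in ["1", "2", "3"]]
```

Without chaining, each record would instead be serialized into a buffer between the two operators and handed to another thread, which is the overhead the optimization removes.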

The sample dataflow in the figure below is executed with five subtasks, and hence with five parallel threads.

Job managers, task managers, clients

The Flink runtime consists of two types of processes:

The JobManagers (also called masters) coordinate the distributed execution. They schedule tasks, coordinate checkpoints, coordinate recovery on failures, and so on.

There is always at least one JobManager. A high-availability setup has multiple JobManagers, one of which is the leader while the others are standby. The TaskManagers (also called workers) execute the tasks (or more specifically, the subtasks) of a dataflow, and buffer and exchange the data streams.

There must always be at least one TaskManager.

The JobManagers and TaskManagers can be started in various ways: directly on the machines as a standalone cluster, or managed by a resource framework such as YARN or Mesos. TaskManagers connect to JobManagers, announcing themselves as available, and are assigned work.

The client is not part of the runtime and program execution, but is used to prepare and send a dataflow program to the JobManager. After that, the client can disconnect, or stay connected to receive progress reports. The client runs either as part of the Java/Scala program that triggers the execution, or in the command-line process ./bin/flink.

Task slots and resources

Each worker (TaskManager) is a JVM process, and may execute one or more subtasks in separate threads. To control how many tasks a worker accepts, a worker has so-called task slots (at least one).

Each task slot represents a fixed subset of the resources of the TaskManager. A TaskManager with three slots, for example, will dedicate 1/3 of its managed memory to each slot. Slotting the resources means that a subtask will not compete with subtasks from other jobs for managed memory, but instead has a certain amount of reserved managed memory. Note that no CPU isolation happens here; currently slots only separate the managed memory of tasks.

By adjusting the number of task slots, users can define how subtasks are isolated from each other. Having one slot per TaskManager means that each task group runs in a separate JVM (which can be started in a separate container, for example). Having multiple slots means that more subtasks share the same JVM. Tasks in the same JVM share TCP connections (via multiplexing) and heartbeat messages. They may also share data sets and data structures, reducing the per-task overhead.

By default, Flink allows subtasks to share slots even if they are subtasks of different tasks, as long as they are from the same job. The result is that one slot may hold an entire pipeline of the job. Allowing this slot sharing has two main benefits:

A Flink cluster needs exactly as many task slots as the highest parallelism used in the job. There is no need to calculate how many tasks (with varying parallelism) a program contains in total.

It is easier to get better resource utilization. Without slot sharing, the non-intensive source and map() subtasks would block as many resources as the resource-intensive window subtasks. With slot sharing, increasing the base parallelism from two to six yields full utilization of the slotted resources, while making sure that the heavy subtasks are fairly distributed among the TaskManagers.
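The slot arithmetic above can be worked through in a short, hedged sketch (plain Python, not Flink; the operator parallelisms and memory figures are invented for illustration): with slot sharing, the slots a job needs equal the highest operator parallelism rather than the total subtask count, and each slot receives a fixed share of a TaskManager's managed memory.

```python
# Conceptual sketch: slot demand under slot sharing, and per-slot memory.
# The parallelism values and memory size below are illustrative only.
operator_parallelism = {"source": 6, "map": 6, "keyBy/window": 6, "sink": 1}

# Total subtasks across all operators...
total_subtasks = sum(operator_parallelism.values())   # 19 subtasks in all
# ...but with slot sharing, only as many slots as the highest parallelism.
slots_needed = max(operator_parallelism.values())     # 6 slots suffice

# Each task slot gets a fixed share of a TaskManager's managed memory.
taskmanager_memory_mb = 3072
slots_per_taskmanager = 3
memory_per_slot_mb = taskmanager_memory_mb // slots_per_taskmanager
```

Here 19 subtasks fit into 6 shared slots, and a 3072 MB TaskManager with three slots reserves 1024 MB of managed memory per slot.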

The APIs also include a resource group mechanism that can be used to prevent undesirable slot sharing.

As a rule of thumb, a good default number of task slots is the number of CPU cores. With hyper-threading, each slot then takes two or more hardware thread contexts.

State backends

The exact data structures in which the key/value indexes are stored depend on the chosen state backend. One state backend stores data in an in-memory hash map, while another uses RocksDB as the key/value store. In addition to defining the data structure that holds the state, the state backends also implement the logic to take a point-in-time snapshot of the key/value state and store that snapshot as part of a checkpoint.
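The two responsibilities of a state backend, holding the key/value state and snapshotting it, can be sketched in plain Python (not Flink; `InMemoryStateBackend` is an invented name, and a deep copy stands in for writing the snapshot into a checkpoint):

```python
# Conceptual sketch (plain Python, not Flink): a minimal in-memory state
# backend that can take a point-in-time snapshot of its key/value state.
import copy


class InMemoryStateBackend:
    def __init__(self):
        self._state = {}  # key/value state held in an in-memory map

    def put(self, key, value):
        self._state[key] = value

    def get(self, key, default=None):
        return self._state.get(key, default)

    def snapshot(self):
        # A deep copy stands in for persisting the snapshot as part of a
        # checkpoint; later updates must not leak into the snapshot.
        return copy.deepcopy(self._state)


backend = InMemoryStateBackend()
backend.put("a", 1)
snap = backend.snapshot()
backend.put("a", 2)  # updates after the snapshot leave it untouched
```

A RocksDB-based backend would differ in where the state lives (on disk, beyond main memory), but it exposes the same two capabilities.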

Savepoints

Programs written in the DataStream API can resume execution from a savepoint. Savepoints allow updating both programs and Flink clusters without losing any state.

Savepoints are manually triggered checkpoints, which take a snapshot of the program and write it out to a state backend. They rely on the regular checkpointing mechanism for this. During execution, programs are periodically snapshotted on the worker nodes to produce checkpoints. For recovery, only the last completed checkpoint is needed, and older checkpoints can be safely discarded as soon as a new one is completed.

Savepoints are similar to these periodic checkpoints, except that they are triggered by the user and do not automatically expire when newer checkpoints are completed. Savepoints can be created from the command line, or via the REST API when cancelling a job.
