What are the four operation modes of Apache Flink on K8s? 07/19 Update SLTechnology News&Howtos

What are the four operation modes of Apache Flink on K8s?

2025-07-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/01 Report--

In this issue, the editor will bring you what are the four operation modes of Apache Flink on K8s. The article is rich in content and analyzes and narrates it from a professional point of view. I hope you can get something after reading this article.

1. Preface

Apache Flink is a distributed stream processing engine, which provides rich and easy-to-use API to handle stateful stream processing applications, and runs such applications efficiently and on a large scale on the premise of supporting fault tolerance. By supporting fault-tolerant guarantees of event time (event-time), computing status (state) and just once (exactly-once), Flink has been quickly adopted by many companies and become a new generation of stream computing processing engine.

1.1 Why did Flink choose Kubernetes

Kubernetes project originates from Google internal Borg project. Based on Borg's excellent practice over the years and its advanced design concept, and with the endorsement of many powerful families and large manufacturers, Kubernetes has grown into a de facto standard in the field of container management. In big data and related fields, many well-known products, including Spark,Hive,Airflow,Kafka, are moving to Kubernetes,Apache Flink.

Flink chose Kubernetes as its underlying resource management platform for two reasons:

1) Flink features: streaming services are generally resident processes, which are often used in scenarios that require high stability, such as telecom network quality monitoring, ad hoc analysis of business data, real-time risk control and real-time recommendation.

2) Kubernetes advantages: it provides a better publishing and management mechanism for online business, and ensures its stable operation. At the same time, Kubernetes has good ecological advantages and can be easily integrated with various operation and maintenance tools, such as prometheus monitoring, mainstream log collection tools, etc.; at the same time, K8S provides a good capacity expansion mechanism in terms of resource flexibility, which greatly improves the utilization of resources.

1.2 History of Flink on Kubernetes

In the early release 1.2 of Flink, the Flink Session cluster mode was introduced, and users were able to deploy Flink clusters on top of Kubernetes clusters.

With the gradual popularization of Flink, more and more Flink tasks are submitted to the user's cluster. Users find that under the session mode, the tasks will influence each other and the isolation is poor. Therefore, in the version of Flink 1.6, the Per Job mode is introduced, and a single task occupies a Flink cluster, which greatly improves the stability of the task.

After meeting the stability, users feel that these two modes do not create resources on demand, and often need to specify the specifications of Flink clusters in advance based on user experience. Under this background, native session mode is applied. In the Beta phase of Flink version 1.10, we have added native per job mode, which improves the isolation between applications on the basis of resource application on demand.

According to the trend of Flink running mode on Kubernetes cluster, the characteristics of these modes are analyzed in turn, and the Flink operator scheme and its advantages are introduced at last.

2. Flink operation mode

This paper first analyzes the two deployment modes in which Apache Flink 1.10 has been GA (production available) on the Kubernetes cluster, and then analyzes the native session deployment mode in the Beta version and the native per-job deployment model to be released in Flink 1.11. Finally, according to the advantages and disadvantages of these deployment models, the current comparative native kubernetes deployment mode, flink-operator.

The version of Flink we are using already supports native session and native per-job very well, and we also support these two modes in flink-operator.

Next, we will analyze the operation mode of Flink in the following order, and readers can consider the appropriate Flink operation mode according to their own business scenarios.

Flink session mode

Flink per-job mode

Flink native session mode

Flink native per-job mode

The comparison of the advantages and disadvantages of these four deployment models can be summarized in the following table. For more information, please refer to the following detailed description.

2.1 Session Cluster mode

2.1.1 introduction to the principle

In Session mode, the Flink cluster is running for a long time. When the Master component of the cluster receives the task submitted by the client, the task is analyzed and processed. After the user submits the resource description file of the Flink cluster to Kubernetes, the FlinkMaster and TaskManager of the Flink cluster will be created. As shown in the following figure, TaskManager will register with the ResourceManager module after startup, and the Flink Session cluster is ready. When the user submits the Job task through the Flink Clint side, the Dispatcher receives the task request and forwards the request to the JobMaster, and the JobMaster assigns the task to the specific TaskManager.

2.1.2 characteristic analysis

This type of Flink cluster, FlinkMaster and TaskManager, run in the Kubernetes cluster for a long time in the form of Kubernetes deployment. The Flink session cluster must be created before submitting the job. Multiple tasks can run in the same cluster at the same time, sharing K8sResourceManager and Dispatcher between tasks, but JobMaster is separate. This method is suitable for scenarios that run short-term jobs, impromptu queries, frequently submit tasks, or are sensitive to the length of task startup.

* * advantages: * * FlinkMaster and TaskManager are ready when the job is submitted. When there are sufficient resources, the job can be immediately assigned to TaskManager for execution without waiting for the creation of resources such as FlinkMaster,TaskManager,Service.

Disadvantages: 1) you need to create a Job cluster before submitting a Flink task, and you need to specify the number of TaskManager in advance, but it is difficult to accurately grasp the specific resource requirements before submitting the task. If you specify more, a large number of TaskManager will be idle, and the resource utilization will be relatively low. If you specify less, there will be tasks that cannot allocate resources, and you can only wait for other jobs in the cluster to complete the execution and release the resources. The next job will be executed normally.

The isolation is poor, and multiple Job tasks compete for resources and influence each other; if a Job exception causes TaskManager crash, then all Job tasks running on this TaskManager will be restarted; and, worse still, the restart of multiple Jobs tasks and a large number of concurrent access to the file system will lead to the unavailability of other services Finally, on Rest interface, you can see the Job tasks of others in the same session cluster.

2.2 Per Job Cluster mode

As the name implies, this method creates a separate Flink cluster for each Job task. When the resource description file is submitted to the Kubernetes cluster, Kubernetes will create FlinkMaster Deployment, TaskManagerDeployment and run the task in turn. When the task is completed, these Deployment will be cleaned automatically.

2.2.1 characteristic analysis

Advantages: isolation is good, there is no resource conflict between tasks, and a single task uses a single Flink cluster; compared with Flink session clusters, resources are built on demand, and resources are destroyed immediately after task execution, resulting in higher resource utilization.

Disadvantages: the number of TaskManager needs to be specified in advance. If less TaskManager is specified, the job will fail, and if more is specified, the resource utilization will be reduced. Resources are created in real time, and users' jobs need to wait for the following process before being run:

Kubernetes scheduler applies for resources for FlinkMaster and TaskManager and schedules them to the host to create them.

Kubernetes kubelet pulls FlinkMaster and TaskManager images, and creates FlinkMaster and TaskManager containers

After TaskManager starts, register with Flink ResourceManager.

This mode is suitable for jobs that are insensitive to startup time and run for a long time. It is not suitable for scenarios that are sensitive to task startup time.

2.3 Native Session Cluster mode

2.3.1 principle analysis

Flink provides an entry script kubernetes-session.sh for Kubernetes mode. When the user executes the script, the Flink client will generate Kubernets resource description files, including FlinkMaster Service,FlinkMasterDeloyment,Configmap,Service and set owner reference. In Flink version 1.10, FlinkMaster Service is used as the Owner of other resources, which means that when deleting the Flink cluster, only FlinkMaster service needs to be deleted and other resources will be automatically deleted.

After receiving the resource description request from Flink, Kubernetes starts to create FlinkMaster Service,FlinkMaster Deloyment and Configmap resources. As you can see from the figure, along with the creation of FlinkMaster, Dispatch and K8sResMngr components are also created. Here K8sResMngr is the core component of Native mode, which communicates with Kubernetes API server and applies for TaskManager resources. Currently, users can submit task requests to the Flink cluster.

Users submit tasks to the Flink cluster through Flink client, and flink client generates Job graph, which is then uploaded with the jar package. When the task is submitted successfully, JobSubmitHandler receives the request and submits it to Dispatcher and generates JobMaster, and JobMaster is used to apply for task resources from KubernetesResourceManager.

Kubernetes-Resource-Manager generates a new configuration file for taskmanager, including the address of service, so that when Flink Master is rebuilt abnormally, it ensures that taskmanager can still connect to the new Flink Master through Service.

When TaskManager is successfully created and registered to slotManager, slotManager applies to TaskManager for slots,TaskManager to provide its own free slots, and the task is deployed and run.

2.3.2. Characteristic analysis

In the two deployment modes we mentioned earlier, to run a Flink task on Kubernetes, you need to specify the number of TaskManager in advance, but in most cases, users cannot accurately predict the number and specification of TaskManager required for the task before it starts.

If you specify more, you will waste resources, and if you specify less, you will fail to execute the task. The most fundamental reason is that there is no Native to use Kubernetes resources. Native here can be understood as Flink communicating directly with Kuberneter to apply for resources.

This type of cluster is also created before the task is submitted, but only contains FlinkMaster and its Entrypoint (Service). When the task is submitted, Flink client will calculate the degree of parallelism based on the task, and then determine the number of TaskManager required, and then the Flink kernel will directly apply for taskmanager from Kubernetes API server to achieve the purpose of resource dynamic creation.

Advantages: compared with the first two clusters, the resources of taskManager are created in real time and on demand, the utilization of resources is higher, and the required resources are more accurate.

Disadvantages: taskManager is created in real time, and users, like Per Job clusters, still need to wait for the creation of taskManager before their jobs are actually run, so users who are sensitive to task startup time need to make some tradeoffs.

2.4 Native Per Job mode

In the current version of Apache Flink 1.10, the Flink native per-job feature has not yet been released and is expected to be available in a subsequent version of Flink 1.11, so we can take a look at the features of native per job in advance.

2.4.1 principle analysis

When the task is submitted, Flink also requests resources from kubernetes. The process is similar to the native session model mentioned earlier, except that:

Flink Master is created dynamically with the submission of a task

Users can package Flink, job Jar packages and classpath dependencies into their own images

The job run diagram is generated by Flink Master, so there is no need to upload Jar packages through RestClient (figure 2, step 3).

2.4.2. Characteristic analysis

Native per-job cluster also creates a Flink cluster when the task is submitted. The difference is that there is no need for the user to specify the number of TaskManager resources, because also with the help of the features of Native, Flink communicates directly with Kubernetes and requests resources as needed.

Advantages: apply for resources on demand, suitable for one-time tasks, release resources immediately after the task is executed, and ensure the utilization of resources.

Disadvantages: resources are created after the task is submitted, which also means that some tradeoffs are needed for scenarios that are sensitive to delay after the task is submitted.

3. Introduction to Flink-operator3.1

Based on the analysis of the above four deployment models, we find that for the use of Flink clusters, users often need to maintain their own deployment scripts and submit various underlying resource description files (Flink Master,TaskManager, configuration files, Service) to Kubernetes.

Under session cluster, if the cluster is no longer used, you also need to delete these resources by yourself, because the resources in this kind of cluster use Kubernetes's garbage collection mechanism owner reference. When deleting a Flink cluster, you need to delete the Owner of the resource and delete it, which is not very friendly to Flink users who are not familiar with Kubernetes.

Through Flink-operator, we can describe the Flink cluster as a yaml file, so that with the declarative features of Kubernetes and the coordination controller, we can directly manage the Flink cluster and its jobs without paying attention to the creation and maintenance of the underlying resources such as Deployment,Service,ConfigMap.

Currently, Flink has not officially given a flink-operator scheme, but GoogleCloudPlatform provides a flink-operator scheme based on kubebuilder. Next, we will introduce how to install flink-operator and an example of managing a Flink cluster.

3.2 principles and advantages of Flink-operator

When Fink operator is deployed to the Kubernetes cluster, the FlinkCluster resource and Flink Controller are created. FlinkCluster is used to describe Flink clusters, such as JobMaster specifications, TaskManager and TaskSlot numbers, etc. Flink Controller handles CRUD operations for FlinkCluster resources in real time, and users can manage Flink clusters in the same way as built-in Kubernetes resources.

For example, the user describes the desired Flink cluster through the yaml file and submits it to Kubernetes, Flink controller analyzes the user's yaml, gets the FlinkCluster CR, and then calls API server to create the underlying resources, such as JobMaster Service, JobMaster Deployment,TaskManager Deployment.

By using Flink Operator, you have the following advantages:

1. It is more convenient to manage Flink clusters

Flink-operator makes it easier for us to manage Flink clusters. We do not need deployment scripts to maintain various underlying resources of Kubenretes for different Flink clusters. The only thing we need is a custom resource description file of FlinkCluster. You only need a kubectl apply command to create a Flink session cluster. The following figure shows the yaml file of the Flink Session cluster. Users only need to declare the desired Flink cluster configuration in this file, and flink-operator will automatically complete the creation and maintenance of the Flink cluster. If you create a Per Job cluster, you only need to declare the attributes of the Job in the yaml, such as the Job name and the Jar package path. Through flink-operator, the four Flink operation modes mentioned above can each correspond to a yaml file, which is very convenient.

ApiVersion: flinkoperator.k8s.io/v1beta1kind: FlinkClustermetadata: name: flinksessioncluster-samplespec: image: name: flink:1.10.0 pullPolicy: IfNotPresent jobManager: Cluster ports: ui: 8081 resources: limits: memory: "1024Mi" cpu: "200m" taskManager: replicas: 1 resources: limits: memory: "2024Mi" cpu: "200m" volumes :-name: cache-volume emptyDir: {} volumeMounts:-mountPath: / cache name: cache-volume envVars:-name: FOO value: bar flinkProperties: taskmanager.numberOfTaskSlots: "1"

two。 Declarative type

By executing script commands to create the underlying resources of the Flink cluster, users are required to ensure that the resources are created successfully in turn, often accompanied by auxiliary checking scripts. With the controller mode of flink operator, users only need to declare the desired state of the Flink cluster, and the rest of the work is guaranteed by Flink operator. In the process of running the Flink cluster, if there is a resource exception, such as JobMaster accidentally stopped or even deleted, Flink operator will rebuild these resources and automatically repair the Flink cluster.

3. Custom SavePoint

The user can specify the autoSavePointSeconds and save path, and Flink operator automatically saves the snapshot for the user on a regular basis.

4. Automatic recovery

Streaming tasks tend to run for a long time, even if they don't stop for 2-3 years. During the execution of a task, there may be a variety of reasons for the failure of the task. The user can specify a task restart policy when specified as FromSavePointOnFailure,Flink operator automatically re-executes the task from the nearest SavePoint.

5. Sidecar containers

Sidecar container is also a design pattern provided by Kubernetes. Users can run sidecar container in TaskManager Pod to provide auxiliary custom services or proxy services for Job.

6. Ingress integration

Users can define Ingress resources, and flink operator will automatically create Ingress resources. Kubernetes clusters hosted by cloud vendors generally have Ingress controllers, otherwise users need to implement Ingress controller on their own.

7. Prometheus integration

You can integrate with Prometheus in a Flink cluster by specifying metric exporter and metric port in the Kubernetes cluster's yaml file.

These are the four operating modes of Apache Flink on K8s shared by the editor. If you happen to have similar doubts, please refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.