How to deploy clusters in flink

2025-01-17 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article walks through the ways to deploy a flink cluster: MiniCluster, standalone, yarn session, yarn per job, and application mode.

MiniCluster

This mode is typically used when debugging a program in an IDE. When we run the main method locally, flink starts a MiniCluster inside a single local process that contains both the jobmanager and the taskmanager. When the program finishes, that process exits.

Standalone

This mode starts the flink cluster directly on physical machines. The cluster is configured through ${FLINK_HOME}/conf/flink-conf.yaml and started with ${FLINK_HOME}/bin/start-cluster.sh.

In addition, we can start another taskmanager with ${FLINK_HOME}/bin/taskmanager.sh start.

At this point, let's look at the running processes with the jps command.

76085 StandaloneSessionClusterEntrypoint

76331 TaskManagerRunner

76846 TaskManagerRunner

We can see that two taskmanagers are now running.
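The same check can be scripted. The sketch below assumes jps prints the process id followed by the main-class name, as in the listing above; the function name count_taskmanagers is made up for illustration.

```shell
#!/bin/sh
# Count taskmanager JVMs in `jps` output read from stdin.
# Matches on the main-class name that jps prints in its second column.
count_taskmanagers() {
    awk '$2 == "TaskManagerRunner" { n++ } END { print n + 0 }'
}

# Typical use on a machine with a JDK on the PATH (not run here):
#   jps | count_taskmanagers
```

Fed the three lines shown above, the function prints 2.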

In this deployment mode flink manages the cluster's resources itself. It is not often used in production, so we will not describe it further.

Yarn

All clusters deployed on yarn hand resource management over to yarn.

Yarn session

In yarn session mode, a flink cluster is started on the yarn cluster in advance, and we can then submit the flink jobs we have written directly to that cluster.

The command to start the cluster is as follows:

${FLINK_HOME}/bin/yarn-session.sh

This command takes many parameters; append -h to list them. Here I will focus on the -d parameter.

With -d, the session runs in detached mode: it disconnects from the client after startup. To stop such a cluster, use yarn application -kill {applicationId}.

If -d is not specified, the cluster stays connected to the client and exits when the client exits.
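Stopping a detached session requires its applicationId. One way to look it up is to filter the output of yarn application -list. This is a minimal sketch that assumes tab-separated rows with the id in the first column and "Apache Flink" as the application type; the function name flink_app_ids is made up for illustration.

```shell
#!/bin/sh
# Print the applicationId of every flink application found in
# `yarn application -list` output read from stdin.
# Assumption: rows are tab-separated, the id is the first column and the
# application type column reads "Apache Flink".
flink_app_ids() {
    awk -F '\t' '/Apache Flink/ {
        id = $1
        gsub(/ /, "", id)              # drop stray padding around the id
        if (id ~ /^application_/) print id
    }'
}

# Typical use against a real cluster (not run here):
#   yarn application -list | flink_app_ids
#   yarn application -kill <applicationId>
```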

Submit a task

To submit a job to the yarn session cluster, run ${FLINK_HOME}/bin/flink run -d user.jar on the appropriate client machine.

We can also submit the job through the last menu item of the web ui (Submit New Job).

Session mode is generally suited to batch jobs, that is, jobs that finish after a limited time. For such short jobs it avoids spending too much time requesting resources on every run.

Right after the session cluster starts, no resources are allocated to it. When a job is submitted, yarn allocates resources according to the request. After the job completes, the resources are released after an idle interval rather than immediately. (The interval is configurable; it exists so that a job arriving shortly afterwards does not have to request resources again.)
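To make the configurable interval concrete, the sketch below prints (but does not run) a yarn-session.sh command line that sets it. It assumes the interval maps to the resourcemanager.taskmanager-timeout option (milliseconds) and that the option is passed as a -D dynamic property; the default FLINK_HOME path and the function name are hypothetical.

```shell
#!/bin/sh
# Build (but do not run) a yarn-session.sh command line that sets how long an
# idle taskmanager is kept before its container is returned to yarn.
# $1: timeout in milliseconds.
# Assumption: the idle interval corresponds to the
# resourcemanager.taskmanager-timeout option.
FLINK_HOME="${FLINK_HOME:-/opt/flink}"   # hypothetical install path

session_cmd_with_idle_timeout() {
    printf '%s\n' "${FLINK_HOME}/bin/yarn-session.sh -d -D resourcemanager.taskmanager-timeout=$1"
}
```

For example, session_cmd_with_idle_timeout 60000 prints a command that keeps idle taskmanagers for one minute.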

Yarn per job

In session mode, described above, many jobs run in one cluster and share its resources, so isolation between them is poor. Flink therefore provides another execution mode: yarn per job.

This mode sets up a separate cluster on yarn for each flink job. The advantage is that each job manages its own resources and is isolated from the resources of other jobs. It suits streaming jobs, which run for a long time and are less sensitive to startup time.

Start command

${FLINK_HOME}/bin/flink run -d -p 4 -ys 2 -m yarn-cluster -c com.example.Test userjar.jar arg1 arg2

After the submission succeeds, the corresponding application appears on the yarn administration page.
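As a rough guide to the -p 4 and -ys 2 values in the start command above, the number of taskmanagers follows from dividing the parallelism by the slots per taskmanager and rounding up. The sketch below is simplified arithmetic only, and the function names are made up for illustration.

```shell
#!/bin/sh
# Number of taskmanagers needed for a job: parallelism divided by slots per
# taskmanager, rounded up. $1 = parallelism (-p), $2 = slots (-ys).
taskmanager_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Slots left unused once the job runs with the given parallelism.
idle_slots() {
    echo $(( $(taskmanager_count "$1" "$2") * $2 - $1 ))
}
```

With -p 4 -ys 2 this gives 2 taskmanagers and no unused slot; with -p 5 -ys 2 it gives 3 taskmanagers and 1 unused slot.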

This start command also takes many parameters. I won't explain them one by one; here are the core ones in plain terms.

-d: run in detached mode.

-p: the parallelism of the program.

-ys: the number of slots per taskmanager. Put simply, flink divides each taskmanager's memory into that many parts, and under some conditions tasks can share a slot, which improves efficiency. The slot concept deserves its own discussion and is not covered here. Dividing the parallelism by this value gives the number of taskmanagers flink starts, so to avoid leaving slots unused it is best to make the parallelism a multiple of the -ys value.

-c: the program's entry class. The entry class can be specified when the program is packaged; if it was not, or the jar contains several candidate classes, specify it with -c.

The trailing arguments on the command line (arg1 arg2) are passed to the user's jar.

Stop command

First, we can stop the flink job from the flink web page. Once the job is stopped, yarn automatically releases the corresponding resources.

Second, stop it from the command line:

${FLINK_HOME}/bin/flink stop -m yarn-cluster -yid application_1592386606716_0005 c8ee546129e8480809ee62a4ce7dd91d

Both the yarn applicationId and the flink job id must be specified here.

Third, stop it from a program:

https://blog.csdn.net/zhangjun5965/article/details/106820591

If we are building something like a real-time platform, jobs cannot be stopped by hand on the command line; instead the platform calls the corresponding api to stop them.
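One possible route for such a platform, not necessarily the one the linked post uses, is plain HTTP: the flink REST API can cancel a job, and the yarn ResourceManager REST API can kill the whole application. The sketch below only prints the curl commands. The host names in the usage lines are hypothetical, and the function names are made up for illustration.

```shell
#!/bin/sh
# Build (but do not run) curl commands for stopping work without the flink CLI.

# Cancel one job through the flink REST API.
# $1 = base URL of the jobmanager REST endpoint, $2 = flink job id.
flink_cancel_cmd() {
    printf '%s\n' "curl -X PATCH '$1/jobs/$2?mode=cancel'"
}

# Kill the whole yarn application through the ResourceManager REST API.
# $1 = ResourceManager web address, $2 = yarn applicationId.
yarn_kill_cmd() {
    printf '%s\n' "curl -X PUT -H 'Content-Type: application/json' -d '{\"state\":\"KILLED\"}' '$1/ws/v1/cluster/apps/$2/state'"
}

# Hypothetical usage (hosts are placeholders):
#   flink_cancel_cmd http://jm.example:8081 c8ee546129e8480809ee62a4ce7dd91d
#   yarn_kill_cmd http://rm.example:8088 application_1592386606716_0005
```

Cancelling through flink is the gentler option, since it lets the job shut down before yarn reclaims the containers. Killing the application is a last resort.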

Startup process

After we run the start command, the system uploads the flink jars, the related configuration files and the user's jar to a temporary directory on hdfs, by default /user/{USER}/.flink/{applicationId}. When the flink cluster is built, it fetches the files from this directory. After the program is deployed successfully, the temporary directory is deleted.

Application mode

This mode was introduced in flink 1.11. In yarn per job mode, every submission uploads both the local flink jars and the user's jar to hdfs. This consumes a lot of network bandwidth, and the impact grows when several people submit jobs at the same time. Moreover, the flink jars are identical for every submission, so there is no need to copy them each time. Application mode therefore allows both the flink jars and the user's jar to be placed on hdfs in advance, which removes the jar-copying step of per job submission, saves bandwidth, and speeds up job submission.

The specific commands are as follows:

./bin/flink run-application -p 1 -d -t yarn-application \
-yD yarn.provided.lib.dirs="hdfs://localhost/data/flink/libs/" \
hdfs://localhost/data/flink/user-lib/TopSpeedWindowing.jar
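The command above presumes the jars are already on hdfs. A minimal sketch of that one-time staging step follows; it prints the hdfs commands rather than running them. The hdfs paths are the ones from the command above. The default FLINK_HOME, the choice to upload everything under ${FLINK_HOME}/lib, and the function name stage_cmds are assumptions for illustration.

```shell
#!/bin/sh
# Print (not run) the one-time hdfs commands that stage jars for application mode.
FLINK_HOME="${FLINK_HOME:-/opt/flink}"               # hypothetical install path
LIB_DIR="hdfs://localhost/data/flink/libs/"          # matches yarn.provided.lib.dirs above
USER_DIR="hdfs://localhost/data/flink/user-lib/"     # directory of the user jar above

# $1 = local path of the user jar
stage_cmds() {
    printf '%s\n' "hdfs dfs -mkdir -p ${LIB_DIR} ${USER_DIR}"
    printf '%s\n' "hdfs dfs -put ${FLINK_HOME}/lib/*.jar ${LIB_DIR}"
    printf '%s\n' "hdfs dfs -put $1 ${USER_DIR}"
}
```

For the example job, stage_cmds "${FLINK_HOME}/examples/streaming/TopSpeedWindowing.jar" prints the three commands to review before running them.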
