Shulou (Shulou.com) — SLTechnology News & Howtos — 2025-01-19
This article explains how to install and deploy Flink as a standalone cluster. Interested readers can use it as a reference, and I hope you gain a lot from it. Let me walk you through it.
Flink runs on all Linux-like environments, such as Linux, Mac OS X, and Cygwin (on Windows). A deployment requires one master node and one or more worker nodes. Before deploying and starting the Flink cluster, prepare the environment. Each node must meet the following requirements:
Java 1.8.x or later is required
ssh (sshd must be running, because Flink's scripts manage the remote nodes of the cluster over ssh)
If your cluster environment does not meet these software requirements, install or update the missing components first.
If passwordless ssh is set up, make sure the Flink installation path is the same on every node; this makes it easy to manage the cluster with Flink's scripts.
The Flink cluster requires JAVA_HOME to be configured; it can also be set through the env.java.home property in conf/flink-conf.yaml.
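As a sketch, assuming an OpenJDK 8 install path (adjust to your system), env.java.home can be set in conf/flink-conf.yaml like this:

```yaml
# conf/flink-conf.yaml — the JDK path below is a placeholder; point it at your JDK.
env.java.home: /usr/lib/jvm/java-8-openjdk
```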
Flink Cluster Configuration
Download Flink
https://flink.apache.org/downloads.html
and then extract it:
tar -zxf flink-1.7.1-bin-hadoop27-scala_2.11.tgz
Key configuration points
Selecting the master
After extraction, edit conf/flink-conf.yaml to configure the cluster.
The essential step is to select the master node and set its address in the jobmanager.rpc.address property.
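For example, if the master were the host 10.0.0.1 (an illustrative address), the entry would look like:

```yaml
# conf/flink-conf.yaml — the address below is an example master node.
jobmanager.rpc.address: 10.0.0.1
```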
Memory configuration
Set reasonable JVM heap sizes for the JobManager and TaskManagers according to your cluster size and how busy your workload is, via the jobmanager.heap.mb and taskmanager.heap.mb properties.
The unit is MB. Of course, some nodes in a cluster may have more or less memory than others, so a uniform taskmanager.heap.mb setting can waste physical memory on the larger nodes. In that case it is recommended to adjust the heap per node through the FLINK_TM_HEAP environment variable, which overrides the taskmanager.heap.mb value in the configuration file.
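A minimal sketch of the memory settings, with purely illustrative sizes:

```yaml
# conf/flink-conf.yaml — sizes in MB; tune these to your hardware and workload.
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
```

On a worker with more physical memory, exporting e.g. `FLINK_TM_HEAP=4096` (in MB) before starting the TaskManager overrides the taskmanager.heap.mb value for that node.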
Specify the worker nodes
You must also explicitly specify which nodes of the cluster are workers. This configuration is very similar to HDFS: edit the conf/slaves file and write the IP or hostname of every worker node into it, one per line. The start script can then ssh to all of those machines and bring up the cluster.
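For the three-node example used later in this article, a conf/slaves file listing the two workers (example addresses) is just one host or IP per line:

```text
10.0.0.2
10.0.0.3
```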
Of course, if you prefer not to configure conf/slaves and instead start each TaskManager by hand, one at a time, that works too.
A single-node setup needs no configuration and can be started directly after extraction.
Configuration example
The official site shows a three-node cluster (10.0.0.1 to 10.0.0.3).
Again, Flink's home path must exist and be identical on every node. The simplest approach is to share it over NFS, but it can also be distributed to each node with scp.
Important Configuration Analysis
jobmanager.heap.mb: heap memory for the JobManager.
taskmanager.heap.mb: heap memory for each TaskManager.
taskmanager.numberOfTaskSlots: number of slots per machine; the official recommendation is the number of CPUs. In general, set the slot count to the number of CPU cores or an integer multiple of it.
parallelism.default: default parallelism when a job starts.
taskmanager.tmp.dirs: temporary paths used by the TaskManager at runtime; multiple paths can be configured, and SSDs are preferable.
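Putting these options together, a hedged sketch with illustrative values (the slot counts and paths are examples, not recommendations):

```yaml
# conf/flink-conf.yaml — illustrative values only; tune to your cluster.
taskmanager.numberOfTaskSlots: 4    # e.g. one slot per CPU core
parallelism.default: 4
# Multiple temp paths, separated by the system path separator (':' on Linux):
taskmanager.tmp.dirs: /data1/flink-tmp:/data2/flink-tmp
```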
Start flink cluster
The bin/start-cluster.sh script starts the entire Flink cluster. When executed, it first starts a JobManager locally, then connects over ssh to every worker node listed in the slaves file and starts a TaskManager on each. Each TaskManager connects to the JobManager through the RPC address configured earlier, and the whole cluster comes up.
The script used to stop the cluster is bin/stop-cluster.sh.
The steps above start a fresh cluster from scratch, but often we need to add machines to an existing cluster. What then?
Flink provides two scripts for this:
Add a JobManager
bin/jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all
Add a TaskManager
bin/taskmanager.sh start|start-foreground|stop|stop-all
Remember: run the command on the machine that you want to add to the cluster.
Thank you for reading this article carefully. I hope "how to install and deploy flink" has been helpful to everyone.