How to deploy an Apache NiFi PoC environment

Updated: 2025-01-18. From: SLTechnology News&Howtos (Shulou), Internet Technology


This article explains in detail how to deploy an Apache NiFi PoC environment. I found the material practical, so I am sharing it here as a reference, and I hope you get something out of it.

1. Introduction to NiFi

Apache NiFi is an easy-to-use, powerful, and reliable system for processing and distributing data.

Main function: data flow management, which covers designing data flows, executing them, and monitoring their execution.

A data flow is a directed graph that contains: data source nodes, data transformation and coordination nodes, and data output nodes.

In NiFi, a node in the data flow graph is called a Processor and an edge is called a Connection; edges are directed. The data moving through the graph is called a FlowFile. A FlowFile is created by a source-type Processor and travels along Connections. Transformation-type Processors convert it (split, merge, transform into a new FlowFile, copy, or discard), coordination-type Processors regulate and route the flow, and a sink-type Processor finally sends it to an external system.
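To make the vocabulary concrete, here is a minimal Python sketch of a data flow as a directed graph. It is only an illustration of the concepts above, not NiFi's API: the `FlowFile` and `Processor` classes, the `connect` and `run` helpers, and the four example processors are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FlowFile:
    """A unit of data moving through the flow: content plus attributes."""
    content: str
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Processor:
    """A node in the graph; `handle` maps one FlowFile to zero or more FlowFiles."""
    name: str
    handle: Callable[[FlowFile], List[FlowFile]]
    downstream: List["Processor"] = field(default_factory=list)  # outgoing connections

    def connect(self, other: "Processor") -> "Processor":
        """Add a directed connection from this processor to `other`."""
        self.downstream.append(other)
        return other


def run(start: Processor, seed: FlowFile) -> None:
    """Push a FlowFile through the graph, following connections depth-first."""
    for out in start.handle(seed):
        for nxt in start.downstream:
            run(nxt, out)


delivered: List[str] = []  # stands in for the external system behind the sink

source = Processor("source", lambda ff: [ff])
split = Processor("split", lambda ff: [FlowFile(w, dict(ff.attributes)) for w in ff.content.split()])
upper = Processor("upper", lambda ff: [FlowFile(ff.content.upper(), ff.attributes)])
sink = Processor("sink", lambda ff: delivered.append(ff.content) or [])

source.connect(split).connect(upper).connect(sink)
run(source, FlowFile("hello nifi"))
print(delivered)  # ['HELLO', 'NIFI']
```

The source creates the FlowFile, the split and upper processors play the transformation role, and the sink hands the result to the outside world.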

NiFi can play a very important role in a complex multi-system enterprise environment.

2. Environment requirements

Apache NiFi is not picky about the operating system: it runs anywhere a JDK can be installed and the java command can be executed. Both Oracle JDK and OpenJDK work, and the required version is 8 or 11. After installing the JDK, run javac --version to confirm that it is installed.

    laofeng@192 ~ % javac --version
    javac 11.0.9
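The version check can also be automated. The following Python sketch parses the text printed by javac and tests it against the 8-or-11 requirement stated above. The function names are hypothetical, and the handling of the "1.8" style string assumes the usual JDK 8 convention of reporting itself as 1.8.x.

```python
import re
from typing import Optional

SUPPORTED_MAJOR_VERSIONS = {8, 11}  # the requirement stated in this section


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major Java version from javac version output, or None."""
    match = re.search(r"javac\s+(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # JDK 8 and earlier report themselves as 1.x, for example "javac 1.8.0_275".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def is_supported(version_output: str) -> bool:
    """True when the reported version is one of the supported major versions."""
    return java_major_version(version_output) in SUPPORTED_MAJOR_VERSIONS


print(is_supported("javac 11.0.9"))     # True
print(is_supported("javac 1.8.0_275"))  # True
print(is_supported("javac 17.0.2"))     # False
```

In practice the input string would come from running javac in a subprocess; it is passed in directly here so the sketch runs without a JDK installed.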

If you deploy an Apache NiFi pseudo-cluster, you need to install Docker Desktop in advance.

3. Single node

Installing Apache NiFi is straightforward: download a binary package, unpack it, and run it. Binary packages come in two formats, tar.gz and zip. Mac and Linux users are advised to download the tar.gz package, and Windows users the zip package. Apache NiFi download address: http://nifi.apache.org/download.html. If the download speed does not reach a few MB per second, switch to a faster mirror; after all, the installation package is 1.5 GB.

The decompressed directory structure is as follows:

Start NiFi

Start NiFi with bin/nifi.sh on Linux and Mac, and with bin/nifi.bat on Windows.

    # Try it first. The output is the usage line: start, stop, run, restart, status, dump, diagnostics, install (as a system service), stateless (what does that mean?)
    laofeng@192 nifi-1.12.1 % bin/nifi.sh
    Usage nifi {start|stop|run|restart|status|dump|diagnostics|install|stateless}

    # Run the start command. The output shows "Java home", "NiFi home", and the bootstrap configuration file "conf/bootstrap.conf"
    laofeng@192 nifi-1.12.1 % bin/nifi.sh start
    Java home: /Library/Java/JavaVirtualMachines/jdk-11.0.9.jdk/Contents/Home
    NiFi home: /Users/laofeng/Downloads/apps/nifi-1.12.1
    Bootstrap Config File: /Users/laofeng/Downloads/apps/nifi-1.12.1/conf/bootstrap.conf
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.apache.nifi.bootstrap.util.OSUtils (file:/Users/laofeng/Downloads/apps/nifi-1.12.1/lib/bootstrap/nifi-bootstrap-1.12.1.jar) to method java.lang.ProcessImpl.pid()
    WARNING: Please consider reporting this to the maintainers of org.apache.nifi.bootstrap.util.OSUtils
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release

    # Check the status. The output includes the listening port and the process id: "listening to Bootstrap on port 65173, PID=16224"
    laofeng@192 nifi-1.12.1 % bin/nifi.sh status
    Java home: /Library/Java/JavaVirtualMachines/jdk-11.0.9.jdk/Contents/Home
    NiFi home: /Users/laofeng/Downloads/apps/nifi-1.12.1
    Bootstrap Config File: /Users/laofeng/Downloads/apps/nifi-1.12.1/conf/bootstrap.conf
    2020-11-15 20:40:05,575 INFO [main] org.apache.nifi.bootstrap.Command Apache NiFi is currently running, listening to Bootstrap on port 65173, PID=16224

    # Use the jps command. Two related processes are found: "NiFi" and "RunNiFi"
    laofeng@192 nifi-1.12.1 % jps
    16224 NiFi
    16222 RunNiFi

Open http://127.0.0.1:8080/nifi in a browser. If the NiFi interface appears, the startup was successful.

4. Pseudo-cluster

Cluster architecture

NiFi clustering follows a zero-master model: all cluster nodes are deployed with the same configuration, and there is no distinction between master and slave nodes. Each node has the same data flow definition and performs the same tasks, but on different data. NiFi uses ZooKeeper as its coordination service. When the cluster starts, one node is elected Cluster Coordinator, and the other nodes send it heartbeats and status reports. A new node that wants to join the cluster must first connect to the Cluster Coordinator to obtain the latest data flow. If the coordinator decides that the node is allowed to join, it provides the current flow to the node and the node can join, provided that the node's own copy of the flow matches the copy provided by the coordinator. If the new node's flow configuration differs from the coordinator's, the node is refused.
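The join rule can be pictured as comparing a fingerprint of the node's flow with a fingerprint of the coordinator's flow. The Python sketch below is a simplified illustration under that assumption, not NiFi's actual comparison logic: the dictionary-shaped flows, `flow_fingerprint`, and `may_join` are all hypothetical.

```python
import hashlib
import json


def flow_fingerprint(flow: dict) -> str:
    """Hash a canonical (key-sorted) JSON form of a flow definition."""
    canonical = json.dumps(flow, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def may_join(coordinator_flow: dict, node_flow: dict) -> bool:
    """A node may join only if its flow matches the coordinator's copy."""
    return flow_fingerprint(coordinator_flow) == flow_fingerprint(node_flow)


# Hypothetical flow definitions, used only for illustration.
cluster_flow = {"processors": [{"name": "GetSFTP", "execution": "PRIMARY"}]}
same_flow = {"processors": [{"execution": "PRIMARY", "name": "GetSFTP"}]}  # only key order differs
other_flow = {"processors": [{"name": "GetFile", "execution": "ALL"}]}

print(may_join(cluster_flow, same_flow))   # True
print(may_join(cluster_flow, other_flow))  # False
```

Sorting the keys before hashing means that two flows with the same content but a different serialization order still compare as equal.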

Terminology

Cluster Coordinator: the node in a NiFi cluster that is responsible for deciding which nodes are allowed in the cluster and for providing the most up-to-date flow to newly joining nodes. When a DataFlow Manager (DFM) manages a data flow in a cluster, they can do so through the user interface of any node in the cluster. Any change made is then replicated to all nodes in the cluster.

Nodes: each cluster consists of one or more nodes, which perform the actual data processing.

Primary Node: every cluster has one Primary Node. Isolated Processors can be run on this node. ZooKeeper is used to elect the Primary Node. If that node disconnects from the cluster for any reason, a new Primary Node is elected automatically. Users can see which node is currently the Primary Node on the cluster management page of the user interface.

Isolated Processors: in a NiFi cluster, the same data flow runs on all nodes, so every component in the flow runs on every node. In some cases, however, a DFM may not want a processor to run on all nodes. The most common case is a processor that communicates with an external service over a protocol that does not scale well. For example, the GetSFTP processor pulls from a remote directory. If GetSFTP ran on every node in the cluster and all of them tried to pull from the same remote directory at the same time, there could be race conditions. The DFM can therefore configure GetSFTP to run in isolation, meaning that it runs only on the Primary Node. With the right data flow configuration, it can pull in the data and load-balance it across the rest of the nodes in the cluster. Note that although this feature exists, it is also common to simply use a standalone NiFi instance to pull data and feed it to the cluster. The choice depends on the resources available and on how the administrator decides to configure the cluster.

Heartbeats: nodes communicate their health and status to the current Cluster Coordinator through heartbeats, which let the coordinator know that they are still connected to the cluster and working properly. By default, a node emits a heartbeat every 5 seconds, and if the Cluster Coordinator does not receive a heartbeat from a node within 40 seconds (= 5 seconds * 8), it disconnects the node due to "lack of heartbeat". Both values can be configured in the nifi.properties file. The coordinator disconnects the node because it needs to ensure that every node in the cluster is in sync, and if it does not hear from a node regularly, it cannot be sure that the node is still in sync with the rest of the cluster. If the node does send a new heartbeat after the 40 seconds, the coordinator automatically asks the node to rejoin the cluster, which includes re-validating the node's flow. Both the disconnection due to lack of heartbeat and the reconnection once a heartbeat is received are reported to the DFM in the user interface.
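The heartbeat rule in the last definition can be sketched in a few lines of Python. This is a toy model of the behaviour described above (5-second interval, 8 missed heartbeats, rejoin on the next heartbeat), not NiFi code; the `HeartbeatMonitor` class and its method names are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from typing import Dict, List

HEARTBEAT_INTERVAL_S = 5    # default interval described above
MAX_MISSED_HEARTBEATS = 8   # default multiplier described above
DISCONNECT_AFTER_S = HEARTBEAT_INTERVAL_S * MAX_MISSED_HEARTBEATS  # 40 seconds


class HeartbeatMonitor:
    """Toy model of the coordinator's view of node heartbeats."""

    def __init__(self) -> None:
        self.last_seen: Dict[str, float] = {}
        self.connected: Dict[str, bool] = {}

    def heartbeat(self, node: str, now: float) -> str:
        """Record a heartbeat; return 'rejoin' if the node had been disconnected."""
        was_connected = self.connected.get(node)
        self.last_seen[node] = now
        self.connected[node] = True
        # A previously disconnected node must rejoin (and have its flow re-validated).
        return "rejoin" if was_connected is False else "ok"

    def sweep(self, now: float) -> List[str]:
        """Disconnect nodes that have been silent for 40 seconds or more."""
        dropped = []
        for node, seen in self.last_seen.items():
            if self.connected[node] and now - seen >= DISCONNECT_AFTER_S:
                self.connected[node] = False
                dropped.append(node)
        return dropped


monitor = HeartbeatMonitor()
monitor.heartbeat("node-1", now=0)
monitor.heartbeat("node-2", now=0)
monitor.heartbeat("node-1", now=35)       # node-1 keeps reporting, node-2 goes silent
print(monitor.sweep(now=41))              # ['node-2']
print(monitor.heartbeat("node-2", now=50))  # rejoin
```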

Docker-based cluster

The pseudo-cluster here is built by using docker-compose to start multiple NiFi containers, which together form a NiFi cluster running in Docker. The installation of Docker Desktop is not covered here.

docker-compose file

    version: "3"
    services:
      zookeeper:
        hostname: zookeeper
        container_name: zookeeper
        image: 'bitnami/zookeeper:latest'
        environment:
          - ALLOW_ANONYMOUS_LOGIN=yes
      nifi:
        image: "apache/nifi:1.12.1"
        ports:
          - 8080 # Unsecured HTTP Web Port
        environment:
          - NIFI_WEB_HTTP_PORT=8080
          - NIFI_CLUSTER_IS_NODE=true
          - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
          - NIFI_ZK_CONNECT_STRING=zookeeper:2181
          - NIFI_ELECTION_MAX_WAIT=1 min
          - NIFI_HOME=/opt/nifi/nifi-current
          - NIFI_LOG_DIR=/opt/nifi/nifi-current/logs
          - NIFI_TOOLKIT_HOME=/opt/nifi/nifi-toolkit-current
          - NIFI_PID_DIR=/opt/nifi/nifi-current/run
          - NIFI_BASE_DIR=/opt/nifi

Save the content above as "docker-compose.yml". As the file shows, two images are used: bitnami/zookeeper:latest and apache/nifi:1.12.1.

The "apache/nifi:1.12.1" image is built on OpenJDK 8.

Create and start a cluster

Note that the command must be run from the directory where "docker-compose.yml" is saved.

    # Start a three-node NiFi cluster. The images are downloaded on the first run, which takes some time.
    laofeng@192 nifi-1.12.1 % docker-compose up --scale nifi=3 -d
    # The image download starts
    Pulling zookeeper (bitnami/zookeeper:latest)...
    latest: Pulling from bitnami/zookeeper
    58212c1109c5: Pull complete
    081a2ae8dc51: Pull complete
    f5ff4112905d: Pull complete
    35864a4b7faf: Pull complete
    cdcc88215c01: Pull complete
    94a860965551: Pull complete
    7b37ce5d991a: Pull complete
    9b0fd0c439c8: Pull complete
    79ae9cc9ceef: Pull complete
    f587456f2eac: Pull complete
    215bcd582847: Pull complete
    c3bbf763f965: Pull complete
    96583be231d1: Pull complete
    Digest: sha256:0f278b73b82ec8910168f09343b8dc5405152482d2fac1f26473ffc12564fafa
    Status: Downloaded newer image for bitnami/zookeeper:latest
    Pulling nifi (apache/nifi:1.12.1)...
    1.12.1: Pulling from apache/nifi
    d6ff36c9ec48: Pulling fs layer
    d6ff36c9ec48: Pull complete
    c958d65b3090: Pull complete
    edaf0a6b092f: Pull complete
    ffba832277c8: Pull complete
    9687742a10f9: Pull complete
    438df03a4d78: Pull complete
    b428ea9845bb: Pull complete
    e97cefb1594a: Pull complete
    b988f1230121: Pull complete
    066b86f87d5a: Pull complete
    11325722f405: Pull complete
    Digest: sha256:bf7576ab7ad0bfe38c86be5baa47229d1644287984034dc9d5ff4801c5827115
    Status: Downloaded newer image for apache/nifi:1.12.1
    # The containers are launched
    Creating nifi-1121_nifi_1 ... done
    Creating nifi-1121_nifi_2 ... done
    Creating nifi-1121_nifi_3 ... done
    Creating zookeeper ... done
    # The cluster has been started at this point.

Use the docker ps command to check the state of the containers

    laofeng@192 nifi-1.12.1 % docker ps
    CONTAINER ID   IMAGE                      COMMAND                  CREATED         STATUS         PORTS                                                    NAMES
    6b0974257ea3   apache/nifi:1.12.1         "../scripts/start.sh"    7 minutes ago   Up 7 minutes   8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32770->8080/tcp   nifi-1121_nifi_2
    19a9fbc4ec11   bitnami/zookeeper:latest   "/opt/bitnami/script…"   7 minutes ago   Up 7 minutes   2181/tcp, 2888/tcp, 3888/tcp, 8080/tcp                   zookeeper
    058e826876e0   apache/nifi:1.12.1         "../scripts/start.sh"    7 minutes ago   Up 7 minutes   8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32769->8080/tcp   nifi-1121_nifi_3
    c4c02b6415eb   apache/nifi:1.12.1         "../scripts/start.sh"    7 minutes ago   Up 7 minutes   8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32768->8080/tcp   nifi-1121_nifi_1

From the docker ps output, you can see four active containers: zookeeper, nifi-1121_nifi_1, nifi-1121_nifi_2, and nifi-1121_nifi_3.

Three ports are mapped to the host: 0.0.0.0:32770->8080/tcp (nifi-1121_nifi_2), 0.0.0.0:32769->8080/tcp (nifi-1121_nifi_3), and 0.0.0.0:32768->8080/tcp (nifi-1121_nifi_1).

Every node in the NiFi cluster can serve as an entry point to the web UI. Use a browser to open one of the nodes, for example http://localhost:32770/nifi.

Note: port 8080 of each NiFi container is mapped to a random port on the host, which differs between hosts and between startups. Use docker ps to check the port number actually mapped.
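Because the host port changes on every start, it can be handy to extract it from the PORTS column instead of reading it by eye. The Python sketch below parses a PORTS string in the form shown in the docker ps output above. The function names `host_port` and `nifi_url` are hypothetical; the input is assumed to be obtained separately, for example by copying the column or with docker ps --format "{{.Ports}}".

```python
import re
from typing import Optional


def host_port(ports_column: str, container_port: int = 8080) -> Optional[int]:
    """Return the host port published for `container_port`, or None if unpublished."""
    match = re.search(rf":(\d+)->{container_port}/tcp", ports_column)
    return int(match.group(1)) if match else None


def nifi_url(ports_column: str) -> Optional[str]:
    """Build the web UI address for a NiFi container from its PORTS column."""
    port = host_port(ports_column)
    return None if port is None else f"http://localhost:{port}/nifi"


nifi_ports = "8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32770->8080/tcp"
zookeeper_ports = "2181/tcp, 2888/tcp, 3888/tcp, 8080/tcp"

print(nifi_url(nifi_ports))        # http://localhost:32770/nifi
print(nifi_url(zookeeper_ports))   # None (8080 is exposed but not published)
```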

View cluster status

Click the menu

Pop-up menu

Cluster status

NiFi cluster management commands

nifi cluster-summary: cluster overview

nifi get-node: get the information of a single node

nifi get-nodes: get the node list

nifi connect-node: connect a node to the cluster

nifi disconnect-node: disconnect a node from the cluster

nifi offload-node: offload a node in the cluster

nifi delete-node: delete a node from the cluster

    # Enter the container shell
    laofeng@192 nifi-1.12.1 % docker exec -it c4c02b6415eb /bin/bash
    nifi@c4c02b6415eb:/opt/nifi/nifi-current$ cd /opt/nifi/nifi-toolkit-1.12.1
    nifi@c4c02b6415eb:/opt/nifi/nifi-toolkit-1.12.1$ bin/cli.sh
    [Apache NiFi ASCII-art banner]
    CLI v1.12.1
    Type 'help' to see a list of available commands, use tab to auto-complete.
    Session loaded from /home/nifi/.nifi-cli.config

    # Cluster overview
    #> nifi cluster-summary
    Total node count: 3
    Connected node count: 3
    Clustered: true
    Connected to cluster: true

    # Get the node list
    #> nifi get-nodes
    #   Node ID                                Node Address   API Port   Node Status
    -
    0   8dc6c433-68bc-4839-b49b-a8d7710b7b34   c4c02b6415eb   8080       CONNECTED
    1   a30e4804-7136-4f68-a66b-f5f3b764d7f5   6b0974257ea3   8080       CONNECTED
    2   184fa9f3-0595-4ab7-b07c-ddfd0b011956   058e826876e0   8080       CONNECTED

    # Return the status of one node. There is no additional information compared with the node list command
    #> nifi get-node --nifiNodeId 8dc6c433-68bc-4839-b49b-a8d7710b7b34
    Node ID: 8dc6c433-68bc-4839-b49b-a8d7710b7b34
    Node Address: c4c02b6415eb
    API Port: 8080
    Node Status: CONNECTED

Stop the Docker cluster

The docker-compose stop command stops the containers that make up the NiFi cluster, but the containers are kept in an inactive state, so the cluster can be restored at any time with the docker-compose start command.

The command must be run from the directory that contains the docker-compose.yml file.

Stop it

    laofeng@192 nifi-1.12.1 % docker-compose stop
    Stopping nifi-1121_nifi_2 ... done
    Stopping zookeeper ... done
    Stopping nifi-1121_nifi_3 ... done
    Stopping nifi-1121_nifi_1 ... done

View the container

    # With docker ps, there are no active containers
    laofeng@192 nifi-1.12.1 % docker ps
    CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES

    # List all containers, including stopped ones: the four containers of the NiFi cluster are still there
    laofeng@192 nifi-1.12.1 % docker ps -a
    CONTAINER ID   IMAGE                      COMMAND                  CREATED             STATUS                       PORTS   NAMES
    6b0974257ea3   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Exited (137) 5 minutes ago           nifi-1121_nifi_2
    19a9fbc4ec11   bitnami/zookeeper:latest   "/opt/bitnami/script…"   About an hour ago   Exited 5 minutes ago                 zookeeper
    058e826876e0   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Exited (137) 5 minutes ago           nifi-1121_nifi_3
    c4c02b6415eb   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Exited (137) 5 minutes ago           nifi-1121_nifi_1

Restore the NiFi cluster

    # Run docker-compose start. Because the containers and the virtual network do not need to be created, startup is faster
    laofeng@192 nifi-1.12.1 % docker-compose start
    Starting zookeeper ... done
    Starting nifi ... done
    laofeng@192 nifi-1.12.1 % docker ps
    CONTAINER ID   IMAGE                      COMMAND                  CREATED             STATUS          PORTS                                                    NAMES
    6b0974257ea3   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Up 12 seconds   8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32771->8080/tcp   nifi-1121_nifi_2
    19a9fbc4ec11   bitnami/zookeeper:latest   "/opt/bitnami/script…"   About an hour ago   Up 12 seconds   2181/tcp, 2888/tcp, 3888/tcp, 8080/tcp                   zookeeper
    058e826876e0   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Up 10 seconds   8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32772->8080/tcp   nifi-1121_nifi_3
    c4c02b6415eb   apache/nifi:1.12.1         "../scripts/start.sh"    About an hour ago   Up 9 seconds    8000/tcp, 8443/tcp, 10000/tcp, 0.0.0.0:32773->8080/tcp   nifi-1121_nifi_1

Destroy the cluster

Stop the NiFi cluster, and delete both the containers and the virtual network.

    laofeng@192 nifi-1.12.1 % docker-compose down
    Stopping nifi-1121_nifi_2 ... done
    Stopping zookeeper ... done
    Stopping nifi-1121_nifi_3 ... done
    Stopping nifi-1121_nifi_1 ... done
    Removing nifi-1121_nifi_2 ... done
    Removing zookeeper ... done
    Removing nifi-1121_nifi_3 ... done
    Removing nifi-1121_nifi_1 ... done
    Removing network nifi-1121_default

This concludes the article on "how to deploy an Apache NiFi PoC environment". I hope the content above is helpful and that you learned something from it. If you found the article useful, please share it so that more people can see it.
