2025-01-30 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through the top five common questions about Apache NiFi, with an answer to each.
What's the difference between MiNiFi and NiFi?
MiNiFi is an agent used to collect a subset of data from sensors and devices at remote locations. The goal is to help with the "first mile collection" of data and to obtain data as close to its source as possible.
These devices can be servers, workstations, and laptops, as well as sensors, self-driving cars, factory machines, and so on: anywhere you want to use a subset of NiFi's features, through MiNiFi, to collect specific data. Data can be filtered, selected, and classified before it is sent to its destination. The goal is to manage this entire process at scale with Edge Flow Manager, so that operations or IT teams can deploy different flow definitions and collect whatever data the business needs. Here are some details to consider:
NiFi is designed to sit in a central location, usually a data center or the cloud, and to move or collect data from known external systems such as databases and object stores. Think of NiFi as a gateway for moving data back and forth between heterogeneous environments or across a hybrid cloud architecture.
MiNiFi runs locally on the host, performs some computation and logic, and sends only the data you care about to external systems for distribution. That system can of course be NiFi, but it can also be an MQTT broker, a cloud provider service, and so on. MiNiFi also supports use cases where network bandwidth is limited and the volume of data sent over the network needs to be reduced.
There are two versions of the MiNiFi agent: C++ and Java. The MiNiFi C++ agent has a very small footprint (a few MB of memory, very little CPU), but offers fewer processors. The MiNiFi Java agent is a lightweight, single-node, headless version of NiFi: it has no user interface and no clustering capabilities, and it still requires Java to be available on the host.
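To make the idea of sending only the data you care about more concrete, here is a minimal Python sketch of edge-side filtering. It is not MiNiFi code or configuration; the reading format, the temperature_c field, and the threshold are assumptions invented for illustration.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical sensor reading, e.g. {"sensor": "s1", "temperature_c": 85.2}
Reading = Dict[str, object]


def filter_at_edge(readings: Iterable[Reading], threshold_c: float) -> Iterator[Reading]:
    """Yield only readings above the threshold, so less data crosses the network."""
    for reading in readings:
        if float(reading["temperature_c"]) > threshold_c:
            yield reading


# Illustrative data only: three raw readings, of which one is worth forwarding.
raw_readings = [
    {"sensor": "s1", "temperature_c": 21.0},
    {"sensor": "s2", "temperature_c": 85.2},
    {"sensor": "s3", "temperature_c": 79.9},
]
forwarded = list(filter_at_edge(raw_readings, threshold_c=80.0))
print(f"forwarding {len(forwarded)} of {len(raw_readings)} readings")
```

In a real deployment this selection would be expressed as a flow running on the agent rather than as a script; the sketch only shows why filtering close to the source reduces what has to travel over a constrained link.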
If you can use Kafka as the entry point for a cluster, why use NiFi?
This is a good question, and many people who attended my Live NiFi Demo Jam have asked it. The following points help decide when to use NiFi and when to use Kafka.
Kafka is designed for stream-oriented use cases built around small messages; ingesting large files into it is not a good idea. NiFi is agnostic to data size, so file size does not matter to it.
Kafka is like a mailbox: it stores data in Kafka topics, waiting for applications to publish to and/or consume from them. NiFi acts like a postman, delivering data to the mailbox or to other destinations.
NiFi supports a wide range of protocols (MQTT, the Kafka protocol, HTTP, Syslog, JDBC, TCP/UDP, and others) for data ingestion. It is a consistent, one-stop tool for managing all of your data ingestion. You may well want to send data to Kafka so that multiple downstream applications can consume it. Even so, NiFi should be the gateway for that data, because it supports many protocols and handles all of these requirements in the same simple drag-and-drop interface, which makes the return on investment high.
Use NiFi to securely move data to multiple locations, especially when using a multi-cloud strategy.
Kafka Connect can address some of these needs, but it is not a general-purpose solution when you need complex filtering, routing, scaling, and transformation while moving data.
NiFi is also built on an extensible framework, which gives users an easy way to extend its functionality and quickly build highly customized data movement flows.
What is the best way to expose a REST API for real-time data collection at scale?
Our customers use NiFi to expose REST APIs so that external sources can send data to their destination. The most common protocol is HTTP.
If your goal is simply to receive data, you can use the ListenHTTP processor in NiFi. It listens for HTTP requests on a given port, and clients can then send any data to it.
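From the client side, sending data to a ListenHTTP processor is an ordinary HTTP POST. The sketch below uses only the Python standard library. The host, the port 8081, and the JSON payload are assumptions for illustration; the path contentListener is the processor's default base path, and your flow may be configured with a different one.

```python
import json
import urllib.request

# Assumed endpoint: adjust host, port, and base path to match your ListenHTTP configuration.
LISTEN_HTTP_URL = "http://localhost:8081/contentListener"


def build_post(url: str, payload: bytes,
               content_type: str = "application/json") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the payload as the body."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical event; any bytes can be posted.
event = json.dumps({"sensor": "s2", "temperature_c": 85.2}).encode("utf-8")
request = build_post(LISTEN_HTTP_URL, event)
```

Calling send(request) against a running flow delivers the body to NiFi; it is left out of the snippet so that the example does not require a live NiFi instance.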
If you want to use NiFi to provide web services, look at the HandleHttpRequest and HandleHttpResponse processors. Used together, they let you receive requests from external clients over HTTP, process the data in the request, and send a custom response back to the client. For example, you can use NiFi to give HTTP access to an external system such as an FTP server: the client makes an HTTP request, and when NiFi receives it, NiFi queries the FTP server for the file and returns the file to the client.
All of these request patterns scale well with NiFi. In this use case, you scale NiFi horizontally as needed and place a load balancer in front of the NiFi instances to spread requests across the nodes in the cluster.
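As a rough illustration of what a load balancer in front of the cluster does, the following Python sketch distributes requests across nodes in round-robin order. In practice you would use a dedicated load balancer rather than application code; the node URLs here are hypothetical.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Hand out node addresses in a fixed rotating order."""

    def __init__(self, nodes: List[str]) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes: Iterator[str] = cycle(nodes)

    def pick(self) -> str:
        """Return the node that should receive the next request."""
        return next(self._nodes)


# Hypothetical NiFi nodes, each running the same listening flow.
balancer = RoundRobinBalancer([
    "http://nifi-1:8081",
    "http://nifi-2:8081",
    "http://nifi-3:8081",
])
targets = [balancer.pick() for _ in range(4)]
print(targets)
```

Round robin is only one possible policy; real load balancers typically add health checks so that requests are not routed to a node that is down.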
Can NiFi data streams be blocked or shared based on the user's access rights and security policies?
NiFi provides a very fine-grained multi-tenancy and policy model, so it is easy to set the right policies for running NiFi in a multi-tenant environment. You can define multiple process groups in NiFi with different policy sets, for example one process group dedicated to team A for use case 1 and another dedicated to team B for use case 2. Consider:
NiFi ensures that teams cannot access each other's process groups. This is easy to set up using Apache Ranger or NiFi's internal policies. Multiple teams can then work on a large number of use cases in the same NiFi environment.
In a NiFi cluster, all resources are shared by all existing flows and there is no resource isolation. For example, NiFi cannot allocate 60% of the resources to use case #1 and 40% to use case #2. For critical use cases, most customers run dedicated NiFi clusters to ensure that SLAs are met. NiFi provides monitoring to verify that resources are used properly within the cluster and to raise alerts when the cluster is undersized.
In 2021, Cloudera will release a new solution that lets customers run NiFi flows in dedicated, right-sized NiFi clusters on Kubernetes (K8s) that scale up and down automatically. This option ensures that each use case gets the resources it needs at a given time without affecting other use cases.
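The per-process-group policy idea described above can be pictured with a small Python model. This is a conceptual sketch only: the class, the action names, and the team names are hypothetical and do not reflect NiFi's or Apache Ranger's actual APIs or policy storage.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ProcessGroup:
    """Hypothetical model: each process group carries its own policy set."""
    name: str
    # Maps an action (e.g. "view", "modify") to the teams allowed to perform it.
    policies: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, team: str, action: str) -> None:
        self.policies.setdefault(action, set()).add(team)

    def is_allowed(self, team: str, action: str) -> bool:
        # Anything not explicitly granted is denied.
        return team in self.policies.get(action, set())


use_case_1 = ProcessGroup("use-case-1")
use_case_1.allow("team-a", "view")
use_case_1.allow("team-a", "modify")

use_case_2 = ProcessGroup("use-case-2")
use_case_2.allow("team-b", "view")

print(use_case_1.is_allowed("team-a", "modify"))
print(use_case_1.is_allowed("team-b", "view"))
```

The deny-by-default check mirrors the behavior the text describes: a team sees and changes only the process groups it has been granted, while all groups still share the same cluster resources.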
Is NiFi a good substitute for ETL and batch processing?
For some use cases, NiFi can certainly replace ETL, and it can also be used for batch processing. However, you should consider the type of processing or transformation the use case requires. In NiFi, FlowFiles are how events, objects, and data moving through a flow are represented. Although you can apply any transformation to each individual FlowFile, you probably do not want to use NiFi to join FlowFiles together on a common column or to perform certain kinds of windowed aggregation. In those cases, Cloudera recommends using other solutions.
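To show what a windowed aggregation involves, and why it differs from transforming one record at a time, here is a minimal Python sketch of a tumbling-window count. The event shape (timestamp in seconds, key) and the 60-second window are assumptions chosen for illustration; this is not how NiFi or Flink implement windows.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, key), a hypothetical event shape


def tumbling_window_counts(events: Iterable[Event],
                           window_seconds: int) -> Dict[Tuple[int, str], int]:
    """Count events per key within fixed, non-overlapping time windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key in events:
        # Every event is assigned to the window whose start is the
        # timestamp rounded down to a multiple of the window size.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(0, "a"), (30, "a"), (59, "b"), (61, "a")]
result = tumbling_window_counts(events, window_seconds=60)
print(result)
```

The result depends on several events at once, so state has to be kept across records. That cross-record state is what makes this kind of work a better fit for a stream processing engine than for per-FlowFile transformations.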
So what are your suggestions?
For streaming use cases, the best option is to use NiFi's record-based processors to send records to one or more Kafka topics. Then, following our acquisition of Eventador, you can have Flink use Continuous SQL to do all the processing you want on that data (joining streams or performing window operations).
For batch use cases, think of NiFi as ELT rather than ETL (E = extract, T = transform, L = load). NiFi ingests the various datasets, performs the required transformations on each one (schema validation, format conversion, data cleansing, and so on), and then sends the datasets to a data warehouse backed by Hive. Once the data is there, NiFi can trigger a Hive query to perform the join operation.
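The per-dataset steps mentioned for the batch case (schema validation, format conversion, data cleansing) can be sketched in a few lines of Python. In NiFi these would be processors in a flow, not a script; the required fields and the CSV layout below are hypothetical.

```python
import csv
import io
import json
from typing import List

# Hypothetical schema: every record must carry these non-empty fields.
REQUIRED_FIELDS = ("id", "name", "amount")


def csv_to_json_lines(csv_text: str) -> List[str]:
    """Clean, validate, and convert CSV rows to JSON strings, dropping invalid rows."""
    output: List[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Data cleansing: trim whitespace and ignore unnamed extra columns.
        cleaned = {k: (v or "").strip() for k, v in row.items() if k is not None}
        # Schema validation: all required fields must be present and non-empty.
        if any(not cleaned.get(name) for name in REQUIRED_FIELDS):
            continue
        # Type check: the amount must parse as a number.
        try:
            cleaned["amount"] = float(cleaned["amount"])
        except ValueError:
            continue
        # Format conversion: emit one JSON document per valid record.
        output.append(json.dumps(cleaned, sort_keys=True))
    return output


sample = "id,name,amount\n1, Alice ,10.5\n2,,3\n3,Bob,abc\n"
converted = csv_to_json_lines(sample)
print(converted)
```

Each record is handled independently here, which is the kind of work the text assigns to NiFi; the join across datasets is deliberately left to the warehouse.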
I hope these answers help you decide how to use NiFi and what it can bring to your data journey and business needs. We will host more live demonstrations with question and answer sessions to cover specific topics, such as monitoring NiFi flows and using NiFi to automate flow deployment. In fact, there are many more questions about NiFi to address in those sessions!