How to use NiFi, Kafka, and HBase to build scalable processes on CDP

2025-02-28 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article introduces how to use NiFi, Kafka, and HBase to build scalable processes on CDP. Many people run into this kind of problem in real-world projects, so let's walk through a practical case step by step. I hope you read it carefully and come away with something useful!

Navistar is the world's leading manufacturer of commercial trucks. With a fleet of 350,000 vehicles, unplanned maintenance and vehicle failures can cause continual business disruption. Navistar needed a diagnostic platform that would help it predict when vehicles require maintenance in order to minimize downtime. The platform had to collect, analyze, and serve more than 70 telematics and sensor data feeds from each vehicle in the fleet, including data measuring engine performance, coolant temperature, truck speed, and brake wear. Navistar turned to Cloudera to help build an IoT remote diagnostics platform called OnCommand® Connection to monitor the health of its vehicles and increase their uptime.

This article demonstrates how similar technologies can be used to solve a smaller problem of the same shape as the one Navistar faced. The data is extracted from a highly modified, high-performance Corvette (see Figure 1), and the article walks through the steps to load data from an external source, format it with Apache NiFi, push it to a stream with Apache Kafka, and store and analyze it with Apache HBase.
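The formatting step described above can be sketched in plain Python. This is a hypothetical illustration, not the article's actual NiFi flow: the field names and the raw CSV layout are assumptions, and in practice NiFi record processors would do this transformation before publishing to Kafka.

```python
import json
from datetime import datetime, timezone

# Hypothetical raw record from the car's diagnostic port:
# "<unix_ts>,<rpm>,<coolant_temp_c>,<speed_kmh>,<throttle_pct>"
FIELDS = ["rpm", "coolant_temp_c", "speed_kmh", "throttle_pct"]

def format_record(raw: str, vehicle_id: str = "corvette-2008") -> str:
    """Turn one raw CSV sensor line into the JSON document that a
    NiFi flow might publish to a Kafka topic (assumed schema)."""
    ts, *values = raw.strip().split(",")
    doc = {
        "vehicle_id": vehicle_id,
        "event_time": datetime.fromtimestamp(int(ts), tz=timezone.utc).isoformat(),
        "sensors": {name: float(v) for name, v in zip(FIELDS, values)},
    }
    return json.dumps(doc, sort_keys=True)

print(format_record("1220227200,3500,92.5,112,41.0"))
```

Framing each reading as a small self-describing JSON document keeps the Kafka topic usable by any future subscriber without out-of-band schema agreements.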

Figure 1. The 2008 Corvette and its modified 6.8L engine

For this particular example, the Corvette in question has had all of its original engine components replaced with higher-performance parts. The engine was pulled, the block was bored, the crankshaft and camshaft were replaced, and new pistons and connecting rods were installed, targeting roughly 600 horsepower (see Figure 2). To make the new engine configuration run properly, the engine software was overhauled as well. An unintended result of the much more aggressive throttle response is that the car's original diagnostics and error-reporting system is no longer accurate and had to be disabled.

Figure 2. The mid-rebuild engine with all of its shiny new internal parts

To capture and analyze the Corvette's sensor data, a path is needed for the data to flow from the car to an alternative analytics and diagnostics platform. The first step is to connect a laptop to the Corvette's diagnostic port (see Figure 3) and upload the sensor data to a cloud-based storage location; Amazon S3 is used for this project.

Figure 3. The laptop connected to the diagnostic port via USB

The next step is to use Cloudera Data Platform (CDP) to access the services needed to move the data to its final storage destination for further analysis. Using CDP Public Cloud, three Data Hubs are created, each hosting a set of prepackaged open-source services (see Figure 4).

The first hosts NiFi, which is designed to automate and manage data flows. NiFi is used to import, format, and move the Corvette's data from the source to its final storage point.

The next hosts Kafka, a real-time streaming service that delivers large volumes of data as streams. Kafka streams the data while allowing other consumers to subscribe to the stream. In this example there are no subscribers, but subscription is an important concept and worth demonstrating how to set up.

The last hosts HBase, a scalable, column-oriented operational database that provides real-time read/write access. Once the data is loaded into HBase, Phoenix is used to query and retrieve it.

Figure 4. Data flow of the Corvette's data from source to query

Building a diagnostic platform on CDP to monitor the health and performance of the Corvette was a successful exercise. With NiFi and Kafka formatting the sensor data and streaming it into HBase, advanced data engineering and processing can now be performed no matter how large the dataset grows.

That is the end of "How to use NiFi, Kafka, and HBase to build scalable processes on CDP". Thank you for reading.
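As a closing sketch of the HBase step described above: the article does not show its table schema, so the row-key scheme below is hypothetical. A common pattern for per-vehicle time-series data is to prefix the vehicle id (keeping one vehicle's readings contiguous for prefix scans) and store a reversed timestamp so the newest readings sort first in HBase's lexicographic key order.

```python
MAX_TS = 10_000_000_000  # epoch seconds comfortably past the year 2100

def row_key(vehicle_id: str, epoch_seconds: int) -> bytes:
    # Reversed, zero-padded timestamp: newer events produce
    # lexicographically smaller keys, so a scan from the vehicle-id
    # prefix returns the most recent readings first.
    return f"{vehicle_id}#{MAX_TS - epoch_seconds:010d}".encode()

older = row_key("corvette-2008", 1220227200)
newer = row_key("corvette-2008", 1220227260)
assert newer < older  # the latest reading scans first
```

With a key like this, Phoenix (or a raw HBase scan) can fetch "the last N readings for one vehicle" as a cheap prefix scan instead of a full-table filter.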
If you want to learn more about the industry, you can follow this site; the editor will keep publishing high-quality, practical articles for you!



