Updated 2025-02-26. From: SLTechnology News&Howtos, Shulou (Shulou.com), 05/31 report.
This article explains how to use Kettle to dump interface data. Many people run into questions about this in day-to-day work, so the editor consulted a range of materials and put together a simple, easy-to-follow procedure. We hope it answers your questions. Let's get started.
Use Kettle to dump interface data
1. Project background
1.1. Data interface
API stands for Application Programming Interface. It is a means of exchanging data between pieces of software, and it often plays a middleware-like role by sharing data across platforms. With the growth of big data and the Internet of Things, a large number of data interfaces have been published or discovered for developers to use, and they now touch many details of everyday life. This article describes how to monitor, call, and dump a data interface with Kettle. Data interfaces themselves are not covered in more detail here.
Project background
The goal of this article is to give readers a method for calling a data interface from Kettle and dumping the result. Kettle offers many components and many ways to configure them, so expect to spend some time thinking and experimenting.
2. Kettle configuration
This article uses Kettle version 7.0. The interface called is the Baidu Map API from the Baidu developer platform, and the returned data set is in JSON or XML format. There are two ways to store the data: export it to Excel, or export it to a relational database.
2.1. Configuration overview of the Kettle transformation
Overview
Check interface information
We first obtain the address of the web API and test whether the connection works. Here I use the Place Suggestion API of Baidu Map: you supply a search keyword and the city to search in, and it returns suggested places. The request URL used for the test is:
http://api.map.baidu.com/place/v2/suggestion?query=%E6%98%A5%E7%86%99%E8%B7%AF&region=%E6%88%90%E9%83%BD%E5%B8%82&output=json&ak=n0lHarpY3QZx6xXXIaWMFLxj
We use this URL to test the connectivity of the interface.
This interface requires no separate identity verification; you only need to supply the AK of your application as a parameter. The test succeeds, and the returned value is a JSON string.
Interface information access
Create a new transformation. As the input you can choose Excel input, text file input, Generate Rows, or table input, depending on your situation. Because this is only a test, I use Generate Rows with fixed parameter values. Later, you can replace the fixed values with "${}" variables.
Using the HTTP client to fetch data
Add the HTTP Client component. It acts as the client that calls the API, much as a browser would.
Here the URL is taken from a field produced by the preceding Generate Rows step. Pay attention to the character set setting; otherwise the data returned by the interface will be garbled.
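The placeholder idea can be illustrated outside Kettle. The following Python sketch uses `string.Template`, whose `${NAME}` syntax happens to look similar. It is not Kettle's own substitution mechanism, and the names `QUERY`, `REGION`, `AK`, and `fill_url` are hypothetical.

```python
from string import Template

# URL pattern with ${...} placeholders; the placeholder names are illustrative.
URL_TEMPLATE = Template(
    "http://api.map.baidu.com/place/v2/suggestion"
    "?query=${QUERY}&region=${REGION}&output=json&ak=${AK}"
)


def fill_url(query: str, region: str, ak: str) -> str:
    """Replace each ${...} placeholder with a concrete value."""
    return URL_TEMPLATE.substitute(QUERY=query, REGION=region, AK=ak)


# The values are already percent-encoded; YOUR_AK stands in for a real key.
url = fill_url(
    "%E6%98%A5%E7%86%99%E8%B7%AF",
    "%E6%88%90%E9%83%BD%E5%B8%82",
    "YOUR_AK",
)
print(url)
```

Keeping the pattern in one place and passing the values in separately is what makes it easy to swap the fixed test values for real inputs later.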
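To make the role of the character set concrete, here is a minimal Python sketch of what an HTTP client does: read the raw bytes, then decode them with an explicitly chosen charset. The function name `fetch_text` and its `opener` parameter are assumptions made for illustration, not part of Kettle.

```python
import urllib.request


def fetch_text(url, charset="utf-8", timeout=10, opener=urllib.request.urlopen):
    """Fetch a URL and decode the body with an explicit character set.

    `opener` is injectable so the function can be exercised without a network.
    """
    with opener(url, timeout=timeout) as response:
        raw = response.read()  # bytes exactly as the server sent them
    # Decode once, with the charset we chose; a wrong choice garbles the text.
    return raw.decode(charset)
```

A call such as `body = fetch_text(url)` returns the JSON string that the later steps parse.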
Judge whether the data is obtained or not
A Filter Rows component is added here to check whether data was successfully obtained from the HTTP client.
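The article does not state the exact filter condition. As a sketch, assume each row carries an HTTP status code and a response body, and that a row passes when the status is 200 and the body is not empty. The field names `status_code` and `body` are hypothetical.

```python
def split_rows(rows):
    """Send rows with a usable response one way and all other rows another way."""
    passed, failed = [], []
    for row in rows:
        body = (row.get("body") or "").strip()
        # Assumed condition: HTTP 200 and a non-empty body.
        ok = row.get("status_code") == 200 and bool(body)
        (passed if ok else failed).append(row)
    return passed, failed


rows = [
    {"status_code": 200, "body": '{"status": 0}'},
    {"status_code": 500, "body": ""},
    {"status_code": 200, "body": None},
]
passed, failed = split_rows(rows)
print(len(passed), "passed,", len(failed), "failed")
```

Rows in `passed` continue to JSON parsing, while rows in `failed` go to error handling.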
Parsing JSON string
Add the JSON Input component and set its source to the field in the stream that holds the API response.
Parse the JSON string into multiple fields. To write the field paths, you need to inspect the structure of the JSON string in advance.
Parsing JSON strings with nested loops
In the JSON string used here, the data we actually need is nested inside the RESULT field, so the RESULT field has to be parsed a second time.
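The two-level parse can be sketched in Python as follows. The sample response is made up for illustration, and the field names `status`, `result`, `name`, `city`, and `district` are assumptions about the response layout, so check them against the real output before relying on them.

```python
import json

# Hypothetical sample response; a real response may contain other fields.
SAMPLE = """
{"status": 0, "message": "ok",
 "result": [
   {"name": "春熙路", "city": "成都市", "district": "锦江区"},
   {"name": "春熙路地铁站", "city": "成都市", "district": "锦江区"}
 ]}
"""


def parse_suggestions(body):
    """First pass reads the top-level fields, second pass loops over `result`."""
    doc = json.loads(body)
    status = doc.get("status")
    rows = []
    for item in doc.get("result", []):
        rows.append({
            "status": status,  # top-level field carried along with each row
            "name": item.get("name"),
            "city": item.get("city"),
            "district": item.get("district"),
        })
    return rows


for row in parse_suggestions(SAMPLE):
    print(row)
```

Each element of the nested array becomes one output row, which mirrors parsing the RESULT field a second time.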
Output data
We use Excel output here, and it needs no special configuration. The main point is to output only the address fields we need, because every other field in the stream is passed along as well. Select the output fields accordingly.
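The idea of selecting output fields can be sketched in Python. Writing a real Excel workbook needs a third-party library, so this sketch swaps in CSV, which Excel can open. The field names and the helper `write_selected` are hypothetical.

```python
import csv
import io


def write_selected(rows, fields, stream):
    """Write only the chosen fields; other keys in each row are ignored."""
    writer = csv.DictWriter(stream, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Each row still carries fields inherited from earlier steps.
rows = [
    {"url": "http://example.invalid/", "status": 0, "name": "春熙路", "district": "锦江区"},
]
buffer = io.StringIO(newline="")
write_selected(rows, ["name", "district"], buffer)
print(buffer.getvalue())
```

To write a file instead of an in-memory buffer, open it with `open(path, "w", newline="", encoding="utf-8-sig")`; the byte order mark helps Excel detect UTF-8.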
Error handling
Generally speaking, error handling here only records the problem: write it to a log file, write it to a log table, or send an error message to an administrator's mailbox.
The details are not covered here; which option to choose depends on the needs of the project.
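As one possible form of the "write to a log" option, the Python sketch below records each failed row with the standard `logging` module. The logger name `interface_dump`, the row fields, and the message format are all assumptions.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("interface_dump")  # hypothetical logger name


def record_failures(failed_rows):
    """Log one error line per failed row and return how many were logged."""
    for row in failed_rows:
        logger.error(
            "request failed: url=%s status=%s",
            row.get("url"),
            row.get("status_code"),
        )
    return len(failed_rows)
```

Attaching a file handler or an SMTP handler to the same logger would redirect these records to a log file or a mailbox without changing the function.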
Running result
Press F9 to run the transformation. If no error occurs, the rows follow the main path described above and are written to the Excel file.
The Excel output is shown below:
3. Other
This section lists some problems you may encounter during configuration, together with their solutions.
3.1. Common error messages
Some errors are inevitable while experimenting. The ones listed here are easy to run into and are provided for your reference.
Garbled Chinese characters
Solution: when fetching data with the HTTP client, choose the correct character set. UTF-8 is usually the right choice because it covers most characters. When you output a text file, choose the correct output encoding as well, and try not to convert between encodings at that step.
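The effect of a wrong character set is easy to reproduce. This small Python sketch decodes the same UTF-8 bytes twice: once with the wrong charset (Latin-1 is used here purely as an example of a mismatch) and once with the right one.

```python
text = "春熙路"
raw = text.encode("utf-8")  # the bytes a UTF-8 server would send

garbled = raw.decode("latin-1")  # wrong charset: unreadable characters
correct = raw.decode("utf-8")    # right charset: the original text

# Text garbled this way can be recovered by reversing the wrong decode.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(correct, repaired)
```

Nothing is wrong with the bytes themselves in this case; only the decoding step differs, which is why fixing the charset setting on the client is enough.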
Invalid API interface
Solution: the URL of an API call often carries Chinese characters in its parameters, and these characters need to be percent-encoded in advance. A small trick: paste the URL into the Chrome browser, and the browser encodes it automatically. Then copy the encoded URL back out of the address bar.
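The same encoding can be done programmatically. This Python sketch uses `urllib.parse.urlencode` from the standard library to percent-encode the parameters as UTF-8; the helper name `build_url` and the placeholder key `YOUR_AK` are assumptions.

```python
from urllib.parse import urlencode

BASE = "http://api.map.baidu.com/place/v2/suggestion"


def build_url(query: str, region: str, ak: str) -> str:
    """Percent-encode the parameters (UTF-8) and append them to the base URL."""
    params = {"query": query, "region": region, "output": "json", "ak": ak}
    return BASE + "?" + urlencode(params)


url = build_url("春熙路", "成都市", "YOUR_AK")
print(url)
```

The encoded `query` and `region` values match the ones in the test URL used earlier in this article.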
This concludes the walkthrough of how to use Kettle to dump interface data. We hope it has answered your questions. Combining theory with practice is the best way to learn, so go and try it yourself.