How to use Solr7 to build a full-text index of structured csv files

2025-02-24 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article shows in detail how to use Solr 7 to build a full-text index over a structured CSV file.

This test uses a generated CSV file of about 1 GB. Each record has ten fields covering int, double, string, date, Chinese text, and English text, so that several data types can be tested. The Java code that generates the data is here:

https://github.com/fayson/cdhproject/blob/master/generatedata/src/main/java/com/cloudera/solr/GenerateSolrTestData.java

A total of 600,000 records are generated, about 1.1 GB in size. The ten fields are number, firstDouble, firstNo, secondDouble, secondNo, jarName, enText, cnText, firstTime, and secondTime.
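For readers who want a small sample file without running the Java generator, the following is a minimal Python sketch that writes records with the same ten field names. It is not the linked generator: the type chosen for each field, the value ranges, the jar names, and the sample texts are all assumptions made for illustration.

```python
import csv
import io
import random
from datetime import datetime, timedelta, timezone

# Field names come from the article; everything about their contents is assumed.
FIELDS = ["number", "firstDouble", "firstNo", "secondDouble", "secondNo",
          "jarName", "enText", "cnText", "firstTime", "secondTime"]


def random_time(rng):
    """Return a random timestamp in 2018, formatted as UTC with a trailing Z."""
    base = datetime(2018, 1, 1, tzinfo=timezone.utc)
    moment = base + timedelta(seconds=rng.randrange(365 * 24 * 3600))
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def generate_rows(count, seed=0):
    """Yield `count` records; `number` runs from 1 to `count`."""
    rng = random.Random(seed)
    jars = ["spark-core.jar", "hadoop-common.jar", "solr-core.jar"]  # hypothetical values
    for number in range(1, count + 1):
        yield [
            number,
            round(rng.uniform(0, 1000), 2),
            rng.randrange(100000),
            round(rng.uniform(0, 1000), 2),
            rng.randrange(100000),
            rng.choice(jars),
            "Cloudera test record %d" % number,
            "查询测试数据 %d" % number,
            random_time(rng),
            random_time(rng),
        ]


def write_csv(stream, count, seed=0):
    """Write a header line followed by `count` records to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    writer.writerows(generate_rows(count, seed))


# Small demonstration: three records written to an in-memory buffer.
sample = io.StringIO()
write_csv(sample, 3)
print(sample.getvalue())
```

To produce a file on disk, pass a handle opened with newline="" and encoding="utf-8" instead of the in-memory buffer.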

Build an index

In the Solr web UI, select [Collections] on the left, then click [Add collection] to create a collection.

Collection created successfully
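The article creates the collection through the web UI. As an alternative sketch, the same step can usually be scripted against the Solr Collections API with action=CREATE. The snippet below only builds the request URL. The shard count and replication factor are assumed values, since the article does not state what was entered in the UI, and the base URL is taken from the import command used later.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards=1, replication_factor=1):
    """Build a Collections API URL that creates a collection (SolrCloud mode)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,                   # assumed value
        "replicationFactor": replication_factor,   # assumed value
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


print(create_collection_url("http://localhost:8983/solr", "test0723"))
```

Opening the printed URL in a browser or with an HTTP client sends the request; nothing is sent by the snippet itself.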

Import the prepared CSV file into Solr with post.jar, the posting tool that ships with Solr. The tool's help output describes its options.

Following the help output, import the CSV file into Solr and build the full-text index with this command:

java -Durl=http://localhost:8983/solr/test0723/update -Dtype=text/csv -Dc=test0723 -jar post.jar /tmp/solr/file/data.csv

The CSV file is imported successfully. The next step is to verify the index by running queries in Solr.
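What post.jar does here is, in essence, an HTTP POST of the CSV body to the collection's update handler with a text/csv content type. The following sketch builds such a request with the Python standard library, assuming the same host and collection name as the command above. The commit=true parameter is an assumption added so that the documents become searchable right away. The request is only constructed, not sent.

```python
from urllib.request import Request


def build_csv_update_request(base_url, collection, csv_bytes, commit=True):
    """Build (but do not send) a POST that uploads CSV data to /update."""
    url = "%s/%s/update" % (base_url.rstrip("/"), collection)
    if commit:
        url += "?commit=true"  # assumed: commit as part of the upload
    return Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )


# Tiny inline payload for illustration; the real file is about 1.1 GB.
payload = "number,enText\n1,Cloudera test record\n".encode("utf-8")
request = build_csv_update_request("http://localhost:8983/solr", "test0723", payload)
print(request.get_method(), request.full_url)
# Passing `request` to urllib.request.urlopen would actually send it.
```

For a file as large as the one in this test, reading everything into memory is not advisable; post.jar or a streaming HTTP client is the better fit.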

Perform query verification

1. Enter the query interface

2. Query based on a single field

number

jarName

Time field range query

3. Search by content in the English text

4. Search by content in the Chinese text

5. Search using a combination of fields

Records in a given time range whose number is between 1 and 10000 and whose English text contains Cloudera

Records whose number is between 30000 and 40000, with firstDouble greater than 200 and secondDouble less than 500

Records whose jarName begins with spark and whose Chinese text contains "query"
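The verification steps above can be sketched as query strings in Solr's standard query syntax. The helper names below are hypothetical, and the concrete values (the number 100, the January 2018 range, the Chinese term 查询) are assumptions, since the article's screenshots are not available. Whether a prefix query such as jarName:spark* matches also depends on how the field is typed in the schema.

```python
from urllib.parse import urlencode


def term(field, value):
    """field:value clause."""
    return "%s:%s" % (field, value)


def between(field, low, high, inclusive=True):
    """Range clause: [] includes the bounds, {} excludes them."""
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return "%s:%s%s TO %s%s" % (field, opening, low, high, closing)


def all_of(*clauses):
    """Combine clauses so that every one of them must match."""
    return " AND ".join(clauses)


def select_url(base_url, collection, query):
    """URL for the collection's /select handler with the query URL-encoded."""
    return "%s/%s/select?%s" % (base_url.rstrip("/"), collection, urlencode({"q": query}))


january = between("firstTime", "2018-01-01T00:00:00Z", "2018-01-31T23:59:59Z")

queries = [
    term("number", 100),                                # step 2: single field
    term("jarName", "spark*"),                          # step 2: prefix match
    january,                                            # step 2: time range
    term("enText", "Cloudera"),                         # step 3: English text
    term("cnText", "查询"),                              # step 4: Chinese text
    all_of(january, between("number", 1, 10000), term("enText", "Cloudera")),
    all_of(between("number", 30000, 40000),
           between("firstDouble", 200, "*", inclusive=False),
           between("secondDouble", "*", 500, inclusive=False)),
    all_of(term("jarName", "spark*"), term("cnText", "查询")),
]

for query in queries:
    print(query)
    print(select_url("http://localhost:8983/solr", "test0723", query))
```

Each printed query string can be pasted into the q box of the Solr query page, and each URL can be opened directly.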

1. Unlike the dataimport approach used in the previous document, this document imports the CSV file and builds the index with the post.jar tool that ships with Solr. Query testing shows that this approach works as expected.

2. When querying date fields, Solr expects timestamps in UTC, written in the form 2018-03-06T02:37:02Z.

3. For multi-condition queries, use fq; several filter conditions can be supplied. Range searches combine [] (bounds included) or {} (bounds excluded) with TO. For example, firstTime:[2018-01-01T00:00:00Z TO 2018-01-31T23:59:59Z] matches records whose firstTime falls between January 1 and January 31, 2018.

4. The Solr query page offers further parameters: sort orders the results by field, start and rows control paging, wt sets the response format, and so on.
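Points 2 to 4 can be illustrated together with a short sketch. It converts a local time to the UTC form Solr expects and assembles q, fq, sort, start, rows, and wt into a query string. The helper names, the UTC+8 offset, the page size, and the sort order are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def to_solr_utc(moment):
    """Format a timezone-aware datetime as a UTC timestamp with a trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def select_params(query, filters=(), sort=None, page=0, page_size=10, wt="json"):
    """Encode q, repeated fq filters, sort, paging (start/rows), and wt."""
    params = [("q", query)]
    params += [("fq", clause) for clause in filters]
    if sort:
        params.append(("sort", sort))
    params += [("start", page * page_size), ("rows", page_size), ("wt", wt)]
    return urlencode(params)


# 10:37:02 local time in a UTC+8 zone (assumed offset) is 02:37:02 UTC.
local = datetime(2018, 3, 6, 10, 37, 2, tzinfo=timezone(timedelta(hours=8)))
print(to_solr_utc(local))

# Third page of ten results, filtered by two fq clauses and sorted by number.
print(select_params(
    "enText:Cloudera",
    filters=["number:[1 TO 10000]",
             "firstTime:[2018-01-01T00:00:00Z TO 2018-01-31T23:59:59Z]"],
    sort="number asc",
    page=2,
))
```

Each fq clause is sent as its own parameter, which is how several filter conditions are combined in one request.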

That concludes this walkthrough of using Solr 7 to build a full-text index over a structured CSV file. I hope the content above is helpful. If you found the article useful, feel free to share it so more people can see it.
