How to index data in Solr

Updated 2025-01-19, from SLTechnology News&Howtos. Shulou (Shulou.com), 05/31.

This article walks through the basic ways of indexing data in Solr: adding, updating, deleting, and querying documents, followed by a brief look at text analysis.

This tutorial uses Solr 4.8 as the test environment, which requires JDK 1.7 or later.

Preparation

This article assumes at least a basic working knowledge of Java, so it does not cover setting up the Java environment. Download and extract Solr. The example directory contains a start.jar file; start Solr from that directory:

java -jar start.jar

Then open http://localhost:8983/solr/ in a browser to see the Solr admin interface.

Index data

Right after startup, the index contains no data. You can add (update) and delete documents by POSTing commands to Solr. The exampledocs directory contains some sample files. From that directory, run:

java -jar post.jar solr.xml monitor.xml

This command adds two documents to Solr. Open the files to see what they contain. The contents of solr.xml are as follows:

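<add>
<doc>
  <field name="id">SOLR1000</field>
  <field name="name">Solr, the Enterprise Search Server</field>
  <field name="manu">Apache Software Foundation</field>
  <field name="cat">software</field>
  <field name="cat">search</field>
  <field name="features">Advanced Full-Text Search Capabilities using Lucene</field>
  <field name="features">Optimized for High Volume Web Traffic</field>
  <field name="features">Standards Based Open Interfaces - XML and HTTP</field>
  <field name="features">Comprehensive HTML Administration Interfaces</field>
  <field name="features">Scalability - Efficient Replication to other Solr Search Servers</field>
  <field name="features">Flexible and Adaptable with XML configuration and Schema</field>
  <field name="features">Good unicode support: héllo (hello with an accent over the e)</field>
  <field name="price">0</field>
  <field name="popularity">10</field>
  <field name="inStock">true</field>
  <field name="incubationdate_dt">2006-01-17T00:00:00.000Z</field>
</doc>
</add>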

The <add> element tells Solr to add documents to the index, and each <doc> element is a document that becomes searchable. You can now search for the keyword "solr" through the admin interface. The specific steps are:

Select the core, open its Query page, enter solr in the q field, and click the Execute Query button at the bottom of the page. The query result is displayed on the right. It is the JSON representation of the solr.xml document you just imported. Solr supports a rich query syntax. For example, to search for the keyword "Search" in the name field, use the syntax name:search. A search for name:xxx returns no results, because no document contains that text.

Data import

There are also a variety of ways to import data into Solr:

Import from a database using DIH (DataImportHandler)

Import CSV files, which also makes it easy to load Excel data exported as CSV

Import documents in JSON format

Import binary documents such as Word and PDF files

Write custom import code
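As a rough illustration of the programmatic route, the sketch below converts CSV text into a list of documents and prepares, without sending, a JSON POST to the update handler. The URL reuses the collection1 core from this tutorial and assumes the update handler accepts JSON. The helper names csv_to_docs and build_update_request and the sample rows are made up for illustration.

```python
import csv
import io
import json
import urllib.request

# Assumed update endpoint for the tutorial's example core.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"


def csv_to_docs(csv_text):
    """Turn CSV text with a header row into a list of Solr-style documents."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Prepare (but do not send) a JSON POST carrying the documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical sample data; every CSV value arrives as a string.
sample_csv = "id,name,price\nBOOK1,Solr in Practice,10\nBOOK2,Lucene Basics,12\n"
docs = csv_to_docs(sample_csv)
request = build_update_request(docs)
print(request.get_method(), request.full_url)
print(json.dumps(docs, indent=2))
# To send it to a running Solr instance: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect before anything touches the index.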

Update data

What happens if the same solr.xml file is imported more than once? Solr uses the id field to uniquely identify a document. If the id of an imported document already exists in the index, the existing document is automatically replaced by the newly imported one. Try it and watch how several values in the admin interface change before and after the replacement: Num Docs, Max Doc, and Deleted Docs.

numDocs: the number of searchable documents currently in the index. It may be greater than the number of XML files, because one XML file can contain multiple <doc> elements.

maxDoc: may be larger than numDocs. For example, maxDoc increases after the same file is posted again.

deletedDocs: posting a file again replaces the old document and increases deletedDocs by 1. This is only a logical deletion; the old document has not yet been physically removed from the index.
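The relationship between these three counters can be pictured with a small toy model. This is only a sketch of the bookkeeping described above, not how Solr or Lucene store documents internally, and the ToyIndex class is hypothetical.

```python
class ToyIndex:
    """Toy model of overwrite-by-id bookkeeping; not Solr's real implementation."""

    def __init__(self):
        self.live = {}     # id -> document currently visible to searches
        self.deleted = 0   # replaced documents that still occupy space

    def add(self, doc):
        if doc["id"] in self.live:
            self.deleted += 1  # the old version is only marked as deleted
        self.live[doc["id"]] = doc

    def stats(self):
        num_docs = len(self.live)
        return {
            "numDocs": num_docs,
            "maxDoc": num_docs + self.deleted,
            "deletedDocs": self.deleted,
        }


index = ToyIndex()
index.add({"id": "SOLR1000", "name": "Solr, the Enterprise Search Server"})
first = index.stats()
index.add({"id": "SOLR1000", "name": "Solr, the Enterprise Search Server"})
second = index.stats()
print(first)   # counters after the first post
print(second)  # counters after posting the same document again
```

In this model maxDoc is always numDocs plus deletedDocs, which matches the behaviour you see in the admin interface after re-posting a file.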

Delete data

Delete a specific document by id, or delete all documents matching a query:

java -Ddata=args -jar post.jar "<delete><id>SOLR1000</id></delete>"

java -Ddata=args -jar post.jar "<delete><query>name:DDR</query></delete>"

At this point the solr.xml document has been deleted from the index, and searching for "solr" no longer returns it. Solr also has a commit step comparable to a database transaction. By default post.jar commits automatically after the delete command, so the document disappears from search results immediately. You can also set commit to false and commit manually:

java -Ddata=args -Dcommit=false -jar post.jar "<delete><id>3007WFP</id></delete>"

After this command the document is not actually deleted yet, and searches still return it. To finish, run:

java -jar post.jar -

This commits the change, and the deletion takes effect. Now re-import the files you just deleted so you can continue with the rest of the tutorial.
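Solr's XML update format expresses deletes as a <delete> element containing either an <id> or a <query> child. If you generate these messages from code, building them with an XML library takes care of escaping. The sketch below is one way to do that; the function names delete_by_id and delete_by_query are made up for illustration.

```python
import xml.etree.ElementTree as ET


def delete_by_id(doc_id):
    """Build a delete message for a single document id."""
    root = ET.Element("delete")
    ET.SubElement(root, "id").text = doc_id
    return ET.tostring(root, encoding="unicode")


def delete_by_query(query):
    """Build a delete message for every document matching a query."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


print(delete_by_id("SOLR1000"))
print(delete_by_query("name:DDR"))
```

The resulting strings can be passed to post.jar with -Ddata=args, or sent in the body of a POST to the update handler.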

Delete all data:

http://localhost:8983/solr/collection1/update?stream.body=<delete><query>*:*</query></delete>&commit=true

Delete specific data:

http://localhost:8983/solr/collection1/update?stream.body=<delete><query>title:abc</query></delete>&commit=true

Delete by multiple conditions:

http://localhost:8983/solr/collection1/update?stream.body=<delete><query>title:abc AND name:zhang</query></delete>&commit=true
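A browser usually tolerates spaces and angle brackets typed into the address bar, but a script should percent-encode the stream.body value. The sketch below builds such a delete-by-query URL, assuming the server accepts stream.body as in this tutorial's setup. The delete_url helper is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from xml.sax.saxutils import escape

# Update endpoint of the tutorial's example core.
UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def delete_url(query, commit=True):
    """Return an update URL whose stream.body deletes documents matching query."""
    body = "<delete><query>{}</query></delete>".format(escape(query))
    params = {"stream.body": body, "commit": "true" if commit else "false"}
    return UPDATE_URL + "?" + urlencode(params)


url = delete_url("title:abc AND name:zhang")
print(url)
# Decoding the query string recovers the original delete message.
print(parse_qs(urlsplit(url).query)["stream.body"][0])
```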

Query data

Queries are sent as HTTP GET requests. The search keywords are given in the q parameter, and many optional parameters control what is returned. For example, fl specifies the fields to return: with fl=name, the returned documents contain only the name field.

http://localhost:8983/solr/collection1/select?q=solr&fl=name&wt=json&indent=true
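When queries come from code rather than the address bar, it helps to assemble the parameters in one place so that special characters such as the colon in name:search are encoded correctly. A minimal sketch, using only the q, fl, wt, and indent parameters; the select_url helper is a made-up name.

```python
from urllib.parse import urlencode

# Select endpoint of the tutorial's example core.
SELECT_URL = "http://localhost:8983/solr/collection1/select"


def select_url(q, fl=None, wt="json", indent=True):
    """Build a select URL from a query string and an optional field list."""
    params = [("q", q)]
    if fl:
        params.append(("fl", ",".join(fl)))
    params.append(("wt", wt))
    if indent:
        params.append(("indent", "true"))
    return SELECT_URL + "?" + urlencode(params)


print(select_url("solr", fl=["name"]))
print(select_url("name:search"))
```

With a running Solr instance, the URL can be fetched with urllib.request.urlopen and the body parsed with json.loads.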

Sort

Solr sorts results according to the sort parameter, which supports ascending order, descending order, and sorting on multiple fields.

q=video&sort=price desc

q=video&sort=price asc

q=video&sort=inStock asc, price desc

By default, Solr sorts results by score in descending order. The score is a relevance value that Solr calculates for each matching document.
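A sort value is a comma-separated list of "field direction" clauses. The sketch below assembles one from (field, direction) pairs and rejects anything other than asc or desc; the sort_param helper is hypothetical.

```python
from urllib.parse import urlencode


def sort_param(*clauses):
    """Join (field, direction) pairs into a Solr sort value."""
    parts = []
    for field, direction in clauses:
        if direction not in ("asc", "desc"):
            raise ValueError("direction must be 'asc' or 'desc': %r" % direction)
        parts.append("{} {}".format(field, direction))
    return ",".join(parts)


sort_value = sort_param(("inStock", "asc"), ("price", "desc"))
query_string = urlencode({"q": "video", "sort": sort_value})
print(sort_value)
print(query_string)
```

Clauses are applied left to right, so the second field only breaks ties left by the first.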

Highlight

In web search, matched keywords are often highlighted to make the results stand out. Solr supports this well; you only need to specify these parameters:

hl=true # turn on highlighting

hl.fl=name # specify the fields to highlight

http://localhost:8983/solr/collection1/select?q=Search&wt=json&indent=true&hl=true&hl.fl=features

The returned content contains:

"highlighting": {
  "SOLR1000": {
    "features": ["Advanced Full-Text <em>Search</em> Capabilities using Lucene"]
  }
}
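The highlighting section is keyed by document id rather than being embedded in each document, so client code has to join the two. The sketch below does this for a trimmed, hand-written response that assumes Solr's default <em> markers around matches; the snippets_by_doc helper is a made-up name.

```python
import json

# Hand-written, trimmed sample shaped like a wt=json response with hl=true.
sample_response = """
{
  "response": {"numFound": 1, "start": 0, "docs": [{"id": "SOLR1000"}]},
  "highlighting": {
    "SOLR1000": {
      "features": ["Advanced Full-Text <em>Search</em> Capabilities using Lucene"]
    }
  }
}
"""


def snippets_by_doc(response, field):
    """Map each returned document id to its highlighted snippets for field."""
    highlighting = response.get("highlighting", {})
    result = {}
    for doc in response["response"]["docs"]:
        result[doc["id"]] = highlighting.get(doc["id"], {}).get(field, [])
    return result


data = json.loads(sample_response)
snippets = snippets_by_doc(data, "features")
print(snippets)
```

Documents without a match in the requested field simply get an empty list.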

Text analysis

Text fields are indexed by splitting the text into words and applying various transformations (such as lowercasing, plural removal, and stemming). The schema.xml file defines the fields in the index and the type of analysis applied to each of them.

By default, a search for "power-shot" does not match "powershot". To make it match, edit schema.xml (in the solr/example/solr/collection1/conf directory) and change the features and text fields to the "text_en_splitting" type:

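<field name="features" type="text_en_splitting" indexed="true" stored="true" multiValued="true"/>
...
<field name="text" type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>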

Restart Solr after the change, then import the documents again:

java -jar post.jar *.xml

Now the following searches match:

power-shot -> PowerShot

features:recharging -> Rechargeable

1 gigabyte -> 1GB
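To get a feel for why splitting and lowercasing make "power-shot" and "PowerShot" meet, here is a deliberately crude stand-in for that part of the analysis chain. It is not the text_en_splitting implementation: it only splits on punctuation and case changes, lowercases, and adds the joined form, and it ignores stemming and synonyms. The toy_analyze and matches functions are hypothetical, and the split on case changes relies on Python 3.7 or later.

```python
import re

# Split on runs of non-alphanumerics, or between a lowercase and an uppercase letter.
SPLIT_PATTERN = re.compile(r"[^A-Za-z0-9]+|(?<=[a-z])(?=[A-Z])")


def toy_analyze(text):
    """Return a set of lowercased tokens, including joined forms of split words."""
    tokens = set()
    for word in text.split():
        parts = [p for p in SPLIT_PATTERN.split(word) if p]
        tokens.update(p.lower() for p in parts)
        if len(parts) > 1:
            tokens.add("".join(parts).lower())  # e.g. "powershot"
    return tokens


def matches(query, indexed_text):
    """Loose toy match: true if the query and the text share any token."""
    return bool(toy_analyze(query) & toy_analyze(indexed_text))


print(sorted(toy_analyze("power-shot")))
print(matches("power-shot", "PowerShot"))
print(matches("powershot", "power-shot"))
```

Because the same analysis runs on both the indexed text and the query, either spelling ends up producing the shared token "powershot".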

That covers the basics of indexing data in Solr. Combining theory with practice is the best way to learn, so go and try it yourself.
