This article explains in detail how to use the Solr search engine. It is offered as a reference; I hope you come away with a solid understanding of the relevant concepts after reading it.
1. Download and install
1. Download address: https://lucene.apache.org/solr/. Go to the official website and click Download.
2. Decompress it after downloading, as shown below:
2. Run (stand-alone)
1. Open a Windows command window (cmd).
2. Execute the startup command under solr's bin directory: solr start. The default port of the solr application server is 8983; to start on a specific port, add the -p parameter, for example: solr start -p 8888
When it launches successfully, open http://localhost:8983/solr in the browser, as shown below:
3. Common solr commands:
solr start -p <port> : start the solr service (stand-alone)
solr restart -p <port> : restart the solr service (note: the port number is required when using restart)
solr stop -p <port> : stop the solr service
solr create -c <name> : create a core instance (the core concept is described later)
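A typical first session might look like this (a minimal sketch; the core name test1 is just an example):

cd D:\solr-7.4.0\bin
solr start
solr create -c test1
solr stop -p 8983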
4. Note:
If at startup a java exception stack is printed complaining that the log4j2.xml "file name, directory name or volume label syntax is incorrect", the cause is a log4j bug that solr.cmd does not work around, as shown below:
This error does not affect use and can be ignored. Alternatively, change every file: in the solr-7.4.0/bin/solr.cmd file to file:///.
3. Create a core instance
1. Introduction to core: simply put, a core is an instance of solr. One solr service can host multiple cores, and each core has its own index library and its own configuration files. Create a core before using solr to build indexes, because all indexes are stored under a core.
2. There are two ways to create a core. One is the command solr create -c <name>; after it succeeds, the core folder you created appears in the D:\solr-7.4.0\server\solr directory, as shown below:
The other is via the solr management page: first create a folder under D:\soft\solr-7.4.0\server\solr and give it the name of the core you want to create, then copy the conf folder from the D:\solr-7.4.0\example\example-DIH\solr\db directory into your core folder, and finally create the core on the solr management page, as shown below:
Created successfully, as shown below:
4. Configure managed-schema
1. Brief introduction to managed-schema
Managed-schema tells solr how to build an index. Indexing configuration revolves around this file, which determines how solr indexes documents: the data type of each field, the word-segmentation method, and so on. In older versions the schema configuration file was named schema.xml and was edited by hand; in newer versions it is called managed-schema and is configured through the Schema API rather than manual editing. The official explanation is that after modifying managed-schema through the Schema API there is no need to reload the core or restart solr, which makes it better suited to maintenance in a production environment. If you change the configuration by hand-editing and do not reload the core, the changes may be lost. The managed-schema file lives in D:\solr-7.4.0\server\solr\test1\conf, as shown below:
2. Key members of managed-schema
fieldType: defines a type for fields. Its most important role is to specify the analyzer, which determines how keywords are extracted from a document for retrieval. When the multiValued attribute is true, the type is treated as an array structure.
analyzer: a child element of fieldType; this is the word splitter. It consists of a tokenizer and a set of filters.
field: a field used to build the index. For a field to be indexed, set its indexed attribute to true; set stored to true to store the original value as well. As shown in the figure below, each field references a fieldType through its type attribute; when multiValued is true the field is an array, and solr creates an index entry for each element of the array.
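For orientation, here is a minimal sketch of how these three members look in managed-schema, modeled on the stock text_general definition (the field name is illustrative):

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="name" type="text_general" indexed="true" stored="true" multiValued="true"/>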
For more information on managed-schema, please refer to http://lucene.apache.org/solr/guide/7_4/documents-fields-and-schema-design.html.
All of the commonly used data types are already defined in the managed-schema file; you can find them by searching for fieldType.
Note: in general the primary key id does not need a hand-written field tag; it is generated automatically, and defining it manually causes errors at runtime. Because this managed-schema was copied from the example, it is best to tidy up the file:
delete every field tag that can be deleted, to avoid conflicts with the fields you define yourself, but do not delete the existing definition of the id field, as shown below:
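For reference, the stock id definition that should be kept looks like this (taken from the default managed-schema):

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<uniqueKey>id</uniqueKey>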
3. Schema API:
The Schema API works by sending POST requests with JSON bodies to the solr server; every operation is expressed in JSON. On linux you can use the curl tool directly; on windows use Postman (Postman is recommended).
Here is an example of adding a field; the other APIs are listed after it:
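A minimal sketch of the add-field call using curl (the core name test1 is an assumption, and the field definition is illustrative):

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field": {
    "name": "sell_by",
    "type": "pdate",
    "stored": true
  }
}' http://localhost:8983/solr/test1/schema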
add-field: add a new field with the parameters you provide.
delete-field: delete a field.
replace-field: replace an existing field with one that is differently configured.
For more API content, please refer to http://lucene.apache.org/solr/guide/7_4/schema-api.html.
5. DIH imports index data
1. Introduction to DIH:
DIH is short for Data Import Handler. As the name implies, it imports data into solr. The purpose of solr is to let an application find the data users want more quickly, and that data is scattered across the application in xml files, pdf files, and relational databases. Solr must first obtain this data and build indexes over it to achieve fast search. Here we cover the most common case: importing index data into solr from a relational database.
2. Create the DIH configuration
Under the core you created there is a db-data-config.xml file in the conf directory (for example: d:\solr-7.4.0\server\solr\test1\conf); if it is missing, copy it unchanged from D:\solr-7.4.0\example\example-DIH\solr\db\conf into your own directory. This file is used to connect to the database and extract data, as shown below:
Open db-data-config.xml with a text editor and edit it, as shown below. Its properties are explained next, followed by a sketch of a typical configuration after the two lists.
Default properties of entity:
name (required): must be unique; it identifies the entity.
processor: required only if the datasource is not an RDBMS. The default value is SqlEntityProcessor.
transformer: the transformers applied to this entity; see the transformer section for details.
pk: the primary key of the entity. It is optional, but required when using delta (incremental) import. It is not necessarily related to the uniqueKey defined in schema.xml, though they may be the same.
rootEntity: by default the entity directly under the document element is the root entity; if rootEntity is set to false, the entity directly below it is treated as the root instead. Solr generates one document for each row the root entity returns from the database.
Properties of SqlEntityProcessor:
query (required): the SQL that fetches all the data
deltaQuery: used only in delta import; the SQL that fetches only the pk values of the changed rows
parentDeltaQuery: used only in delta import; the SQL that fetches the pk values of the parent entity
deletedPkQuery: used only in delta import; fetches the pk values of rows that have been deleted from the current entity
deltaImportQuery: used only in delta import; if present, it takes effect instead of query during the import phase of a delta import
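As promised above, a sketch of a typical db-data-config.xml (assumptions: a MySQL database named testdb, a table named product with columns id, name, type and a last_modified timestamp; driver class and credentials are placeholders):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/testdb"
              user="root" password="root"/>
  <document>
    <entity name="product" pk="id"
            query="SELECT id, name, type FROM product"
            deltaQuery="SELECT id FROM product WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, name, type FROM product WHERE id = '${dataimporter.delta.id}'">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="type" name="type"/>
    </entity>
  </document>
</dataConfig>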
How full import works:
Execute the entity's query to get all the data.
For each row of data, get the pk and assemble the child entity's query.
Execute the child entity's query to obtain the child entity's data.
How delta import works:
Descend into the sub-entities until there are none left.
Execute the entity's deltaQuery to obtain the pk values of the changed data.
Merge in the pk values obtained from the sub-entities' parentDeltaQuery.
For each pk row, assemble the parent entity's parentDeltaQuery.
Execute parentDeltaQuery to obtain the pk values of the parent entity.
Execute deltaImportQuery to fetch the entity's own data.
If there is no deltaImportQuery, assemble the query instead.
Restrictions:
The child entity's query must reference the parent entity's pk.
The child entity's parentDeltaQuery must reference its own pk.
The child entity's parentDeltaQuery must return the parent entity's pk.
deltaImportQuery must reference its own pk.
Add the DataImportHandler configuration at the bottom of the solrconfig.xml file (directory D:\solr-7.4.0\server\solr\test1\conf), referencing db-data-config.xml, as shown below:
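This is the standard DIH handler registration from the solr documentation; the config value matches the db-data-config.xml file above:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>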
Create a lib folder under the core directory you created (D:\solr-7.4.0\server\solr\test1) and copy the database driver jar into it. Also copy solr-dataimporthandler-7.4.0.jar and solr-dataimporthandler-extras-7.4.0.jar from the D:\soft\solr-7.4.0\dist folder (these two jars are not always required: check the log later, and copy them only if it reports that org.apache.solr.handler.dataimport.DataImportHandler cannot be found).
Then restart solr, as shown below:
After a successful restart, open the solr management page, select the created core, and run the data import. You can check the log records under D:\solr-7.4.0\server\logs, as shown below:
After the execution is successful, view the results, as shown below:
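The import can also be triggered directly by URL; a sketch, assuming the core is named test1:

http://localhost:8983/solr/test1/dataimport?command=full-import&clean=false&commit=true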
Delete solr data on the Documents tab of the management page by submitting a delete-by-query, as shown below:
<delete><query>*:*</query></delete>
You can also delete via URL.
Delete by id:
http://localhost:8080/solr/update/?stream.body=<delete><id>id value</id></delete>&stream.contentType=text/xml;charset=utf-8&commit=true
Delete by query condition:
http://localhost:8080/solr/update/?stream.body=<delete><query>query condition</query></delete>&stream.contentType=text/xml;charset=utf-8&commit=true
Enable scheduled automatic updates in Solr
Download solr-dataimport-scheduler.jar and copy it to D:\solr-7.4.0\server\solr-webapp\webapp\WEB-INF\lib, as shown below:
Find web.xml (D:\solr-7.4.0\server\solr-webapp\webapp\WEB-INF) and add the following listener before the first servlet tag:
<listener>
  <listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>
As shown below:
Create a conf folder in the D:\solr-7.4.0\server\solr directory, and then create the dataimport.properties file under conf, as shown below:
Edit the dataimport.properties file
#################################################
# dataimport scheduler properties
#################################################

# to sync or not to sync
# 1 - active; anything else - inactive
# this setting does not need to be modified
syncEnabled=1

# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
# set it to the core you use; here it is the custom core test1
syncCores=test1

# solr server name or IP address
# [defaults to localhost if empty]; generally localhost and does not change
server=localhost

# solr server port
# [defaults to 80 if empty]
# if solr was installed with the default port, no change is needed; otherwise set your own port
port=8983

# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr

# URL params [mandatory]
# remainder of URL; command=delta-import requests incremental extraction,
# command=full-import requests full extraction
# params=/dataimport?command=delta-import&clean=false&commit=true
params=/dataimport?command=full-import&clean=false&commit=true

# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
# how often data synchronization runs, in minutes; adjust to project requirements
# (shorten the interval during testing to see results quickly)
interval=1

# interval for rebuilding the index, in minutes; default is 7200, i.e. 5 days
# null, 0, or commented out: never rebuild the index
reBuildIndexInterval=7200

# parameters for rebuilding the index
reBuildIndexParams=/select?qt=/dataimport&command=full-import&clean=true&commit=true

# start time of the rebuild-index interval;
# first actual execution time = reBuildIndexBeginTime + reBuildIndexInterval*60*1000
# two formats: 2012-04-11 03:10:00 or 03:10:00; the latter completes the date part
# with the date the service starts
reBuildIndexBeginTime=03:10:00
After restarting, the comments in the dataimport.properties file will be gone, as shown below:
Restart solr, and the scheduled updates are enabled.
6. Integrate into the project
Introduce the jar with Maven. Note: if your project needs to introduce httpclient explicitly, make sure the httpclient version is compatible with the solr version; for solr 7.4 the httpclient version must not be lower than 4.5.3.
<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-solrj</artifactId>
    <version>7.4.0</version>
</dependency>
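If you do pin httpclient yourself, a sketch of a compatible dependency (4.5.3 is the minimum noted above):

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.3</version>
</dependency>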
Create a spring-solr.xml configuration file and import it in spring's ApplicationContext.xml:
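The original configuration was shown as an image; here is a minimal sketch of what spring-solr.xml might contain, assuming Spring XML configuration and a core named test1 (the bean ids are illustrative). In solrj 7.x the client is constructed through HttpSolrClient.Builder, so the sketch builds it via a factory method:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="solrClientBuilder" class="org.apache.solr.client.solrj.impl.HttpSolrClient$Builder">
        <constructor-arg value="http://localhost:8983/solr/test1"/>
    </bean>
    <bean id="httpSolrClient" factory-bean="solrClientBuilder" factory-method="build"/>
</beans>

And in ApplicationContext.xml:

<import resource="classpath:spring-solr.xml"/>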
Using it in Java is straightforward:

package com.solr;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

import java.io.IOException;

public class SolrTest {

    // the client can be injected via the spring configuration above, or built directly;
    // the base URL (including the core name test1) is an assumption
    private static HttpSolrClient httpSolrClient =
            new HttpSolrClient.Builder("http://localhost:8983/solr/test1").build();

    /* add or update */
    public void save() throws IOException, SolrServerException {
        // note that it is SolrInputDocument, not SolrDocument
        SolrInputDocument document = new SolrInputDocument();
        // you can specify an id for the document, i.e. addField("id", "primary key id");
        // if the id already exists, the operation is an update
        document.addField("id", "123");
        document.addField("name", "54151");
        // solr builds the multiValued field type into an array structure
        document.addField("type", 432434);
        document.addField("type", 432434);
        UpdateResponse responseAdd = httpSolrClient.add(document);
        // unlike earlier versions, the change must be committed explicitly
        httpSolrClient.commit();
        System.out.println("save successful");
    }

    /* query */
    public void query() throws IOException, SolrServerException {
        // declare the query object and set the query condition
        SolrQuery query = new SolrQuery();
        query.set("q", "id:123");
        // execute the query
        QueryResponse response = httpSolrClient.query(query);
        // get the query results
        SolrDocumentList documentList = response.getResults();
        for (SolrDocument document : documentList) {
            System.out.println("queried name: " + document.get("name"));
            System.out.println("ID: " + document.get("id"));
            System.out.println();
        }
    }

    /* delete */
    public void delete() throws IOException, SolrServerException {
        // delete by query condition; deleteById("123") can also be used to delete by id
        httpSolrClient.deleteByQuery("id:123");
        httpSolrClient.commit();
        System.out.println("deleted successfully");
    }
}

That is all on the use of the search engine solr. I hope the above content is of some help and lets you learn more. If you think the article is good, share it for more people to see.