At present, search in our system is handled by the database's built-in features, for example using MySQL 5.7's full-text search for resource search.
The benefit of this approach is convenience: there are no additional development or maintenance costs. But as the business and data volume grow, performance and scalability bottlenecks appear quickly.
Taking the solution-library requirement as an opportunity, we decided to introduce Spring Boot + Solr as a front-end search platform alongside the main architecture, to cope with the growing demand for search services.
Solution
The front-end platform is developed with Spring Boot, mainly because its microservice-oriented design makes development and deployment convenient; it also serves as a technical warm-up for a future microservice architecture.
The front-end platform built with Spring Boot handles access requests and non-functional requirements such as rate limiting, monitoring, and security.
The search side uses the latest Apache Solr server (version 6.6) together with its bundled smartcn Chinese word-segmentation component to provide basic support for the search service.
Architecture implementation
The architecture implementation part mainly verifies the feasibility of the architecture through concrete practice, without involving specific business details or actual data.
The description of the implementation steps aims to be reproducible.
Spring boot project creation
First make sure JDK 8+ is installed locally, then create a Maven project in Eclipse.
The POM file contains the following:
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.0.0.BUILD-SNAPSHOT</version>
</parent>

<properties>
    <spring.data.solr.version>2.1.1.RELEASE</spring.data.solr.version>
</properties>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-solr</artifactId>
            <version>${spring.data.solr.version}</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-solr</artifactId>
    </dependency>
</dependencies>
After the Maven structure is set up, the next step is to lay out the project structure, as shown in the figure.
Application.java is the entry point of the project; a Jetty server can be started with very little code.
package app;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import sample.Example;

@SpringBootApplication
public class Application {
    public static void main(String[] args) throws Exception {
        SpringApplication.run(Example.class, args);
    }
}
Operation diagram:
Solr Server Installation and Configuration
First download the latest release from the official website (the current version is 6.6) as a zip package.
After unzipping, the following directories deserve attention first:
bin: startup script directory; the server is started and stopped through the commands here
server: Solr server directory; configuration files, jar packages and index data all live here
contrib: optional packages released with this version; the Chinese word-segmentation package we use later is in here
Since this is only an architecture validation, all configuration is based on standalone mode.
The next step is to create a core: create a new folder sample_solr under the solr-6.6.0\server\solr directory with the following contents:
conf: configuration file directory; the initial version is copied from solr-6.6.0\example\example-DIH\solr\db\conf
data: data file directory, created manually
core.properties: created manually, with the following content:
name=sample_solr
After creating the core, you need to modify the configuration files inside it.
Under the solr-6.6.0\server\solr\sample_solr\conf directory, there are three files to modify, in order:
solrconfig.xml
Find the request handler for data import and point its config file to solr-data-config.xml.
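Since the conf directory was copied from the example-DIH configset, this handler is already defined there; a minimal sketch of what it looks like after the change (only the config file name differs from the example) is:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">solr-data-config.xml</str>
    </lst>
</requestHandler>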
solr-data-config.xml
This file configures the data source and the document fields.
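A minimal sketch of such a configuration, assuming a MySQL table named resource with columns id, name, tag and update_time (these names and connection details are placeholders, not the actual schema used here):

<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://127.0.0.1:3306/sample"
                user="root" password="secret"
                batchSize="10"/>
    <document>
        <entity name="resource" pk="id"
                query="select id, name, tag from resource"
                deltaQuery="select id from resource where update_time &gt; '${dataimporter.last_index_time}'"
                deltaImportQuery="select id, name, tag from resource where id = '${dih.delta.id}'">
            <field column="id"   name="id"/>
            <field column="name" name="name"/>
            <field column="tag"  name="tag"/>
        </entity>
    </document>
</dataConfig>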
There are a few things to watch out for:
batchSize
The official documentation suggests setting it to -1 for MySQL, which forces MySQL to use Integer.MIN_VALUE as the fetch size. In practice, when I set it to -1 with mysql-connector-java-5.1.24 as the driver (if the driver is missing, download it and place it in the solr-6.6.0\server\lib directory), I got a result-set-closed error, so I set it to 10 here; whether that value is appropriate still needs to be verified.
Later I found a similar issue in the MySQL bug list for reference:
https://bugs.mysql.com/bug.php?id=83027
PK of entity
deltaQuery depends on it: although the deltaQuery statement fetches incremental data via update_time > xxx, the final import still queries by pk in (ids). So if you need deltaQuery, pk must be set to the primary key of the database table.
Definition of field
Here you can see that I defined a tag field, but when searching, the results contained only id and name.
It turned out that every field used in the document must be defined in the schema; the sample schema that Solr ships with happens to pre-define id and name, which is an easy trap to fall into.
managed-schema
As mentioned above, the fields to be indexed need to be defined, and this is the file that defines the fields and field types used in the document; Solr builds its index according to the schema definition.
The schema file contains four kinds of elements:
field type
Field types such as text, numeric and floating point; defining appropriate types helps Solr identify fields and produce results more accurately.
field
Field, the basic unit that makes up a Solr document. If a document is an object in object-oriented terms, then its fields are the object's attributes.
dynamicField
Dynamic fields. In addition to the default fields, Solr reserves a number of wildcard field definitions, for example:
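As a rough sketch, the stock managed-schema ships with wildcard definitions along these lines:

<dynamicField name="*_i" type="int"    indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_general" indexed="true" stored="true"/>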
Coming back to our tag field above: if defining every field explicitly feels too cumbersome, we can use a dynamic field directly in the entity instead.
After re-indexing, our search results are as follows:
copyField
As the name suggests, this is a field-copy feature; its definition has a source and a dest. One of its main uses is full-text search: copy every field that needs to be searched into a single field, then searching that field amounts to a full-text search. This is the same idea as early databases concatenating several columns and running a fuzzy query over the result.
Note that if the source consists of several fields, multiValued on the target field must be set to true.
For more detailed definitions, please refer to the official documentation. We are mainly concerned with the definitions of field type, field and copyField. Here we will look at them based on our examples.
fieldType generally does not need to be extended, because Solr already ships with many types. Here, in order to support Chinese word segmentation, we add a new field type, text_smart.
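A sketch of such a type based on the bundled smartcn analyzer (the exact filter chain is an assumption; only the tokenizer is essential):

<fieldType name="text_smart" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>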
A field type definition has two analyzer parts, index and query: the index analyzer processes field values at indexing time, while the query analyzer processes the search terms. For Chinese word segmentation we also need to copy lucene-analyzers-smartcn-6.6.0.jar from solr-6.6.0\contrib\analysis-extras\lucene-libs to webapp\WEB-INF\lib.
The field configuration is as follows:
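A sketch of the field definitions for our three document fields (the attribute values are typical choices, not necessarily the original ones):

<field name="id"   type="string"     indexed="true" stored="true" required="true"/>
<field name="name" type="text_smart" indexed="true" stored="true"/>
<field name="tag"  type="text_smart" indexed="true" stored="true"/>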
The main point is to switch the fields that need Chinese support to the text_smart type we defined earlier.
copyField
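A sketch in line with the notes above, copying the searchable fields into a catch-all field (the field name text is an assumption):

<field name="text" type="text_smart" indexed="true" stored="false" multiValued="true"/>
<copyField source="name" dest="text"/>
<copyField source="tag"  dest="text"/>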
With the configuration above in place, we can run the Solr server properly. Let's review the process.
Finally, go to the command line in the bin directory and start the server with solr.cmd start.
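Once the server is up, the index can be built through the DataImportHandler configured earlier, either from the core's Dataimport page in the admin UI or by calling the endpoint directly, for example:

http://localhost:8983/solr/sample_solr/dataimport?command=full-import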
Spring solr client
Based on what we have covered in the previous two sections, what remains is to wrap the search interface and index maintenance for the Solr server inside the Spring Boot project.
First we need to create the Solr configuration, using two files:
1. application.properties
Create a new properties file under the src/main/resources directory with the following content:
spring.data.solr.host=http://127.0.0.1:8983/solr/sample_solr
2. SolrConfig.java
The properties defined in application.properties are injected into a bean through annotations.
package sample;

import org.springframework.boot.context.properties.ConfigurationProperties;

@ConfigurationProperties(prefix = "spring.data.solr")
public class SolrConfig {
    private String host;
    private String zkHost;
    private String defaultCollection;
    // getters and setters
}
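The controller below autowires a SolrClient. If no auto-configuration provides one, a bean can be declared manually; a minimal sketch using SolrJ's HttpSolrClient (the hard-coded URL mirrors application.properties and could instead be read from SolrConfig):

package sample;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SolrClientConfig {

    // Build a SolrClient pointing at the sample_solr core
    @Bean
    public SolrClient solrClient() {
        return new HttpSolrClient.Builder()
                .withBaseSolrUrl("http://127.0.0.1:8983/solr/sample_solr")
                .build();
    }
}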
The next step is to expose the Solr queries in a RESTful way through a controller.
Below is our query controller.
package sample;

// import section

@RestController
@EnableAutoConfiguration
public class Example {

    @Autowired
    private SolrClient client;

    @RequestMapping("/query/name/{name}")
    public String queryByName(@PathVariable String name) throws IOException, SolrServerException {
        // Build the query: highlight name/tag, return the first 10 results as JSON
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.add("q", name);
        params.add("hl", "on");
        params.add("hl.fl", "name,tag");
        params.add("wt", "json");
        params.add("start", "0");
        params.add("rows", "10");
        QueryResponse response = null;
        try {
            response = client.query(params);
            SolrDocumentList results = response.getResults();
            for (SolrDocument document : results) {
                System.out.println(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return response.toString();
    }
}
Start the Spring Boot service and enter the query URL in the browser to get the result.
http://localhost:8080/query/name/test
Other concerns
The partial validation above shows that a front-end service architecture based on Spring Boot + Solr is feasible. But as a front-end platform, what else do we need to pay attention to?
Security
The front-end platform is often exposed outside the internal firewall, so the risk is high. We mainly strengthen protection in two areas:
Rate limiting
Ensure service availability and avoid being overwhelmed by abnormal traffic through rate limiting and appropriate service degradation. The implementation can borrow from the thread-pool idea; it is not developed concretely here.
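As a rough illustration of that limited-concurrency idea (not part of this article's implementation; all names are hypothetical), a servlet filter could cap concurrent requests like this:

package sample;

import java.io.IOException;
import java.util.concurrent.Semaphore;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Hypothetical throttling filter: at most 100 requests are handled at once,
// the rest are rejected immediately as a simple form of degradation.
public class ThrottleFilter implements Filter {

    private final Semaphore permits = new Semaphore(100);

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        if (permits.tryAcquire()) {
            try {
                chain.doFilter(req, res);
            } finally {
                permits.release();
            }
        } else {
            ((HttpServletResponse) res).sendError(503, "Service busy, please retry later");
        }
    }

    @Override
    public void destroy() {
    }
}

The filter would still need to be registered (for example via a FilterRegistrationBean), and a real deployment would more likely rely on a dedicated gateway or library.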
Authentication
The confidentiality and integrity of transmitted information are guaranteed by an authentication system combined with encryption algorithms.
Reliability
Reliability refers to the availability of the system, covering factors such as uptime, downtime and service recovery time. We address it through prevention and monitoring.
Prevention
Eliminate single points of failure by deploying both the front-end platform and the search service as clusters.
Monitoring
Monitor the front-end system and the search server through an independent monitoring system, and set reasonable thresholds and alarm points.