Learn the design of Elasticsearch query agent in five minutes

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Elasticsearch (ES) is an open source distributed search engine built on Lucene. Because it is stable, reliable, fast, and easy to install and use, it is widely used in the industry. ES is mainly used in two ways: as a distributed real-time document store and as a distributed real-time search and analytics engine.

I. Why do you need a query agent

Shielding complex DSL

A second-hand trading platform uses ES mainly to support the search and analysis of commodities, users, and so on (hereinafter referred to as documents).

For querying, ES provides a complete JSON-based Query DSL. It is very powerful, but it is also fairly complex, and the learning cost is not low.

Take searching for users whose nickname is Huaren as an example. The DSL is roughly as follows:

{
  "from": 0,
  "size": 20,
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "Huaren",
          "fields": ["nickname", "nickname.pinyin"],
          "type": "best_fields",
          "operator": "OR",
          "minimum_should_match": "1",
          "tie_breaker": 1.0
        }
      },
      "filter": {
        "term": {"del": 0}
      }
    }
  }
}

If every business team had to write its own DSL for each requirement, the workload and maintenance complexity would be considerable.

Avoiding the spread of dependencies and restrictions

ES requires the client and server JDK versions to match as closely as possible.

ES2.x requires JDK7 or above

ES5.x requires JDK8 or above

A large number of JAR dependencies

Other possible restrictions

With a query agent in place, the business teams do not need to take on the dependencies and restrictions above.

Loose coupling and control

The query agent shields online business from changes in the underlying engine. For example, the underlying engine occasionally needs to be upgraded or restarted. In that case, only the query proxy layer needs to implement a mechanism such as primary/standby switching, and the engine upgrade becomes completely transparent to online business.

In addition, the query agent can block dangerous operations caused by business-side mistakes and keep the cluster from being directly exposed to each business team, which reduces the impact of uncertainty on the system.
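The two points above can be pictured with a minimal sketch. It is not the platform's actual code: the `QueryProxy` class, the cluster names, and the `MAX_WINDOW` cap are all assumptions chosen for illustration. It only shows how a proxy layer could switch clusters without callers noticing and reject a risky request before it reaches ES.

```python
from dataclasses import dataclass

MAX_WINDOW = 10_000  # assumed cap on from + size; chosen for illustration


@dataclass
class QueryProxy:
    """Hypothetical proxy that hides which ES cluster serves a request."""
    primary: str
    standby: str
    active: str = ""

    def __post_init__(self):
        # Serve from the primary cluster until told otherwise.
        self.active = self.active or self.primary

    def switch(self) -> None:
        """Flip traffic to the other cluster, e.g. before an upgrade."""
        self.active = self.standby if self.active == self.primary else self.primary

    def guard(self, start: int, size: int) -> None:
        """Reject paging requests that reach too deep into the index."""
        if start < 0 or size <= 0 or start + size > MAX_WINDOW:
            raise ValueError(f"rejected paging request: from={start}, size={size}")

    def route(self, start: int, size: int) -> str:
        """Validate the request and return the cluster that should serve it."""
        self.guard(start, size)
        return self.active
```

Business callers only ever call `route`; whether the primary or the standby cluster answers is decided inside the proxy.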

II. Implementation of the query proxy layer

Industry practice

A common practice in the industry is to use SQL as the proxy-layer language and implement a SQL-to-DSL parser, which works well when ES is used as a database. However, as mentioned earlier, the use case on a second-hand trading platform is document search, which involves complex sorting of documents. SQL cannot fully meet that requirement, and when documents have a large number of attributes the statements easily become complicated.

Scheme

Given the considerations above, our final implementation is as follows:

Request syntax

A request is divided into a query domain and a param domain. The query domain holds the filter and recall conditions, and the param domain holds the sorting parameters.

A domain is a combination of attribute fields.

Fields are expressed using URL parameter syntax.

Take the search for a user whose nickname is Huaren as an example. The request is as follows:

query: from=1&size=10&nickname=Huaren
param: null

The request is automatically converted into DSL like the example shown earlier. By comparison, it is much simpler.
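As a rough illustration of that conversion, the sketch below parses a query domain written in URL parameter syntax and builds an ES request body for the nickname search. The function name `to_dsl`, the `MULTI_MATCH_FIELDS` table, the default paging values, and the always-applied `del` filter are assumptions made for this example, not the platform's real implementation.

```python
from urllib.parse import parse_qs

# Hypothetical field table: request field -> ES fields searched by multi_match.
MULTI_MATCH_FIELDS = {"nickname": ["nickname", "nickname.pinyin"]}


def to_dsl(query: str) -> dict:
    """Translate a URL-parameter style query domain into an ES query body."""
    # parse_qs returns a list per key; keep the first value of each field.
    params = {k: v[0] for k, v in parse_qs(query).items()}
    must = []
    for face, fields in MULTI_MATCH_FIELDS.items():
        if face in params:
            must.append({
                "multi_match": {
                    "query": params[face],
                    "fields": fields,
                    "type": "best_fields",
                }
            })
    return {
        # Built-in fields; the defaults here are assumed, not documented.
        "from": int(params.get("from", 0)),
        "size": int(params.get("size", 20)),
        # Assumed: deleted documents are always filtered out.
        "query": {"bool": {"must": must, "filter": {"term": {"del": 0}}}},
    }


dsl = to_dsl("from=1&size=10&nickname=Huaren")
```

The caller writes one short line of parameters, and the nesting of `bool`, `must`, and `multi_match` stays inside the proxy.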

Implementation logic

Supplementary notes:

By parsing method, fields are roughly divided into built-in fields (start position, number of results, sorting strategy, etc.) and configuration fields (string, numeric, date, latitude and longitude, etc., which are parsed into the corresponding index field types supported by ES).

By usage scenario, configuration fields are divided into matching and filtering types, sorting parameter types, field sorting types, sorting types, secondary scoring types, and so on.

Each type of configuration field has its own configuration parser and request processor.

Steps such as applying field defaults and filtering out illegal fields are carried out during processing.

Processing also generates a digest of the query, which is used as the key for an external cache to reduce pressure on the ES cluster.

After validation, parsing, and processing, the request is assembled into ES DSL and sent to the ES cluster designated by the system.
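The default-filling, illegal-field filtering, and digest steps in the notes above could look roughly like the following. This is a sketch under stated assumptions: the `ALLOWED` and `DEFAULTS` tables and the choice of SHA-256 are illustrative, and the article does not say how the real digest is computed.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode

# Hypothetical configuration: allowed request fields and their default values.
ALLOWED = {"from", "size", "nickname"}
DEFAULTS = {"from": "0", "size": "20"}


def normalize(query: str) -> dict:
    """Drop fields that are not configured and fill in missing defaults."""
    params = {k: v for k, v in parse_qsl(query) if k in ALLOWED}
    return {**DEFAULTS, **params}


def digest(params: dict) -> str:
    """Build an order-independent digest to use as the cache key."""
    canonical = urlencode(sorted(params.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same logical query, written two ways, maps to one cache key.
a = digest(normalize("nickname=Huaren&size=20&debug=1"))
b = digest(normalize("size=20&from=0&nickname=Huaren"))
```

Sorting the parameters before hashing means requests that differ only in field order or in unknown fields share a cache entry, so fewer of them reach the ES cluster.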

Sample configuration:

entry.user:
  index: user
  type: user
  query_fields:
    - {face: id, type: Number, class: Long}
    - {face: nickname, type: StringMultiMatch, fieldName: "nickname,nickname.pinyin", _tie_breaker: 1}
  order_strategys:
    default:
      boostMode: multiply
      scores:
        - type: NumberTermsFilter
          fieldName: label_id
          class: Long
          values: "1141730738345"
          weight: 2
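To make the link between the configuration and the request processors more concrete, here is a small sketch of config-driven dispatch. It copies the two `query_fields` entries into a Python list rather than loading YAML, and the handler names, the `HANDLERS` registry, the mapping of `Number` to a `term` clause, and the handling of `_tie_breaker` as a pass-through option are all assumptions for illustration.

```python
# Field specs mirroring the two query_fields entries of the sample configuration.
QUERY_FIELDS = [
    {"face": "id", "type": "Number", "class": "Long"},
    {"face": "nickname", "type": "StringMultiMatch",
     "fieldName": "nickname,nickname.pinyin", "_tie_breaker": 1},
]


def number_clause(spec: dict, value: str) -> dict:
    """Exact match on a numeric field (index field name defaults to `face`)."""
    return {"term": {spec.get("fieldName", spec["face"]): int(value)}}


def multi_match_clause(spec: dict, value: str) -> dict:
    """Full-text match across the comma-separated index fields."""
    clause = {"query": value, "fields": spec["fieldName"].split(",")}
    if "_tie_breaker" in spec:
        # Assumed: underscore-prefixed keys are passed through as ES options.
        clause["tie_breaker"] = spec["_tie_breaker"]
    return {"multi_match": clause}


# One handler per configured field type (hypothetical registry).
HANDLERS = {"Number": number_clause, "StringMultiMatch": multi_match_clause}


def build_clauses(params: dict) -> list:
    """Turn request parameters into DSL clauses using the field specs."""
    return [HANDLERS[spec["type"]](spec, params[spec["face"]])
            for spec in QUERY_FIELDS if spec["face"] in params]


clauses = build_clauses({"id": "42", "nickname": "Huaren"})
```

With this shape, supporting a new searchable field means adding a configuration entry, and supporting a new field type means registering one more handler.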

III. Summary

Starting from why an ES query interface is necessary, this article has briefly described the syntax design and implementation logic of the ES query interface on a second-hand trading platform. Some parts may be imperfect, and corrections are welcome.