How to use Elasticsearch Search API

2025-07-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces how to use the Elasticsearch Search API: the Java client methods, the key attributes of SearchRequest, the common query parameters, and URI Search.

Overview of Search API

The API signatures are as follows:

public final SearchResponse search(SearchRequest searchRequest, RequestOptions options) throws IOException

public final void searchAsync(SearchRequest searchRequest, RequestOptions options, ActionListener<SearchResponse> listener)

The key attributes of SearchRequest (class diagram not reproduced here) are described as follows:

private SearchType searchType = SearchType.DEFAULT: the search type. The possible values are:

QUERY_THEN_FETCH

First, the request is sent to the relevant shards according to the routing algorithm. Each shard returns only document IDs and the information needed for merging (for example, sort values). The coordinating node then merges and sorts the results from all shards and selects the top n documents that the client asked for. Finally, it requests the full documents from the relevant shards by document ID.

QUERY_AND_FETCH

Deprecated starting with version 5.4.x. The request goes directly to each shard, and each shard returns the number of documents requested by the client; all of them are then merged and returned to the client. The number of documents returned is therefore size * (the number of shards after routing).

DFS_QUERY_THEN_FETCH

Before the query is executed on each shard, term frequencies are collected from the shards so that relevance scores are computed from global statistics. The rest of the flow is the same as QUERY_THEN_FETCH. Scoring is therefore more accurate with this search type, but performance is worse than with QUERY_THEN_FETCH.
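The difference between the search types described above can be illustrated with a small simulation. This is only a sketch in plain Java, not Elasticsearch code: the class QueryThenFetchSketch, the Hit type, and the sample data are all hypothetical, and shards are modelled as in-memory lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model of the search phases; this is not Elasticsearch code.
class QueryThenFetchSketch {

    // What a shard returns in the query phase: a document ID and its score.
    static final class Hit {
        final String docId;
        final double score;

        Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    // Orders hits by score, highest first.
    static final Comparator<Hit> BY_SCORE_DESC =
            (a, b) -> Double.compare(b.score, a.score);

    // Query phase on one shard: return only the local top `size` hits.
    static List<Hit> queryShard(List<Hit> shard, int size) {
        List<Hit> sorted = new ArrayList<>(shard);
        sorted.sort(BY_SCORE_DESC);
        return new ArrayList<>(sorted.subList(0, Math.min(size, sorted.size())));
    }

    // Coordinator: merge the per-shard hits, sort globally, keep the top `size` IDs.
    // A fetch phase would then request only these documents from their shards.
    static List<String> queryThenFetch(List<List<Hit>> shards, int size) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> shard : shards) {
            merged.addAll(queryShard(shard, size));
        }
        merged.sort(BY_SCORE_DESC);
        List<String> ids = new ArrayList<>();
        for (Hit hit : merged.subList(0, Math.min(size, merged.size()))) {
            ids.add(hit.docId);
        }
        return ids;
    }

    // QUERY_AND_FETCH-style behaviour: every shard's top `size` hits are returned as-is.
    static int queryAndFetchCount(List<List<Hit>> shards, int size) {
        int total = 0;
        for (List<Hit> shard : shards) {
            total += queryShard(shard, size).size();
        }
        return total;
    }

    // Hypothetical sample data: two shards with three documents each.
    static List<List<Hit>> sampleShards() {
        List<Hit> shard0 = Arrays.asList(new Hit("a", 0.9), new Hit("b", 0.2), new Hit("c", 0.5));
        List<Hit> shard1 = Arrays.asList(new Hit("d", 0.8), new Hit("e", 0.7), new Hit("f", 0.1));
        return Arrays.asList(shard0, shard1);
    }

    public static void main(String[] args) {
        System.out.println(queryThenFetch(sampleShards(), 2));     // [a, d]
        System.out.println(queryAndFetchCount(sampleShards(), 2)); // 4
    }
}
```

With size = 2 and two shards, the query-then-fetch flow returns exactly two documents, while the query-and-fetch flow returns 2 * 2 = 4.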

private String[] indices: the indices to query.

private String routing: the routing value.

private String preference: the preference for which shard copies in a replication group the search runs on.

private SearchSourceBuilder source: the query body (request body), which will be explained later.

private Boolean requestCache: whether to enable the request cache.

private Boolean allowPartialSearchResults: whether partial results are allowed when some shards fail.

private Scroll scroll: the scroll API (used for paging).

private int batchedReduceSize = DEFAULT_BATCHED_REDUCE_SIZE: the batch reduce size; the default is 512.

private int maxConcurrentShardRequests = 0: limits how many shard requests the search executes concurrently; the value should not exceed 256.

private int preFilterShardSize = 128: the shard-count threshold above which a pre-filter round trip is used to skip shards that cannot match the query.

private String[] types: the types to query.

Next, let's focus on several common parameters of the query API:

timeout

The timeout of the query.

from

The offset of the first hit to return. This is a paging parameter, similar to the start offset of a paged query in a relational database. The default value is 0.

size

The number of hits to return per page in a paged query. The default value is 10.
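How from and size interact can be sketched as follows. This is a hypothetical illustration in plain Java (the class PagingSketch and its methods are not part of any Elasticsearch API). It relies on one fact about distributed paging: since any shard might hold the best hits, each shard has to supply its top from + size hits before the coordinating node can cut out the requested page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating from/size paging; not Elasticsearch code.
class PagingSketch {

    // Number of hits each shard must supply so the coordinator can build the page.
    static int perShardWindow(int from, int size) {
        return from + size;
    }

    // Cut one page out of hits that the coordinator has already merged and sorted.
    static <T> List<T> page(List<T> sortedHits, int from, int size) {
        int start = Math.min(from, sortedHits.size());
        int end = Math.min(start + size, sortedHits.size());
        return new ArrayList<>(sortedHits.subList(start, end));
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(page(hits, 2, 2));       // [c, d]
        System.out.println(perShardWindow(20, 10)); // 30
    }
}
```

The per-shard window grows with from, which is why deep paging with large from values becomes expensive.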

search_type

The search type. Version 6.4.0 supports only QUERY_THEN_FETCH and DFS_QUERY_THEN_FETCH.

request_cache

Whether to enable the request cache. If it is not set, the index-level setting applies; this will be explained in more detail when the index management API is covered.

allow_partial_search_results

Whether partial results are allowed. For example, suppose a query has to be sent to three shards, and only two of them return results while the third fails. If set to false, the whole request fails. If set to true, the partial results are returned. The default is true.
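The three-shard example above can be modelled in a few lines. This is a minimal sketch under stated assumptions: the class PartialResultsSketch is hypothetical, a failed shard is represented by a null entry, and an IllegalStateException stands in for the overall failure that the client would see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of partial-result handling; not Elasticsearch code.
class PartialResultsSketch {

    // Merge per-shard results; a null entry stands for a shard that failed.
    static List<String> merge(List<List<String>> shardResults, boolean allowPartial) {
        List<String> hits = new ArrayList<>();
        int failed = 0;
        for (List<String> result : shardResults) {
            if (result == null) {
                failed++;
            } else {
                hits.addAll(result);
            }
        }
        if (failed > 0 && !allowPartial) {
            // With partial results disallowed, one failed shard fails the whole request.
            throw new IllegalStateException(failed + " shard(s) failed and partial results are not allowed");
        }
        return hits;
    }

    public static void main(String[] args) {
        // Three shards: two succeed, one fails.
        List<List<String>> shards =
                Arrays.asList(Arrays.asList("doc1"), Arrays.asList("doc2"), null);
        System.out.println(merge(shards, true)); // [doc1, doc2]
        try {
            merge(shards, false);
        } catch (IllegalStateException e) {
            System.out.println("request failed: " + e.getMessage());
        }
    }
}
```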

terminate_after

The maximum number of documents to collect for each shard. When that number is reached, query execution on the shard terminates early.
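The per-shard limit can be pictured with the following sketch. It is a simplified, hypothetical model (the class TerminateAfterSketch and the ShardResult type are made up for illustration), and it assumes that a value of 0 means "no limit".

```java
// Toy model of per-shard early termination; not Elasticsearch code.
class TerminateAfterSketch {

    // Outcome of collecting documents on one shard.
    static final class ShardResult {
        final int collected;
        final boolean terminatedEarly;

        ShardResult(int collected, boolean terminatedEarly) {
            this.collected = collected;
            this.terminatedEarly = terminatedEarly;
        }
    }

    // Collect at most `terminateAfter` documents on a shard.
    // Assumption: terminateAfter == 0 means "no limit".
    // Simplification: the flag is set only when matching documents were left uncollected.
    static ShardResult collect(int matchingDocs, int terminateAfter) {
        if (terminateAfter <= 0 || matchingDocs <= terminateAfter) {
            return new ShardResult(matchingDocs, false);
        }
        return new ShardResult(terminateAfter, true);
    }

    public static void main(String[] args) {
        ShardResult limited = collect(1000, 100);
        System.out.println(limited.collected + " " + limited.terminatedEarly); // 100 true
        ShardResult small = collect(50, 100);
        System.out.println(small.collected + " " + small.terminatedEarly);     // 50 false
    }
}
```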

batched_reduce_size

The number of shard results that are reduced (merged) at once on the coordinating node. If a request touches a large number of shards, holding all shard results until the end can consume a lot of memory on the coordinating node. This value acts as a protection mechanism that bounds how many shard results are buffered before they are merged. The default is 512.
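The idea of reducing shard results in batches can be sketched as below. This is an illustrative model only: the class BatchedReduceSketch is hypothetical, each shard result is reduced to a single hit count, and "reducing" simply means summing. The point is that the buffer never holds more than batchedReduceSize shard results.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of batched reduction on the coordinating node; not Elasticsearch code.
class BatchedReduceSketch {

    // Returns {total hit count, number of partial reductions}.
    static long[] reduce(int[] shardHitCounts, int batchedReduceSize) {
        List<Integer> buffer = new ArrayList<>();
        long total = 0;
        long partialReductions = 0;
        for (int count : shardHitCounts) {
            buffer.add(count);
            // Once the buffer is full, merge it into the running result and free it.
            if (buffer.size() >= batchedReduceSize) {
                total += sum(buffer);
                buffer.clear();
                partialReductions++;
            }
        }
        total += sum(buffer); // final reduction of whatever is left
        return new long[] {total, partialReductions};
    }

    static long sum(List<Integer> values) {
        long s = 0;
        for (int v : values) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        long[] result = reduce(new int[] {3, 5, 2, 7, 1}, 2);
        System.out.println(result[0] + " hits, " + result[1] + " partial reductions"); // 18 hits, 2 partial reductions
    }
}
```

A smaller batch size lowers peak memory use on the coordinating node at the cost of more intermediate merge steps.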

Note: the three parameters search_type, request_cache and allow_partial_search_results must be passed at the URL level (as query string parameters). If you use the Rest Low Level API, pay special attention to this.

URI Search

Elasticsearch supports executing a search purely through URI request parameters. This mode does not expose all of the options available in the request body and is mainly used for testing, for example with curl commands.

An example of URI Search is as follows:

GET twitter/_search?q=user:kimchy

URI Search supports the following parameters:

q

The query string, whose syntax maps to the query_string query of the query DSL.

df

The default field to use when the query string does not specify a field prefix.

analyzer

The analyzer used for the query string.

analyze_wildcard

Whether wildcard and prefix queries should be analyzed. The default value is false.

batched_reduce_size

The number of shard results reduced at once on the coordinating node. It is mainly a protection mechanism that limits memory consumption on the coordinating node.

default_operator

The default operator. Available values are AND and OR; the default is OR.

lenient

Whether type conversion failures are ignored. The default is false: if a string value is supplied for a numeric field, an exception is thrown. If set to true, the exception is ignored.

explain

Similar to an execution plan: for each hit, the response includes an explanation of how its score was calculated. The default is false.

_source

Used to filter the _source field. Set it to false to disable returning the _source field. This parameter supports wildcard expressions, such as obj.*, for field filtering.

stored_fields

Used for field filtering, which is described in detail in the field filtering section.

sort

Sorting, with a syntax similar to that of a relational database: fieldName:asc or fieldName:desc. The special field _score sorts by score, which is the default.

track_scores

When sorting is used, set this to true to still compute scores and include them in the returned results.

track_total_hits

The default value is true, which means that the total number of documents matching the query is returned in the response.

timeout

The query timeout. By default there is no timeout.

terminate_after

The maximum number of documents to collect from each shard; when it is reached, the query terminates early. If it is set, the response contains a terminated_early field indicating whether the query terminated early.

from

Used for paging: the offset of the first hit to return.

size

Used for paging: the number of hits to return for a query.

search_type

The search type, which is described at the beginning of the article.

allow_partial_search_results

Whether to allow partial results when some shards fail. The default is true; the default can also be changed through the cluster setting search.default_allow_partial_results.
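Because URI Search parameters travel in the query string, their values have to be URL-encoded. The following is a minimal sketch of assembling such a request path in plain Java. The class UriSearchSketch and its methods are hypothetical helpers, not part of any Elasticsearch client; the index name twitter and the query user:kimchy come from the example above, while the other parameter values are arbitrary.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assembles a URI Search path; not an Elasticsearch API.
class UriSearchSketch {

    // Build "/<index>/_search?k1=v1&k2=v2" with URL-encoded keys and values.
    static String buildSearchUri(String index, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("/").append(index).append("/_search");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            sb.append(separator)
              .append(encode(entry.getKey()))
              .append('=')
              .append(encode(entry.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "user:kimchy");
        params.put("default_operator", "AND");
        params.put("size", "5");
        System.out.println(buildSearchUri("twitter", params));
        // /twitter/_search?q=user%3Akimchy&default_operator=AND&size=5
    }
}
```

The resulting path can be sent with any HTTP client, for example as the target of a GET request against the cluster address.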

That's all for the content of "how to use Elasticsearch Search API". Thank you for reading.