Updated 2025-04-01. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use the Elasticsearch Search API: the Java client entry points, the key attributes of SearchRequest, the most common query parameters, and URI Search.
Overview of Search API
The API signatures are as follows:
public final SearchResponse search(SearchRequest searchRequest, RequestOptions options) throws IOException
public final void searchAsync(SearchRequest searchRequest, RequestOptions options, ActionListener<SearchResponse> listener)
[Figure: SearchRequest class diagram]
The key attributes are described as follows:
private SearchType searchType = SearchType.DEFAULT: the search type. The possible values are:
QUERY_THEN_FETCH
First, the request is sent to the relevant shards according to the routing algorithm, and each shard returns only document IDs plus the information needed for ranking (for example, sort values). The coordinating node then merges and sorts the per-shard results, selects the top n documents the client asked for, and finally requests the full documents from the relevant shards by document ID.
QUERY_AND_FETCH
Deprecated starting with 5.4.x. The request goes directly to each shard, each shard returns the number of documents requested by the client, and the results are merged and returned to the client together. The response can therefore contain up to size * (number of shards after routing) documents.
DFS_QUERY_THEN_FETCH
Before the request is sent to each shard, term frequencies are collected across shards so that relevance is scored from global statistics; the rest of the flow is the same as QUERY_THEN_FETCH. Scoring is therefore more accurate with this search type, but performance is worse than with QUERY_THEN_FETCH.
private String[] indices: the indices to query.
private String routing: the routing value.
private String preference: the preference that determines which shard copies execute the search.
private SearchSourceBuilder source: the query body (request body), which will be explained later.
private Boolean requestCache: whether to enable the request cache.
private Boolean allowPartialSearchResults: whether partial results are allowed when some shards fail.
private Scroll scroll: the scroll API (for paging).
private int batchedReduceSize = DEFAULT_BATCHED_REDUCE_SIZE: the batch size for reducing shard results; the default is 512.
private int maxConcurrentShardRequests = 0: the value should not exceed 256. Its exact meaning needs further study.
private int preFilterShardSize = 128: its exact function needs further study.
private String[] types: the types to query.
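The QUERY_THEN_FETCH flow described above can be illustrated with a small simulation. This is only a sketch using plain Java collections, not Elasticsearch code: the class QueryThenFetchSketch, the ShardHit type, and the in-memory "shards" (maps from document ID to score) are all hypothetical stand-ins for the real query and fetch phases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class QueryThenFetchSketch {
    /** Query-phase hit: just enough to rank (shard, document ID, score). */
    public static final class ShardHit {
        public final int shard;
        public final String docId;
        public final double score;

        public ShardHit(int shard, String docId, double score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Sorts hits by descending score and keeps at most n of them. */
    static List<ShardHit> topN(List<ShardHit> hits, int n) {
        List<ShardHit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    /** shards.get(i) maps document ID to score for the documents matching on shard i. */
    public static List<String> search(List<Map<String, Double>> shards, int size) {
        List<ShardHit> candidates = new ArrayList<>();
        for (int shard = 0; shard < shards.size(); shard++) {
            // Query phase: each shard reports only its local top `size` IDs and scores.
            List<ShardHit> local = new ArrayList<>();
            for (Map.Entry<String, Double> e : shards.get(shard).entrySet()) {
                local.add(new ShardHit(shard, e.getKey(), e.getValue()));
            }
            candidates.addAll(topN(local, size));
        }
        // Coordinating node: merge, sort globally, keep the top `size`.
        List<String> fetched = new ArrayList<>();
        for (ShardHit hit : topN(candidates, size)) {
            // Fetch phase: only now would the full document be requested from hit.shard.
            fetched.add("shard" + hit.shard + ":" + hit.docId);
        }
        return fetched;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> shards = List.of(
                Map.of("a", 1.0, "b", 3.0),
                Map.of("c", 2.5, "d", 0.5));
        System.out.println(search(shards, 2)); // [shard0:b, shard1:c]
    }
}
```

Note that the query phase moves at most size lightweight entries per shard, and full documents are fetched only for the final top n. QUERY_AND_FETCH skips this second step, which is why it can return up to size * (number of shards) documents.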
Next, let's focus on several common parameters of the query API:
timeout
The timeout for the query.
from
The offset of the first result to return; a paging parameter similar to the start offset in a relational database. The default value is 0.
size
The number of hits to return per request when paging.
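As a rough illustration of how from and size work together, the sketch below converts a 1-based page number into a from offset and applies the from/size window to an already sorted list of hits. The class PagingSketch and its method names are hypothetical; this models the semantics only and does not call Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    /** Converts a 1-based page number into the `from` offset for a given page size. */
    public static int fromForPage(int page, int size) {
        if (page < 1 || size < 0) {
            throw new IllegalArgumentException("page must be >= 1 and size >= 0");
        }
        return (page - 1) * size;
    }

    /** Applies the from/size window to a list of hits that is already sorted. */
    public static <T> List<T> page(List<T> sortedHits, int from, int size) {
        int start = Math.min(from, sortedHits.size());
        int end = Math.min(start + size, sortedHits.size());
        return new ArrayList<>(sortedHits.subList(start, end));
    }

    public static void main(String[] args) {
        System.out.println(fromForPage(3, 10));                            // 20
        System.out.println(page(List.of("a", "b", "c", "d", "e"), 2, 2)); // [c, d]
    }
}
```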
search_type
The search type. Version 6.4.0 supports only QUERY_THEN_FETCH and DFS_QUERY_THEN_FETCH.
request_cache
Whether to use the request cache. If not set, the index-level setting applies; this will be explained in more detail in the discussion of the index management API.
allow_partial_search_results
Whether partial results are allowed. For example, suppose a query must be sent to three shards, two of which return results successfully while the third fails. If set to false, the whole request fails. If set to true, the partial results are returned. The default is true.
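The three-shard scenario above can be modeled in a few lines. This is a simplified sketch, not the coordinating node's real logic: the class PartialResultsSketch is hypothetical, and a failed shard is represented by a null entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialResultsSketch {
    /** shardResults.get(i) holds the hits from shard i, or null if that shard failed. */
    public static List<String> merge(List<List<String>> shardResults, boolean allowPartial) {
        List<String> hits = new ArrayList<>();
        int failed = 0;
        for (List<String> result : shardResults) {
            if (result == null) {
                failed++;
            } else {
                hits.addAll(result);
            }
        }
        // With partial results disallowed, one failed shard fails the whole request.
        if (failed > 0 && !allowPartial) {
            throw new IllegalStateException(failed + " of " + shardResults.size() + " shards failed");
        }
        return hits;
    }

    public static void main(String[] args) {
        List<List<String>> results = Arrays.asList(List.of("d1"), List.of("d2"), null);
        System.out.println(merge(results, true)); // [d1, d2]
    }
}
```

Calling merge(results, false) with the same input throws instead of returning the two successful shards' hits.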
terminate_after
The maximum number of documents to collect on each shard; when that number is reached, query execution on the shard terminates early.
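A minimal model of this per-shard limit is sketched below. The class TerminateAfterSketch and its ShardResult type are hypothetical, and the sketch assumes that a value of 0 means "no limit"; the exact boundary behavior inside Elasticsearch may differ from this simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class TerminateAfterSketch {
    /** What one shard reports: the documents it collected and whether it stopped early. */
    public static final class ShardResult {
        public final List<String> collected;
        public final boolean terminatedEarly;

        public ShardResult(List<String> collected, boolean terminatedEarly) {
            this.collected = collected;
            this.terminatedEarly = terminatedEarly;
        }
    }

    /** Collects matching documents on one shard, stopping once the limit is reached. */
    public static ShardResult collect(List<String> matchingDocs, int terminateAfter) {
        List<String> collected = new ArrayList<>();
        for (String doc : matchingDocs) {
            // Assumption: terminateAfter == 0 disables the limit.
            if (terminateAfter > 0 && collected.size() >= terminateAfter) {
                return new ShardResult(collected, true);
            }
            collected.add(doc);
        }
        return new ShardResult(collected, false);
    }

    public static void main(String[] args) {
        ShardResult r = collect(List.of("d1", "d2", "d3"), 2);
        System.out.println(r.collected + " terminatedEarly=" + r.terminatedEarly);
        // [d1, d2] terminatedEarly=true
    }
}
```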
batched_reduce_size
The number of shard results that are reduced at once on the coordinating node. If a request touches a very large number of shards, holding every shard's result until the end can consume a lot of memory on the coordinating node; this value acts as a protection mechanism by capping how many shard results are buffered before a partial reduction. The default is 512.
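The idea behind batched reduction can be sketched as follows. This is an illustrative model only, not the coordinating node's actual implementation: the class BatchedReduceSketch is hypothetical, each shard result is reduced to a single hit count, and "reducing" simply means summing.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedReduceSketch {
    /**
     * Reduces per-shard hit counts in batches.
     * Returns {total hits, peak number of results buffered at once}.
     */
    public static int[] reduce(int[] shardCounts, int batchedReduceSize) {
        if (batchedReduceSize < 2) {
            throw new IllegalArgumentException("batchedReduceSize must be >= 2");
        }
        List<Integer> buffer = new ArrayList<>();
        int peak = 0;
        for (int count : shardCounts) {
            buffer.add(count);
            peak = Math.max(peak, buffer.size());
            if (buffer.size() >= batchedReduceSize) {
                // Partial reduce: collapse the buffered results into a single entry.
                int partial = 0;
                for (int c : buffer) {
                    partial += c;
                }
                buffer.clear();
                buffer.add(partial);
            }
        }
        int total = 0;
        for (int c : buffer) {
            total += c;
        }
        return new int[] {total, peak};
    }

    public static void main(String[] args) {
        int[] small = reduce(new int[] {1, 2, 3, 4, 5}, 2);
        System.out.println(small[0] + " hits, peak buffer " + small[1]); // 15 hits, peak buffer 2
        int[] large = reduce(new int[] {1, 2, 3, 4, 5}, 512);
        System.out.println(large[0] + " hits, peak buffer " + large[1]); // 15 hits, peak buffer 5
    }
}
```

The total is the same either way; a smaller batch size only bounds how many shard results sit in memory at once.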
Note: the three parameters search_type, request_cache and allow_partial_search_results must be passed as URL-level parameters (query string parameters). Pay special attention to this if you use the low-level REST client.
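One way to keep this rule from being overlooked is to split the options before building the request. The sketch below is a hypothetical helper (ParamPlacementSketch is not part of any Elasticsearch client): it routes the three parameters named in the note to the query string and, as a simplification, treats every other option as belonging to the request body.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class ParamPlacementSketch {
    /** The three parameters that the note says must travel in the query string. */
    static final Set<String> URL_ONLY =
            Set.of("search_type", "request_cache", "allow_partial_search_results");

    /** Options that must be sent as query string parameters. */
    public static Map<String, Object> urlParams(Map<String, Object> options) {
        return filter(options, true);
    }

    /** Remaining options; this sketch assumes they all go into the request body. */
    public static Map<String, Object> bodyParams(Map<String, Object> options) {
        return filter(options, false);
    }

    private static Map<String, Object> filter(Map<String, Object> options, boolean urlOnly) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : options.entrySet()) {
            if (URL_ONLY.contains(e.getKey()) == urlOnly) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("search_type", "dfs_query_then_fetch");
        options.put("from", 0);
        options.put("size", 10);
        options.put("request_cache", true);
        System.out.println(urlParams(options));  // {search_type=dfs_query_then_fetch, request_cache=true}
        System.out.println(bodyParams(options)); // {from=0, size=10}
    }
}
```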
URI Search
Elasticsearch supports using the Search API through URI requests, but this mode does not expose all the options available in the request body. It is mainly used for testing, for example with curl commands.
An example of URI Search is as follows:
GET twitter/_search?q=user:kimchy
URI Search supports the following parameters:
q
The query string; its syntax maps to query_string in the query DSL.
df
The default field to use when a term in the query string has no field prefix.
analyzer
The analyzer used for the query string.
analyze_wildcard
Whether wildcard and prefix queries should be analyzed. The default value is false.
batched_reduce_size
The number of shard results reduced at once on the coordinating node; a protection mechanism that limits the coordinating node's memory consumption.
default_operator
The default operator. Available values are AND and OR; the default is OR.
lenient
Whether to ignore format-based failures. The default is false: supplying text for a numeric field, for example, throws an exception. If set to true, the exception is ignored.
explain
Similar to an execution plan: for each hit, includes an explanation of how its score was computed. The default is false.
_source
Filters the _source field. Set it to false to disable returning the _source field. This parameter supports wildcard expressions, such as obj.*, for field filtering.
stored_fields
Used for field filtering; described in detail in the field filtering section.
sort
The sort order, similar to the sorting syntax of a relational database: fieldName:asc or fieldName:desc. You can also use the special field _score to sort by score (the default).
track_scores
When sorting is used, set to true to still compute scores and include them in the returned hits.
track_total_hits
The default value is true, which means the total number of documents matching the query is included in the response.
timeout
The query timeout; by default there is no timeout.
terminate_after
The maximum number of documents to collect on each shard, after which the query terminates early. If set, the response contains a terminated_early flag indicating whether the query terminated early.
from
Used for paging; the offset of the first hit to return.
size
Used for paging; the number of hits to return.
search_type
The search type, described at the beginning of this article.
allow_partial_search_results
Whether to return partial results when execution fails on some shards. The default is true; the default can also be changed with the cluster setting search.default_allow_partial_results.
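To show how the parameters listed above combine into a request, here is a small sketch that assembles a URI Search path with URL-encoded parameters using only the Java standard library. The class UriSearchSketch is hypothetical, and post_date is an assumed field name used purely for illustration; the sketch builds the string and does not send a request.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class UriSearchSketch {
    /** Builds "/{index}/_search?k1=v1&k2=v2" with URL-encoded keys and values. */
    public static String uri(String index, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("/").append(index).append("/_search");
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "user:kimchy");
        params.put("sort", "post_date:desc"); // post_date is a hypothetical field
        params.put("size", "5");
        System.out.println(uri("twitter", params));
        // /twitter/_search?q=user%3Akimchy&sort=post_date%3Adesc&size=5
    }
}
```

Encoding matters because values such as user:kimchy contain characters (here the colon) that are usually percent-encoded in a query string.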
That's all for "how to use Elasticsearch Search API". Thank you for reading.