2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use Elasticsearch queries. The content is simple and should be easy to follow; work through the examples in order to learn how to query Elasticsearch.
Search API
There are two ways to perform a search: send the search parameters in the REST request URI, or send them in the REST request body. The request body method lets you define richer queries in JSON format. We use the URI method for one example and the request body method for everything after that.
The following _search request returns all documents in the bank index:
GET /bank/_search?q=*&sort=account_number:asc
q=* matches all documents, and sort=account_number:asc sorts the results by account_number in ascending order.
The returned result is as follows:
{
  "took": 82,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1000,
    "max_score": null,
    "hits": [
      {
        "_index": "bank",
        "_type": "account",
        "_id": "0",
        "sort": [0],
        "_score": null,
        "_source": { "account_number": 0, "balance": 16623, "firstname": "Bradshaw", "lastname": "Mckenzie", "age": 29, "gender": "F", "address": "244 Columbus Place", "employer": "Euron", "email": "bradshawmckenzie@euron.com", "city": "Hobucken", "state": "CO" }
      },
      {
        "_index": "bank",
        "_type": "account",
        "_id": "1",
        "sort": [1],
        "_score": null,
        "_source": { "account_number": 1, "balance": 39225, "firstname": "Amber", "lastname": "Duke", "age": 32, "gender": "M", "address": "880 Holmes Lane", "employer": "Pyrami", "email": "amberduke@pyrami.com", "city": "Brogan", "state": "IL" }
      },
      ...
    ]
  }
}
took: the time in milliseconds that Elasticsearch took to execute the search
timed_out: whether the search timed out
_shards: how many shards were searched, and how many of them succeeded or failed
hits: the search results
hits.total: the total number of documents that matched the search
sort: the sort value of each result; when no sort is specified, results are sorted by relevance
_score and max_score: the document scores, which reflect relevance; ignore these fields for now
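As a rough illustration of how a client might read these fields, here is a minimal Python sketch. It works on a hand-written dictionary shaped like the response above rather than on a live cluster, and the helper name summarize_response is hypothetical. It assumes hits.total is a plain number, as in the response shown; Elasticsearch 7 and later report it as an object with value and relation keys.

```python
def summarize_response(response):
    """Pull the commonly used fields out of a search response dictionary."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": response["_shards"]["failed"],
        "total": hits["total"],
        "sources": [hit["_source"] for hit in hits["hits"]],
    }


# Hand-written sample shaped like the response above (two fields per document).
sample = {
    "took": 82,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "hits": {
        "total": 1000,
        "max_score": None,
        "hits": [
            {"_id": "0", "sort": [0], "_score": None,
             "_source": {"account_number": 0, "balance": 16623}},
            {"_id": "1", "sort": [1], "_score": None,
             "_source": {"account_number": 1, "balance": 39225}},
        ],
    },
}

summary = summarize_response(sample)
print(summary["total"], [s["account_number"] for s in summary["sources"]])
```

Note that hits.total counts every matching document, while hits.hits holds only the documents actually returned.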
The same search using the request body method:
GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
The query field describes what to search for.
match_all is the type of query to run; it matches all documents.
The difference is that we pass a JSON request body to the _search API instead of the q=* parameter in the URI.
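To make the two request styles concrete, the following Python sketch builds both forms of the same search as plain strings, without contacting a cluster. The function names uri_search and body_search are hypothetical; only the standard library is used.

```python
import json
from urllib.parse import urlencode


def uri_search(index, q, sort):
    """Build the path and query string for a URI search."""
    # Keep "*" and ":" unescaped so the result reads like the example above.
    return f"/{index}/_search?" + urlencode({"q": q, "sort": sort}, safe="*:")


def body_search(index, query, sort):
    """Build the path and the JSON body for a request body search."""
    return f"/{index}/_search", json.dumps({"query": query, "sort": sort})


uri = uri_search("bank", "*", "account_number:asc")
path, body = body_search("bank", {"match_all": {}}, [{"account_number": "asc"}])
print(uri)
print(path, body)
```

The URI form is convenient for quick checks, while the body form can carry nested JSON that would be awkward to express as query string parameters.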
It is important to understand that once you receive the results, Elasticsearch has completely finished with the request and does not keep any server-side resources or open cursors for them. This is in sharp contrast to platforms such as SQL databases, where you can fetch the first part of a result set and then keep fetching the rest through a server-side cursor.
Basic syntax
In addition to the query parameter, we can pass other parameters that affect the search results. In the previous section we passed a sort parameter; here we pass size:
GET /bank/_search
{
  "query": { "match_all": {} },
  "size": 1
}
Note that if size is not specified, it defaults to 10.
The following example matches all documents and returns documents 11 through 20:
GET /bank/_search
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}
The from parameter (0-based) specifies which document to start from, and the size parameter specifies how many documents to return starting there. This is how paging is implemented. If from is not specified, it defaults to 0.
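The arithmetic behind paging can be sketched in a few lines of Python. The helper page_body is a hypothetical name; it only builds the request body for a 1-based page number and does not send anything.

```python
def page_body(page, page_size=10):
    """Return a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # number of documents to skip
        "size": page_size,               # number of documents to return
    }


second_page = page_body(2)
print(second_page)
```

Page 2 with the default page size skips the first 10 documents, which corresponds to documents 11 through 20 in the example above.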
The following example sorts all documents by account balance in descending order and returns the top 10 (the default size):
GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}
match query
Now that we have seen some basic search parameters, let's dig further into the query DSL. First, look at the fields returned for each document. By default, all fields are returned. The original content of the document is called the source (the _source key in each hit). If we do not want the whole document, we can ask for only some of its fields.
The following example returns only the account_number and balance fields (inside _source):
GET /bank/_search
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
Note that this only reduces the number of fields returned. The _source key is still present in each hit, but it contains only account_number and balance.
If you know SQL, this is similar to the field list in SELECT field_list FROM.
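The effect of source filtering on a single hit can be mimicked locally. This is only a sketch under simplifying assumptions: the helper project_source is hypothetical and handles top-level field names only, whereas the real _source option also accepts wildcard patterns.

```python
def project_source(source, fields):
    """Keep only the listed top-level fields of a document source."""
    return {name: source[name] for name in fields if name in source}


# Sample document based on the first hit of the earlier response.
account = {
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
}

trimmed = project_source(account, ["account_number", "balance"])
print(trimmed)
```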
Now let's move on to the query part. So far we have used the match_all query to match all documents. The match query is a new query type that searches on a specific field or set of fields.
The following example returns the account whose account_number is 20:
GET /bank/_search
{
  "query": { "match": { "account_number": 20 } }
}
The following example returns all accounts whose address field contains the term "mill":
GET /bank/_search
{
  "query": { "match": { "address": "mill" } }
}
The following example returns all accounts whose address field contains the term "mill" or "lane":
GET /bank/_search
{
  "query": { "match": { "address": "mill lane" } }
}
match_phrase query (phrase match)
The following example uses match_phrase, a variant of match, and returns all accounts whose address field contains the phrase "mill lane":
GET /bank/_search
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
bool query
Now let's introduce the bool query, which lets us combine several smaller queries into a single query using Boolean logic.
The following example combines two match queries and returns all accounts whose address field contains both "mill" and "lane":
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
In the example above, the must clause specifies that every one of its queries has to be true for a document to match.
In contrast, the following example combines two match queries and returns all accounts whose address field contains "mill" or "lane":
GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
In the example above, the should clause specifies a list of queries of which at least one has to be true for a document to match.
The following example combines two match queries and returns all accounts whose address field contains neither "mill" nor "lane":
GET /bank/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
In the example above, the must_not clause specifies a list of queries none of which may be true for a document to match.
We can combine must, should, and must_not clauses in the same bool query. We can also nest bool queries inside any of these clauses to build complex, multi-level Boolean logic.
The following example returns all accounts whose holder is 40 years old and does not live in Idaho (state ID):
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
filter query
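The Boolean logic of the must, should, and must_not clauses can be sketched with a small in-memory evaluator. This is a toy model, not how Elasticsearch executes queries: the sample accounts are invented, the match helper simply lowercases and splits on whitespace, scoring is ignored, and the rule that should is required only when there is no must clause is a simplification of the default behavior.

```python
def tokens(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return str(text).lower().split()


def match(field, text):
    """Predicate: the field shares at least one token with the query text."""
    wanted = set(tokens(text))
    return lambda doc: bool(wanted & set(tokens(doc[field])))


def bool_query(must=(), should=(), must_not=()):
    """Combine predicates the way the bool clauses are described above."""
    def predicate(doc):
        if not all(clause(doc) for clause in must):
            return False
        if any(clause(doc) for clause in must_not):
            return False
        # Without a must clause, at least one should clause has to match.
        if should and not must:
            return any(clause(doc) for clause in should)
        return True
    return predicate


# Invented sample accounts, for illustration only.
accounts = [
    {"address": "990 Mill Road", "age": 40, "state": "ID"},
    {"address": "198 Mill Lane", "age": 40, "state": "TX"},
    {"address": "880 Holmes Lane", "age": 32, "state": "IL"},
    {"address": "244 Columbus Place", "age": 29, "state": "CO"},
]


def search(query):
    """Return the addresses of all sample accounts accepted by the query."""
    return [a["address"] for a in accounts if query(a)]


mill, lane = match("address", "mill"), match("address", "lane")
print(search(bool_query(must=[mill, lane])))
print(search(bool_query(should=[mill, lane])))
print(search(bool_query(must_not=[mill, lane])))
print(search(bool_query(must=[match("age", "40")], must_not=[match("state", "ID")])))
```

Because bool_query returns an ordinary predicate, one bool query can be passed as a clause of another, which mirrors the nesting described above.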
In the previous section we skipped one small detail: the document score (the _score field in the search results). The score is a numeric value that estimates how well the document matches the search query. The higher the score, the more relevant the document; the lower the score, the less relevant.
However, queries do not always need to produce scores, especially when they are only used to filter the document set. Elasticsearch detects these situations and automatically optimizes query execution so that it does not compute unnecessary scores.
The bool query introduced in the previous section also supports a filter clause, which uses a query to restrict the documents matched by the other clauses without changing how scores are computed. As an example, let's introduce the range query, which filters documents by a range of values. It is commonly used for numeric or date filtering.
This example uses a bool query to return all accounts with a balance between 20000 and 30000, inclusive. In other words, we want accounts with a balance greater than or equal to 20000 and less than or equal to 30000:
GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
Looking closely at the example, the bool query consists of a match_all query (the query part) and a range query (the filter part). Either part can be replaced with any other query. The range query is a natural fit for a filter here: every document that falls within the range matches equally, so no document is more relevant than another, and the filter clause does not change the score.
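For readers who build request bodies in code, here is a minimal Python sketch of the same query. The helper names balance_range_body and satisfies_range are hypothetical; the second one only mimics the inclusive gte and lte bounds locally so the boundary behavior is easy to check.

```python
def balance_range_body(low, high):
    """Build a bool query that filters on an inclusive balance range."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {"range": {"balance": {"gte": low, "lte": high}}},
            }
        }
    }


def satisfies_range(value, bounds):
    """Apply gte/lte bounds locally; both ends are inclusive."""
    low = bounds.get("gte", float("-inf"))
    high = bounds.get("lte", float("inf"))
    return low <= value <= high


body = balance_range_body(20000, 30000)
bounds = body["query"]["bool"]["filter"]["range"]["balance"]
print([b for b in (16623, 20000, 25000, 30000, 39225) if satisfies_range(b, bounds)])
```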
In addition to the match_all, match, bool, and range queries, there are many other query types that we will not cover here. With a basic understanding of how queries work, it should not be difficult to apply what we have learned to other query types.
term query
GET /bank/_search
{
  "query": { "term": { "address": "789 Madison" } }
}
Exact match using match on the keyword sub-field
GET /bank/_search
{
  "query": { "match": { "address.keyword": "789 Madison" } }
}
Differences between term, match, match_phrase, query_string, and keyword exact matching
1. match
match: a loose, partial match. You specify a field name, and the input is analyzed into terms. For example, "hello world" is split into hello and world, and each term is matched separately. Any document whose field contains hello, world, or both is returned, so the query conditions are relatively loose.
2. term
term: sometimes equivalent to match. If we query for the single word hello, the result is the same as with match. If we query for "hello world", the result is very different, because the input is not analyzed: the query looks for the single term "hello world" in the field rather than for the separate words. Elasticsearch analyzes the field content at index time, so "hello world" is stored as the terms hello and world, and no term "hello world" exists. The query therefore returns nothing. This is the difference between term and match.
3. match_phrase
match_phrase: the input is analyzed, but every resulting term must appear in the field, adjacent to each other and in the same order. For "hello world", the field must contain hello followed directly by world. "hello that world" does not match, and neither does "world hello".
4. query_string
query_string: similar to match, but where match requires a field name, query_string searches across all fields by default, so its scope is broader.
5. Keyword exact matching
Unlike a term query on an analyzed text field, exact matching on the keyword sub-field requires the entire field value to be exactly equal to the query input.
In general, use match on full-text fields and term on other, non-text fields.
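The differences listed above can be reproduced with a toy model in Python. Everything here is a simplification made for illustration: analyze only lowercases and splits on whitespace, while a real analyzer does more, and the four functions are hypothetical stand-ins for the corresponding query types rather than Elasticsearch APIs.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(indexed_terms, query):
    """True if any analyzed query term appears in the field."""
    return any(term in indexed_terms for term in analyze(query))


def term(indexed_terms, query):
    """True only if the unanalyzed query equals one indexed term."""
    return query in indexed_terms


def match_phrase(indexed_terms, query):
    """True if the analyzed query terms appear adjacent and in order."""
    wanted = analyze(query)
    n = len(wanted)
    return any(indexed_terms[i:i + n] == wanted
               for i in range(len(indexed_terms) - n + 1))


def keyword_exact(original_value, query):
    """True only if the whole original field value equals the query."""
    return original_value == query


value = "hello world"
indexed = analyze(value)  # what a text field keeps: ["hello", "world"]

print(match(indexed, "hello there"))         # one shared term is enough
print(term(indexed, "hello world"))          # no single term "hello world"
print(match_phrase(indexed, "world hello"))  # wrong order
print(keyword_exact(value, "hello world"))   # whole value is equal
```

In this model, term fails on "hello world" for the reason given above: the indexed terms are hello and world, and neither equals the unanalyzed input.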
Executing aggregations
Aggregations let you group your data and compute statistics over it. The simplest way to think about them is as roughly equivalent to the SQL GROUP BY clause and the SQL aggregate functions. With Elasticsearch, a single request can return both the matching documents and the results of several aggregations over them. Running a search and multiple aggregations in one request is useful because it reduces the number of network round trips.
The following example groups all accounts by the state field and returns the top 10 states (the default) sorted by document count in descending order:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}
The aggregation above is conceptually similar to the following SQL:
SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;
The response (only part of it is shown):
{
  "took": 29,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1000,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 20,
      "sum_other_doc_count": 770,
      "buckets": [
        { "key": "ID", "doc_count": 27 },
        { "key": "TX", "doc_count": 27 },
        { "key": "AL", "doc_count": 25 },
        { "key": "MD", "doc_count": 25 },
        { "key": "TN", "doc_count": 23 },
        { "key": "MA", "doc_count": 21 },
        { "key": "NC", "doc_count": 21 },
        { "key": "ND", "doc_count": 21 },
        { "key": "ME", "doc_count": 20 },
        { "key": "MO", "doc_count": 20 }
      ]
    }
  }
}
We can see that there are 27 accounts in ID (Idaho), 27 in TX (Texas), and 25 in AL (Alabama).
Note that we set size to 0 because we do not need the matching documents, only the aggregation results.
Building on the previous example, the following example groups accounts by state and also calculates the average account balance for each state:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      },
      "aggs": {
        "average_balance": {
          "avg": { "field": "balance" }
        }
      }
    }
  }
}
Notice how the average_balance aggregation is nested inside the group_by_state aggregation. This pattern applies to all aggregations: you can nest aggregations inside aggregations as deeply as needed to summarize your data.
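What the nested aggregation computes can be sketched in plain Python over an in-memory list. The sample accounts are invented, the function name group_by_state_with_avg is hypothetical, and breaking ties by key is an assumption of this sketch, chosen so the output is deterministic.

```python
from collections import defaultdict


def group_by_state_with_avg(accounts, size=10):
    """Group accounts by state, then compute an average balance per group."""
    balances = defaultdict(list)
    for account in accounts:
        balances[account["state"]].append(account["balance"])

    buckets = [
        {
            "key": state,
            "doc_count": len(values),
            "average_balance": sum(values) / len(values),
        }
        for state, values in balances.items()
    ]
    # Largest groups first; ties broken by key (an assumption of this sketch).
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets[:size]


# Invented sample data, for illustration only.
sample_accounts = [
    {"state": "TX", "balance": 5000},
    {"state": "ID", "balance": 1000},
    {"state": "ID", "balance": 3000},
]

buckets = group_by_state_with_avg(sample_accounts)
print(buckets)
```

The outer loop plays the role of the terms aggregation, and the per-group average plays the role of the nested avg aggregation.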
Building on the previous example, we now sort the states by average account balance in descending order. This shows that an outer aggregation can order its buckets by the result of an aggregation nested inside it:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": { "average_balance": "desc" }
      },
      "aggs": {
        "average_balance": {
          "avg": { "field": "balance" }
        }
      }
    }
  }
}
The following example groups accounts by age bracket (20-29, 30-39, and 40-49), then by gender, and finally computes the average account balance for each gender within each bracket (for example, the average balance of female account holders aged 20-29):
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          { "from": 20, "to": 30 },
          { "from": 30, "to": 40 },
          { "from": 40, "to": 50 }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "average_balance": {
              "avg": { "field": "balance" }
            }
          }
        }
      }
    }
  }
}
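The bucketing logic of the request above can be sketched locally in Python. In a range aggregation, from is inclusive and to is exclusive, which is why the ranges 20 to 30, 30 to 40, and 40 to 50 correspond to the brackets 20-29, 30-39, and 40-49. The sample accounts are invented, and the function names and bucket labels are hypothetical choices for this sketch.

```python
from collections import defaultdict

AGE_RANGES = ((20, 30), (30, 40), (40, 50))


def age_bucket(age, ranges=AGE_RANGES):
    """Return a label for the range containing age, or None if none does."""
    for low, high in ranges:
        if low <= age < high:  # from is inclusive, to is exclusive
            return f"{low}-{high - 1}"
    return None


def average_balance_by_age_and_gender(accounts):
    """Average balance per (age bracket, gender) pair."""
    grouped = defaultdict(list)
    for account in accounts:
        bucket = age_bucket(account["age"])
        if bucket is not None:  # ages outside every range are dropped
            grouped[(bucket, account["gender"])].append(account["balance"])
    return {key: sum(values) / len(values) for key, values in grouped.items()}


# Invented sample data, for illustration only.
sample_accounts = [
    {"age": 29, "gender": "F", "balance": 16623},
    {"age": 32, "gender": "M", "balance": 39225},
    {"age": 30, "gender": "M", "balance": 1000},
    {"age": 55, "gender": "F", "balance": 9},
]

averages = average_balance_by_age_and_gender(sample_accounts)
print(averages)
```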
As a final example, group all accounts by age, and for each age compute the average balance of the M and F accounts as well as the overall average balance for that age:
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "aggAgg": {
      "terms": {
        "field": "age"
      },
      "aggs": {
        "ageBalanceAvg": {
          "avg": { "field": "balance" }
        },
        "genderAgg": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "avgAvg": {
              "avg": { "field": "balance" }
            }
          }
        }
      }
    }
  }
}
Thank you for reading. That concludes this walkthrough of how to use Elasticsearch queries. You should now have a better understanding of how they work; the specifics are best verified in practice.
© 2024 shulou.com SLNews company. All rights reserved.