2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article covers the analysis process and the underlying principles of ElasticSearch, which many people do not understand well. The notes below summarize the main points.
I. Introduction to ElasticSearch Cluster
a. What is ElasticSearch?
1. Concepts:
Index: where ElasticSearch stores data
Document: the main entity stored in ElasticSearch
Document type: distinguishes between different kinds of objects
Nodes and clusters: ElasticSearch can run on multiple servers (nodes) that work together as a cluster
Shard: when the computing power or hardware of a single node is insufficient, the data can be split into parts. Each part is a separate Apache Lucene index, called a shard.
Replica: to improve query throughput or achieve high availability, you can enable shard replicas; a replica is an exact copy of the original shard
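How shards and replicas multiply can be sketched in a few lines of Python. The settings keys `number_of_shards` and `number_of_replicas` are ElasticSearch index settings; the helper names and the values 5 and 1 are illustrative assumptions, not taken from the notes.

```python
import json

def index_settings(primary_shards: int, replicas_per_shard: int) -> dict:
    """Build the settings body that would be sent when creating an index."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_shard,
        }
    }

def total_shard_copies(body: dict) -> int:
    """Each primary shard exists once, plus once more per replica."""
    s = body["settings"]
    return s["number_of_shards"] * (1 + s["number_of_replicas"])

body = index_settings(primary_shards=5, replicas_per_shard=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(body))  # 5 primaries + 5 replica copies = 10
```

Every one of those copies is a full Lucene index, so raising the replica count trades disk and memory for throughput and availability.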
2. Status view:
http://localhost:9200/
http://localhost:9200/_cluster/health?pretty
3. Operation: manipulate data through the REST API with the HTTP methods GET, POST, PUT, and DELETE
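The mapping from HTTP methods to operations can be sketched with Python's standard library. This only builds the requests and sends nothing; the path `/library/book/1` reuses the `library` index and `book` type from the commands later in these notes, while the document id and title are hypothetical.

```python
import json
from typing import Optional
from urllib.request import Request

BASE = "http://localhost:9200"  # address from the status URLs above

def build_request(method: str, path: str, body: Optional[dict] = None) -> Request:
    """Prepare (but do not send) a REST request for an ElasticSearch node."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(BASE + path, data=data, headers=headers, method=method)

health = build_request("GET", "/_cluster/health?pretty")
# Hypothetical document: id 1 with a single title field.
index_doc = build_request("PUT", "/library/book/1", {"title": "Crime and Punishment"})
delete_doc = build_request("DELETE", "/library/book/1")

for req in (health, index_doc, delete_doc):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would execute it against a running node.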
II. Searching data
a. The querying and indexing process
1. Indexing: the process of preparing a document sent to ES and storing it in the index
2. Searching: the process of matching documents that satisfy the query criteria
3. Analysis: the process of preparing the contents of a field and converting them into terms that can be written to a Lucene index
Tokenization: a tokenizer converts the input text into a stream of tokens
Filtering: zero or more filters process the tokens in the stream
4. Analyzer: a tokenizer combined with zero or more filters
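The analysis steps above (splitting text into a token stream, then passing it through filters) can be imitated in plain Python. This is a toy model and not Lucene's implementation; the stopword list and function names are assumptions made for the example.

```python
import re
from typing import Callable, Iterable, List

TokenFilter = Callable[[Iterable[str]], Iterable[str]]

def tokenize(text: str) -> List[str]:
    """Tokenizer: split the input text into a stream of tokens."""
    return re.findall(r"\w+", text)

def lowercase(tokens: Iterable[str]) -> Iterable[str]:
    """Filter: normalize every token to lower case."""
    return (t.lower() for t in tokens)

STOPWORDS = {"a", "and", "of", "the"}  # illustrative list only

def remove_stopwords(tokens: Iterable[str]) -> Iterable[str]:
    """Filter: drop tokens that carry little meaning."""
    return (t for t in tokens if t not in STOPWORDS)

def analyze(text: str, filters: List[TokenFilter]) -> List[str]:
    """Analyzer: one tokenizer followed by zero or more filters."""
    tokens: Iterable[str] = tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return list(tokens)

terms = analyze("Crime and Punishment", [lowercase, remove_stopwords])
print(terms)  # ['crime', 'punishment']
```

The order of the filters matters: here the stopword filter only recognizes `and` because the lowercase filter ran first.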
b. Querying ElasticSearch
1. Multiple simple queries can be combined into a single JSON object and sent to ElasticSearch; this is called the query DSL
2. Syntax:
curl -XGET 'localhost:9200/library/book/_search?q=title:crime&pretty=true'
curl -XGET 'localhost:9200/library/book/_search?pretty=true' -d '{"query": {"term": {"title": "crime"}}}'
curl -XGET 'localhost:9200/library/book/_search?pretty=true' -d @query.json
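The three commands express the same search in different ways: as a URI parameter, as an inline JSON body, and as a JSON body read from a file. A minimal Python sketch of the first two forms follows; the helper names are assumptions, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

def uri_search(index: str, doc_type: str, field: str, value: str) -> str:
    """URI form: the whole query travels in the q parameter."""
    params = urlencode({"q": f"{field}:{value}", "pretty": "true"}, safe=":")
    return f"localhost:9200/{index}/{doc_type}/_search?{params}"

def term_query(field: str, value: str) -> dict:
    """Query DSL form: the same search as a JSON object."""
    return {"query": {"term": {field: value}}}

url = uri_search("library", "book", "title", "crime")
payload = json.dumps(term_query("title", "crime"))

print(url)
print(payload)
```

Saving `payload` to a file named `query.json` gives the input expected by the third command.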
c. Basic queries
1. term: matches documents that have the given term in a given field
2. terms: matches documents that contain any of the given terms
3. match: takes the value given in the parameters, analyzes it, and builds the appropriate query from it
4. multi_match: like match, except that the fields parameter lets it run against multiple fields
5. query_string: supports the full Apache Lucene query syntax
6. field: a simplified version of the query_string query
7. ids: restricts the returned documents to those with the specified identifiers; it works on the _uid field
8. prefix: finds documents in which a field starts with the given prefix
9. fuzzy_like_this: finds all documents similar to the given text; it builds fuzzy strings from the text and selects the best differentiating terms they produce
10. fuzzy_like_this_field: like fuzzy_like_this, except that it works on a single field and does not support the fields property
11. fuzzy: the third fuzzy query; it finds results by computing the edit distance between the given term and the terms in the documents. It is CPU-intensive but useful where fuzzy matching is required
12. match_all: matches all documents in the index
13. wildcard: allows the characters * and ? in the value being searched for; its query body looks much like that of term, and it performs poorly
14. more_like_this: finds documents similar to the provided text
15. more_like_this_field: like more_like_this, except that it works on a single field and does not support the fields property
16. range: finds documents whose field value falls within a given range; it works on numeric and string fields, on a single field only, and its parameters are wrapped in the field name
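A few of the query types listed above can be written as small builder functions that return the JSON body for each. This is a sketch under stated assumptions: the field names `title`, `otitle`, and `year` are hypothetical, and only some of the listed queries are shown.

```python
import json
from typing import Iterable, Optional

def match(field: str, text: str) -> dict:
    """match: the text is analyzed before the query is built."""
    return {"match": {field: text}}

def multi_match(text: str, fields: Iterable[str]) -> dict:
    """multi_match: like match, but run against several fields."""
    return {"multi_match": {"query": text, "fields": list(fields)}}

def prefix(field: str, value: str) -> dict:
    """prefix: the field must start with the given value."""
    return {"prefix": {field: value}}

def wildcard(field: str, pattern: str) -> dict:
    """wildcard: * and ? are allowed in the pattern."""
    return {"wildcard": {field: pattern}}

def range_query(field: str, gte: Optional[int] = None, lte: Optional[int] = None) -> dict:
    """range: the bounds are wrapped inside the field name."""
    bounds = {k: v for k, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {"range": {field: bounds}}

def search_body(query: dict) -> dict:
    """Wrap a single query in the top-level search request body."""
    return {"query": query}

for q in (
    match("title", "crime and punishment"),
    multi_match("crime", ["title", "otitle"]),
    prefix("title", "cri"),
    wildcard("title", "cr?me*"),
    range_query("year", gte=1850, lte=1900),
):
    print(json.dumps(search_body(q)))
```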
d. Filtering query results
1. Add a filter section at the same level as the query property to use a filter in any search
2. range: restricts the search to documents whose field value lies within the given bounds
3. exists: selects only documents that have the specified field
4. missing: the opposite of exists; you can also specify which values are treated as null
5. script: filters documents using a computed value
6. type: returns all documents of the specified type
7. limit: limits the number of documents returned per shard for a given query
8. ids: useful when specific documents need to be filtered
9. bool, and, or, not: combine filters
10. Use "_name" to name a filter
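A request that combines a query with filters might look like the sketch below. It assumes the older request layout these notes appear to describe, in which a `filter` key sits beside `query` at the top level; later releases restructured filtering, so the layout should be checked against the reference for the version in use. The field names `year` and `title` are hypothetical.

```python
import json
from typing import Iterable, Optional

def range_filter(field: str, gte: Optional[int] = None, lte: Optional[int] = None) -> dict:
    """Keep documents whose field value lies within the given bounds."""
    bounds = {k: v for k, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {"range": {field: bounds}}

def exists_filter(field: str) -> dict:
    """Keep documents that have a value for the field."""
    return {"exists": {"field": field}}

def bool_filter(must: Iterable[dict] = (), must_not: Iterable[dict] = ()) -> dict:
    """Combine several filters into one."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if must_not:
        clauses["must_not"] = list(must_not)
    return {"bool": clauses}

def search_with_filter(query: dict, flt: dict) -> dict:
    # Assumption: top-level "filter" next to "query" (older request layout).
    return {"query": query, "filter": flt}

body = search_with_filter(
    {"match_all": {}},
    bool_filter(must=[range_filter("year", gte=1850, lte=1900), exists_filter("title")]),
)
print(json.dumps(body, indent=2))
```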
e. Compound queries
1. bool: should clauses may or may not match, must clauses must match, must_not clauses must not match
2. boosting: wraps two queries and lowers the score of the documents returned by one of them
3. constant_score: wraps another query (or filter) and gives every document returned by the wrapped query (or filter) a constant score, which allows strict control over the score assigned to each matching document
4. indices: useful when a query has to run against multiple indices
5. custom_filters_score: wraps a query and several filters
6. custom_boost_factor: wraps another query and multiplies the score of the documents it returns by a given factor
7. custom_score: uses a script to customize the score of another query
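The bool and constant_score items above can be sketched as builders for their JSON bodies. The field names and values are hypothetical, and only these two compound queries are shown.

```python
import json
from typing import Iterable

def bool_query(
    must: Iterable[dict] = (),
    should: Iterable[dict] = (),
    must_not: Iterable[dict] = (),
) -> dict:
    """must clauses are required, should clauses are optional, must_not clauses exclude."""
    clauses = {}
    for name, group in (("must", must), ("should", should), ("must_not", must_not)):
        group = list(group)
        if group:
            clauses[name] = group
    return {"bool": clauses}

def constant_score(flt: dict, boost: float) -> dict:
    """Every document matched by the wrapped filter gets the same score."""
    return {"constant_score": {"filter": flt, "boost": boost}}

query = bool_query(
    must=[{"term": {"title": "crime"}}],
    should=[{"range": {"year": {"gte": 1850, "lte": 1900}}}],
    must_not=[{"term": {"otitle": "idiot"}}],
)
flat = constant_score({"term": {"title": "crime"}}, boost=2.0)

print(json.dumps({"query": query}, indent=2))
print(json.dumps({"query": flat}))
```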
f. Sorting data
1. "sort": [{"_score": "desc"}]; by default, results are sorted by score in descending order
g. Using scripts
1. script: contains the script code; lang: the language the script is written in, mvel by default; params: an object containing the parameters
2. Available objects: doc, which gives access to the current document found, with its calculated score and field values; _source, which gives access to the source of the current document and the values defined in it; and _fields, which gives access to the values of fields in the document
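The three script properties fit together roughly as in the sketch below, which builds a script filter body. The layout follows the `script`, `lang`, and `params` keys named above and the older script filter syntax; the script text, the `year` field, and the `min_year` parameter are hypothetical, and script syntax changed considerably in later releases.

```python
import json

def script_filter(code: str, params: dict, lang: str = "mvel") -> dict:
    """Build a script filter: the code, its language, and its parameters."""
    # Assumption: older script filter layout with "script", "lang" and "params" keys.
    return {"script": {"script": code, "lang": lang, "params": params}}

# doc['year'].value reads the field value of the current document;
# min_year is supplied through params rather than hard-coded in the script.
flt = script_filter("doc['year'].value >= min_year", {"min_year": 1900})
print(json.dumps(flt, indent=2))
```

Passing values through `params` keeps the script text constant between requests, so only the parameters vary.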
III. Extending the structure and search
1. Turn off dynamic mapping: dynamic: false
2. Spatial indexing: geo_point
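Both settings live in the mapping. The sketch below builds a mapping with dynamic mapping turned off and a `geo_point` field, then uses a small stand-in function to show the effect: fields missing from the mapping are not indexed. It assumes the typed mapping layout of older releases (a `book` type with `string` fields); the field names and the sample document are hypothetical.

```python
import json
from typing import List

def book_mapping() -> dict:
    """Mapping for a hypothetical book type with dynamic mapping disabled."""
    return {
        "mappings": {
            "book": {
                "dynamic": False,  # unknown fields are not added to the mapping
                "properties": {
                    "title": {"type": "string"},
                    "location": {"type": "geo_point"},
                },
            }
        }
    }

def indexed_fields(doc: dict, type_mapping: dict) -> List[str]:
    """Toy stand-in: which top-level fields of doc would be indexed."""
    known = type_mapping["properties"]
    if type_mapping.get("dynamic", True) is False:
        return sorted(f for f in doc if f in known)
    return sorted(doc)

mapping = book_mapping()
doc = {
    "title": "Crime and Punishment",
    "location": {"lat": 59.94, "lon": 30.31},
    "publisher": "unknown",  # not in the mapping
}
print(json.dumps(mapping, indent=2))
print(indexed_fields(doc, mapping["mappings"]["book"]))  # ['location', 'title']
```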
IV. Search optimization
1. boost: the weight affects the ordering of results
2. Synonym filter: synonym
3. Span queries: span_term, span_first, span_near, span_or, span_not; a span is the start and end position of a term within a field
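Two of these features can be sketched as request bodies: index settings that define a synonym token filter, and a span_near query that requires two terms to appear close together and in order. The analyzer and filter names, the synonym pair, and the field name are all hypothetical.

```python
import json

def synonym_settings() -> dict:
    """Index settings with a custom analyzer that applies a synonym filter."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "book_synonyms": {
                        "type": "synonym",
                        "synonyms": ["crime, offence"],
                    }
                },
                "analyzer": {
                    "synonym_text": {
                        "type": "custom",
                        "tokenizer": "whitespace",
                        "filter": ["lowercase", "book_synonyms"],
                    }
                },
            }
        }
    }

def span_near(field: str, first: str, second: str, slop: int) -> dict:
    """Match documents where the two terms occur in order, at most slop positions apart."""
    return {
        "span_near": {
            "clauses": [
                {"span_term": {field: first}},
                {"span_term": {field: second}},
            ],
            "slop": slop,
            "in_order": True,
        }
    }

settings = synonym_settings()
query = span_near("title", "crime", "punishment", slop=1)
print(json.dumps(settings, indent=2))
print(json.dumps({"query": query}))
```

With a slop of 1, a title such as "crime and punishment" would satisfy the span query because one position separates the two terms.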
V. Combining indexing, analysis, and search
1. Parent-child mapping: _parent
2. Getting data from other systems: river
VI. Beyond search
1. Faceting: query, filter, terms, range, histogram, statistical, terms_stats, and geo_distance facets
2. Similar documents (more like this)
3. Reverse search (percolator)
VII. Managing clusters
a. Monitoring cluster state and health
1. Health status: curl http://localhost:9200/_cluster/health?pretty
2. Index statistics: curl http://localhost:9200/library/_stats?pretty
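The health endpoint returns a JSON document whose `status` field is green, yellow, or red. A minimal sketch of interpreting it follows; the sample response is a hypothetical, abbreviated example and not output captured from a real cluster.

```python
import json

# Hypothetical, abbreviated response from /_cluster/health
SAMPLE_HEALTH = """
{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "number_of_nodes": 1,
  "active_primary_shards": 5,
  "active_shards": 5,
  "unassigned_shards": 5
}
"""

MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primary shards are allocated, some replicas are not",
    "red": "at least one primary shard is not allocated",
}

def summarize_health(raw: str) -> str:
    """Turn a cluster health response into a one-line summary."""
    health = json.loads(raw)
    status = health["status"]
    return (
        f"{health['cluster_name']}: {status} ({MEANING[status]}); "
        f"{health['unassigned_shards']} unassigned shards"
    )

summary = summarize_health(SAMPLE_HEALTH)
print(summary)
```

A single-node cluster with replicas enabled typically reports yellow, because a replica is never placed on the same node as its primary.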
b. Instance and cluster diagnostic tools
1. Bigdesk plug-in
2. elasticsearch-head plug-in
3. elasticsearch-paramedic plug-in
4. SPM tool
c. Gateway
1. Available gateway types: local, Hadoop, and Amazon S3
d. Node discovery
1. Zen discovery is enabled by default and provides two discovery methods: multicast and unicast
VIII. Problem handling
1. Rebalancing: the process of moving shards between nodes in a cluster
2. Warming up: _warmer
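The idea of rebalancing can be illustrated with a toy model that moves shards from the fullest node to the emptiest one until the counts differ by at most one. This is not ElasticSearch's allocation algorithm, which takes more factors into account; the node and shard names are hypothetical.

```python
from typing import Dict, List, Tuple

def rebalance(allocation: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Even out shard counts in place; return the moves as (shard, source, target)."""
    moves = []
    while True:
        fullest = max(allocation, key=lambda node: len(allocation[node]))
        emptiest = min(allocation, key=lambda node: len(allocation[node]))
        if len(allocation[fullest]) - len(allocation[emptiest]) <= 1:
            return moves
        shard = allocation[fullest].pop()
        allocation[emptiest].append(shard)
        moves.append((shard, fullest, emptiest))

# A second, empty node has just joined the cluster.
cluster = {"node1": ["s0", "s1", "s2", "s3"], "node2": []}
moves = rebalance(cluster)
print(moves)
print({node: len(shards) for node, shards in cluster.items()})
```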