In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-31 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly shows you "how to use SQL instead of DSL", the content is easy to understand, clear, hope to help you solve your doubts, the following let the editor lead you to study and learn "how to use SQL instead of DSL" this article.
SQL REST API
In Kibana Console, enter:
POST / _ sql?format=txt {"query": "SELECT * FROM library ORDER BY page_count DESC LIMIT 5"}
Replace the above SQL with your own SQL statement, and you can. The format of the return is as follows:
Author | name | page_count | release_date-+----Peter F. Hamilton | Pandora's Star | 768 | 2004-03-02T00: 00:00.000ZVernor Vinge | A Fire Upon the Deep | 613 | 1992-06-01T00:00:00.000ZFrank Herbert | Dune | 604 | 1965-06-01T00:00:00.000ZSQL CLI
Elasticsearch-sql-cli is a script file in the bin directory when you install ES, or you can download it separately. We run it in the ES directory
. / bin/elasticsearch-sql-cli https://some.server:9200
Enter SQL to query
Sql > SELECT * FROM library WHERE page_count > 500ORDER BY page_count DESC Author | name | page_count | release_date-+-Peter F. Hamilton | Pandora's Star | 768 | 1078185600000Vernor Vinge | A Fire Upon the Deep | 613 | | 707356800000Frank Herbert | Dune | 604 |-144720000000SQL To DSL |
In Kibana, enter:
POST / _ sql/translate {"query": "SELECT * FROM library ORDER BY page_count DESC", "fetch_size": 10}
The transformed DSL query can be obtained:
{"size": 10, "docvalue_fields": [{"field": "release_date", "format": "epoch_millis"}], "_ source": {"includes": ["author", "name", "page_count"], "excludes": []} "sort": [{"page_count": {"order": "desc", "missing": "_ first", "unmapped_type": "short"}]}
Because the query-related statements have been generated, we only need to modify it appropriately or not to use DSL happily.
Let's describe in detail the SQL statements supported by ES SQL and how to avoid misuse.
First of all, you need to understand the correspondence between SQL terms and ES terms in the SQL statements supported by ES SQL:
Most of the syntax support of ES SQL follows the ANSI SQL standard, and the SQL statements supported are DML query and some DDL query. DDL queries such as DESCRIBE table,SHOW COLUMNS IN table show a little bit of chicken ribs. Let's take a look at the DML query support for SELECT,Function.
SELECT
The grammatical structure is as follows:
SELECT [TOP [count]] select_expr [,...] [FROM table_name] [WHERE condition] [GROUP BY grouping_element [,...]] [HAVING condition] [ORDER BY expression [ASC | DESC] [,...] [LIMIT [count]] [PIVOT (aggregation_expr FOR column IN (value [AS] alias] [,...]))]
Indicates that row data is obtained from 0murn tables. The order in which SQL is executed is:
Get all the keywords in FROM and determine the table name.
If there is a WHERE condition, filter out all rows that do not match.
If there is a GROUP BY condition, the grouping is aggregated; if there is a HAVING condition, the result of the aggregation is filtered.
The result obtained in the previous step is calculated by select_expr to determine the specific data returned.
If there is an ORDER BY condition, the returned data is sorted.
If there is a LIMIT or TOP condition, a subset of the result of the previous step is returned.
Unlike commonly used SQL, ES SQL supports TOP [count] and PIVOT (aggregation_expr FOR column IN (value [[AS] alias] [,...]) clauses.
TOP [count]: for example, SELECT TOP 2 first_name FROM emp means a maximum of two pieces of data are returned and cannot be shared with LIMIT conditions.
The PIVOT clause performs a row-to-column transformation of the result obtained by its aggregation condition for further operation. I haven't used this before. I don't make introductions.
FUNCTION
Based on the above SQL, we can actually have SQL with filtering, aggregation, sorting, and paging functions. But we need to know more about FUNCTION support in ES SQL in order to write a rich SQL with full-text search, aggregation, and grouping functions. Use SHOW FUNCTIONS to enumerate the supported function names and belonging types.
SHOW FUNCTIONS; name | type-+-AVG | AGGREGATECOUNT | AGGREGATEFIRST | AGGREGATEFIRST_VALUE | AGGREGATELAST | AGGREGATELAST_VALUE | AGGREGATEMAX | AGGREGATEMIN | AGGREGATESUM | AGGREGATE.
We mainly look at the common functions related to aggregation, grouping, and full-text search.
Full-text matching function
MATCH: the equivalent of a match and multi_match query in DSL.
MATCH (field_exp,-- field name constant_exp,-- matching value of field [, options])-- optional
Examples of use:
SELECT author, name FROM library WHERE MATCH (author, 'frank'); author | name-+-Frank Herbert | DuneFrank Herbert | Dune MessiahSELECT author, name, SCORE () FROM library WHERE MATCH (' author ^ 2, name ^ 5', 'frank dune') Author | name | SCORE ()-+-Frank Herbert | Dune | 11.443176Frank Herbert | Dune Messiah | 9.446629
QUERY: the equivalent of a query_string query in DSL.
QUERY (constant_exp-matching value expression [, options])-optional
Examples of use:
SELECT author, name, page_count, SCORE () FROM library WHERE QUERY ('_ exists_: "author" AND page_count: > 200AND (name:/star.*/ OR name:duna~)') Author | name | page_count | SCORE ()-+-Frank Herbert | Dune | 604 | 3.7164764Frank Herbert | Dune Messiah | 331 | 3.4169943
SCORE (): returns the correlation between the input data and the returned data relevance. Examples of use:
SELECT SCORE (), * FROM library WHERE MATCH (name, 'dune') ORDER BY SCORE () DESC SCORE () | author | name | page_count | release_date-+-2. 2886353 | Frank Herbert | Dune | 604 | 1965-06-01T00:00:00Z1.8893257 | Frank Herbert | Dune Messiah | 1969-10-15T00:00:00Z
Aggregate function
AVG (numeric_field): calculates the average of fields of a numeric type.
SELECT AVG (salary) AS avg FROM emp
COUNT (expression): returns the total number of input data, including data whose field_name corresponds to null when COUNT (). COUNT (ALL field_name): returns the total number of input data, excluding data with a value of null corresponding to field_name. COUNT (DISTINCT field_name): returns the total number of field_name values in the input data that are not null. SUM (field_name): returns the sum of the values for the numeric field field_name in the input data. MIN (field_name): returns the minimum value corresponding to the numeric field field_name in the input data. MAX (field_name): returns the maximum value corresponding to the numeric field field_name in the input data.
Grouping function
The grouping function here corresponds to the bucket grouping in DSL.
HISTOGRAM: the syntax is as follows:
HISTOGRAM (numeric_exp,-- numeric expression, usually a field_name numeric_interval-- interval value of a number) HISTOGRAM (date_exp,-- date/time expression, usually an interval value of a field_name date_time_interval-- date/time)
The data of births in the early morning of January 1 each year are returned as follows:
ELECT HISTOGRAM (birth_date, INTERVAL 1 YEAR) AS h, COUNT (*) AS c FROM emp GROUP BY h H | c----+-null | 101952-01-01T00:00:00.000Z | 81953-01-01T00:00:00.000Z | 111954-01-01T00:00:00.000Z | 81955-01-01T00:00:00.000Z | 41956-01-01T00:00:00.000Z | 51957-01- 01T00:00:00.000Z | 41958-01-01T00:00:00.000Z | 71959-01-01T00:00:00.000Z | 91960-01-01T00:00:00.000Z | 81961-01-01T00:00:00.000Z | 81962-01-01T00:00:00.000Z | 61963-01-01T00:00:00.000Z | 71964-01-01T00:00:00.000Z | 41965-01-01T00:00:00.000Z | 1ES SQL limitation
Because ES SQL and ES DSL do not exactly match in functionality, the limitations of SQL mentioned in the official documentation are:
Large queries may throw ParsingException
During the parsing phase, extremely large queries take up too much memory, in which case the Elasticsearch SQL engine aborts parsing and throws an error.
The representation of nested type field
Fields of type nested are not supported in SQL, you can only use the
[nested_field_name]. [sub_field_name]
This form refers to embedded subfields. Examples of use:
SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages
Nested type fields cannot be used on Scalar functions of where and order by
Such as the following SQL are all wrong
SELECT * FROM test_emp WHERE LENGTH (dep.dep_name.keyword) > 5th select * FROM test_emp ORDER BY YEAR (dep.start_date)
Simultaneous query of multiple nested fields is not supported
For example, the nested fields nested_A and nested_B cannot be used together.
Nested inner field paging limit
When a paging query has a nested field, the paging results may be incorrect. This is because paging queries in ES take place on Root nested document, not on its inner fields.
Fields of type keyword do not support normalizer
Fields of array type are not supported
This is because in SQL, a field corresponds to only one value, in which case we can use the API of SQL To DSL described above to convert it into a DSL statement and query it with DSL.
Restrictions on aggregate sorting
The sort field must be a field in the aggregation bucket. ES SQL CLI breaks this limit, but the upper limit cannot exceed 512 lines, otherwise an exception will be thrown during the sorting phase. It is recommended to use with Limit clause, such as:
SELECT * FROM test GROUP BY age ORDER BY COUNT (*) LIMIT 100
The sort condition of aggregate sorting does not support Scalar functions or simple operator operations. Complex fields after aggregation (such as containing aggregate functions) cannot also be used in sorting conditions.
The following are examples of errors:
SELECT age, ROUND (AVG (salary)) AS avg FROM test GROUP BY age ORDER BY avg;SELECT age, MAX (salary)-MIN (salary) AS diff FROM test GROUP BY age ORDER BY diff
Restrictions on subqueries
The subquery contains GROUP BY or HAVING or SELECT X FROM (SELECT.) WHERE [simple_condition] this kind of structure is complex, it is possible to execute unsuccessfully.
Fields of TIME data type do not support GROUP BY condition and HISTOGRAM function
Such as the following query is incorrect:
SELECT count (*) FROM test GROUP BY CAST (date_created AS TIME); SELECT HISTOGRAM (CAST (birth_date AS TIME), INTERVAL '10' MINUTES) as h, COUNT (*) FROM t GROUP BY h
However, wrapping fields of type TIME as returned by the Scalar function supports GROUP BY, such as:
SELECT count (*) FROM test GROUP BY MINUTE ((CAST (date_created AS TIME)
The limits of returned fields cannot be queried if a field is not stored in source. Fields of the keyword, date, scaled_float, geo_point, and geo_shape types are not subject to this restriction, because they are returned not from _ source, but from docvalue_fields.
The above is all the contents of the article "how to use SQL instead of DSL". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.