Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to use SQL instead of DSL

2025-03-31 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly shows you "how to use SQL instead of DSL", the content is easy to understand, clear, hope to help you solve your doubts, the following let the editor lead you to study and learn "how to use SQL instead of DSL" this article.

SQL REST API

In Kibana Console, enter:

POST / _ sql?format=txt {"query": "SELECT * FROM library ORDER BY page_count DESC LIMIT 5"}

Replace the above SQL with your own SQL statement, and you can. The format of the return is as follows:

Author | name | page_count | release_date-+----Peter F. Hamilton | Pandora's Star | 768 | 2004-03-02T00: 00:00.000ZVernor Vinge | A Fire Upon the Deep | 613 | 1992-06-01T00:00:00.000ZFrank Herbert | Dune | 604 | 1965-06-01T00:00:00.000ZSQL CLI

Elasticsearch-sql-cli is a script file in the bin directory when you install ES, or you can download it separately. We run it in the ES directory

. / bin/elasticsearch-sql-cli https://some.server:9200

Enter SQL to query

Sql > SELECT * FROM library WHERE page_count > 500ORDER BY page_count DESC Author | name | page_count | release_date-+-Peter F. Hamilton | Pandora's Star | 768 | 1078185600000Vernor Vinge | A Fire Upon the Deep | 613 | | 707356800000Frank Herbert | Dune | 604 |-144720000000SQL To DSL |

In Kibana, enter:

POST / _ sql/translate {"query": "SELECT * FROM library ORDER BY page_count DESC", "fetch_size": 10}

The transformed DSL query can be obtained:

{"size": 10, "docvalue_fields": [{"field": "release_date", "format": "epoch_millis"}], "_ source": {"includes": ["author", "name", "page_count"], "excludes": []} "sort": [{"page_count": {"order": "desc", "missing": "_ first", "unmapped_type": "short"}]}

Because the query-related statements have been generated, we only need to modify it appropriately or not to use DSL happily.

Let's describe in detail the SQL statements supported by ES SQL and how to avoid misuse.

First of all, you need to understand the correspondence between SQL terms and ES terms in the SQL statements supported by ES SQL:

Most of the syntax support of ES SQL follows the ANSI SQL standard, and the SQL statements supported are DML query and some DDL query. DDL queries such as DESCRIBE table,SHOW COLUMNS IN table show a little bit of chicken ribs. Let's take a look at the DML query support for SELECT,Function.

SELECT

The grammatical structure is as follows:

SELECT [TOP [count]] select_expr [,...] [FROM table_name] [WHERE condition] [GROUP BY grouping_element [,...]] [HAVING condition] [ORDER BY expression [ASC | DESC] [,...] [LIMIT [count]] [PIVOT (aggregation_expr FOR column IN (value [AS] alias] [,...]))]

Indicates that row data is obtained from 0murn tables. The order in which SQL is executed is:

Get all the keywords in FROM and determine the table name.

If there is a WHERE condition, filter out all rows that do not match.

If there is a GROUP BY condition, the grouping is aggregated; if there is a HAVING condition, the result of the aggregation is filtered.

The result obtained in the previous step is calculated by select_expr to determine the specific data returned.

If there is an ORDER BY condition, the returned data is sorted.

If there is a LIMIT or TOP condition, a subset of the result of the previous step is returned.

Unlike commonly used SQL, ES SQL supports TOP [count] and PIVOT (aggregation_expr FOR column IN (value [[AS] alias] [,...]) clauses.

TOP [count]: for example, SELECT TOP 2 first_name FROM emp means a maximum of two pieces of data are returned and cannot be shared with LIMIT conditions.

The PIVOT clause performs a row-to-column transformation of the result obtained by its aggregation condition for further operation. I haven't used this before. I don't make introductions.

FUNCTION

Based on the above SQL, we can actually have SQL with filtering, aggregation, sorting, and paging functions. But we need to know more about FUNCTION support in ES SQL in order to write a rich SQL with full-text search, aggregation, and grouping functions. Use SHOW FUNCTIONS to enumerate the supported function names and belonging types.

SHOW FUNCTIONS; name | type-+-AVG | AGGREGATECOUNT | AGGREGATEFIRST | AGGREGATEFIRST_VALUE | AGGREGATELAST | AGGREGATELAST_VALUE | AGGREGATEMAX | AGGREGATEMIN | AGGREGATESUM | AGGREGATE.

We mainly look at the common functions related to aggregation, grouping, and full-text search.

Full-text matching function

MATCH: the equivalent of a match and multi_match query in DSL.

MATCH (field_exp,-- field name constant_exp,-- matching value of field [, options])-- optional

Examples of use:

SELECT author, name FROM library WHERE MATCH (author, 'frank'); author | name-+-Frank Herbert | DuneFrank Herbert | Dune MessiahSELECT author, name, SCORE () FROM library WHERE MATCH (' author ^ 2, name ^ 5', 'frank dune') Author | name | SCORE ()-+-Frank Herbert | Dune | 11.443176Frank Herbert | Dune Messiah | 9.446629

QUERY: the equivalent of a query_string query in DSL.

QUERY (constant_exp-matching value expression [, options])-optional

Examples of use:

SELECT author, name, page_count, SCORE () FROM library WHERE QUERY ('_ exists_: "author" AND page_count: > 200AND (name:/star.*/ OR name:duna~)') Author | name | page_count | SCORE ()-+-Frank Herbert | Dune | 604 | 3.7164764Frank Herbert | Dune Messiah | 331 | 3.4169943

SCORE (): returns the correlation between the input data and the returned data relevance. Examples of use:

SELECT SCORE (), * FROM library WHERE MATCH (name, 'dune') ORDER BY SCORE () DESC SCORE () | author | name | page_count | release_date-+-2. 2886353 | Frank Herbert | Dune | 604 | 1965-06-01T00:00:00Z1.8893257 | Frank Herbert | Dune Messiah | 1969-10-15T00:00:00Z

Aggregate function

AVG (numeric_field): calculates the average of fields of a numeric type.

SELECT AVG (salary) AS avg FROM emp

COUNT (expression): returns the total number of input data, including data whose field_name corresponds to null when COUNT (). COUNT (ALL field_name): returns the total number of input data, excluding data with a value of null corresponding to field_name. COUNT (DISTINCT field_name): returns the total number of field_name values in the input data that are not null. SUM (field_name): returns the sum of the values for the numeric field field_name in the input data. MIN (field_name): returns the minimum value corresponding to the numeric field field_name in the input data. MAX (field_name): returns the maximum value corresponding to the numeric field field_name in the input data.

Grouping function

The grouping function here corresponds to the bucket grouping in DSL.

HISTOGRAM: the syntax is as follows:

HISTOGRAM (numeric_exp,-- numeric expression, usually a field_name numeric_interval-- interval value of a number) HISTOGRAM (date_exp,-- date/time expression, usually an interval value of a field_name date_time_interval-- date/time)

The data of births in the early morning of January 1 each year are returned as follows:

ELECT HISTOGRAM (birth_date, INTERVAL 1 YEAR) AS h, COUNT (*) AS c FROM emp GROUP BY h H | c----+-null | 101952-01-01T00:00:00.000Z | 81953-01-01T00:00:00.000Z | 111954-01-01T00:00:00.000Z | 81955-01-01T00:00:00.000Z | 41956-01-01T00:00:00.000Z | 51957-01- 01T00:00:00.000Z | 41958-01-01T00:00:00.000Z | 71959-01-01T00:00:00.000Z | 91960-01-01T00:00:00.000Z | 81961-01-01T00:00:00.000Z | 81962-01-01T00:00:00.000Z | 61963-01-01T00:00:00.000Z | 71964-01-01T00:00:00.000Z | 41965-01-01T00:00:00.000Z | 1ES SQL limitation

Because ES SQL and ES DSL do not exactly match in functionality, the limitations of SQL mentioned in the official documentation are:

Large queries may throw ParsingException

During the parsing phase, extremely large queries take up too much memory, in which case the Elasticsearch SQL engine aborts parsing and throws an error.

The representation of nested type field

Fields of type nested are not supported in SQL, you can only use the

[nested_field_name]. [sub_field_name]

This form refers to embedded subfields. Examples of use:

SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages

Nested type fields cannot be used on Scalar functions of where and order by

Such as the following SQL are all wrong

SELECT * FROM test_emp WHERE LENGTH (dep.dep_name.keyword) > 5th select * FROM test_emp ORDER BY YEAR (dep.start_date)

Simultaneous query of multiple nested fields is not supported

For example, the nested fields nested_A and nested_B cannot be used together.

Nested inner field paging limit

When a paging query has a nested field, the paging results may be incorrect. This is because paging queries in ES take place on Root nested document, not on its inner fields.

Fields of type keyword do not support normalizer

Fields of array type are not supported

This is because in SQL, a field corresponds to only one value, in which case we can use the API of SQL To DSL described above to convert it into a DSL statement and query it with DSL.

Restrictions on aggregate sorting

The sort field must be a field in the aggregation bucket. ES SQL CLI breaks this limit, but the upper limit cannot exceed 512 lines, otherwise an exception will be thrown during the sorting phase. It is recommended to use with Limit clause, such as:

SELECT * FROM test GROUP BY age ORDER BY COUNT (*) LIMIT 100

The sort condition of aggregate sorting does not support Scalar functions or simple operator operations. Complex fields after aggregation (such as containing aggregate functions) cannot also be used in sorting conditions.

The following are examples of errors:

SELECT age, ROUND (AVG (salary)) AS avg FROM test GROUP BY age ORDER BY avg;SELECT age, MAX (salary)-MIN (salary) AS diff FROM test GROUP BY age ORDER BY diff

Restrictions on subqueries

The subquery contains GROUP BY or HAVING or SELECT X FROM (SELECT.) WHERE [simple_condition] this kind of structure is complex, it is possible to execute unsuccessfully.

Fields of TIME data type do not support GROUP BY condition and HISTOGRAM function

Such as the following query is incorrect:

SELECT count (*) FROM test GROUP BY CAST (date_created AS TIME); SELECT HISTOGRAM (CAST (birth_date AS TIME), INTERVAL '10' MINUTES) as h, COUNT (*) FROM t GROUP BY h

However, wrapping fields of type TIME as returned by the Scalar function supports GROUP BY, such as:

SELECT count (*) FROM test GROUP BY MINUTE ((CAST (date_created AS TIME)

The limits of returned fields cannot be queried if a field is not stored in source. Fields of the keyword, date, scaled_float, geo_point, and geo_shape types are not subject to this restriction, because they are returned not from _ source, but from docvalue_fields.

The above is all the contents of the article "how to use SQL instead of DSL". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report