This article introduces how to use the Table API & SQL in Flink, working through the kinds of situations that come up in practice. I hope you read it carefully and get something out of it!
API in Flink
Flink provides different levels of abstraction for developing streaming and batch processing applications.
The lowest-level abstraction in the Flink API is stateful real-time stream processing. It is exposed through the Process Function, which the Flink framework integrates into the DataStream API. It lets users freely process events (data) from one or more streams and provides globally consistent, fault-tolerant state. In addition, users can register event-time and processing-time callbacks at this level, which allows programs to implement complex computations.
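A minimal sketch of what this layer looks like (assuming Flink 1.11 and the Java DataStream API; the class name and input elements are made up for illustration):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class ProcessFunctionDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("flink", "table", "sql")
            // ProcessFunction gives element-by-element control over the stream;
            // on a keyed stream (KeyedProcessFunction) it can additionally register
            // event-time / processing-time timers and keep fault-tolerant keyed state.
            .process(new ProcessFunction<String, String>() {
                @Override
                public void processElement(String value, Context ctx, Collector<String> out) {
                    out.collect(value + " (length " + value.length() + ")");
                }
            })
            .print();

        env.execute("process-function-demo");
    }
}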
The second layer of abstraction in the Flink API is the Core APIs. Many applications do not need the lowest-level abstraction described above and can instead be programmed against the Core APIs, which consist of two parts: the DataStream API (for bounded/unbounded data streams) and the DataSet API (for bounded data sets). These fluent APIs provide common building blocks for data processing, such as various forms of user-defined transformations, joins, aggregations, windows, and state operations. The data types processed at this level have corresponding classes in each supported programming language.
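For example, a minimal DataStream sketch of the transformation/window/aggregation building blocks (assuming Flink 1.11 and a text socket source on localhost:9999, both of which are illustrative choices):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class DataStreamWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)
            // user-defined transformation: split each line into (word, 1) pairs
            .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
                for (String word : line.split("\\s+")) {
                    out.collect(Tuple2.of(word, 1));
                }
            })
            .returns(Types.TUPLE(Types.STRING, Types.INT))
            .keyBy(t -> t.f0)                 // state is kept per key (per word)
            .timeWindow(Time.seconds(10))     // 10-second tumbling window
            .sum(1)                           // aggregation over the window
            .print();

        env.execute("datastream-wordcount");
    }
}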
Because the Process Function is integrated with the DataStream API, users can drop down to this lower-level abstraction for individual operations when needed. The DataSet API additionally offers primitives for bounded data sets, such as loop/iteration operations.
The third layer of abstraction in the Flink API is the Table API. The Table API is a declarative, table-centric DSL; in a streaming scenario, for example, a table can represent dynamically changing data. The Table API follows the (extended) relational model: tables have a schema (similar to a schema in a relational database), and the API offers comparable operations such as select, project, join, group-by, and aggregate. Table API programs declaratively define which logical operations should be performed rather than specifying exactly what code the program should execute. Although the Table API is easy to use and can be extended with various kinds of user-defined functions, it is less expressive than the Core APIs. In addition, before execution, Table API programs pass through an optimizer that applies optimization rules to the user-written expressions.
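A hedged sketch of a Table API program (assuming Flink 1.11 with the Blink planner; the "orders" table, its columns, and the in-memory sample rows are all made up for illustration, and batch mode is used only to keep the example self-contained):

import static org.apache.flink.table.api.Expressions.$;
import static org.apache.flink.table.api.Expressions.row;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class TableApiDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        // Register a small in-memory table named "orders" purely for illustration;
        // in a real job this would come from a connector (Kafka, JDBC, ...).
        Table orders = tEnv.fromValues(row("alice", 10), row("bob", 20), row("alice", 5))
                .as("name, amount");
        tEnv.createTemporaryView("orders", orders);

        Table totals = tEnv.from("orders")
                .groupBy($("name"))                                 // group-by
                .select($("name"), $("amount").sum().as("total"));  // project + aggregate

        totals.execute().print();
    }
}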
Tables and DataStream/DataSet can be converted back and forth seamlessly, and Flink allows users to mix the Table API with the DataStream and DataSet APIs in the same application, as sketched below.
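A minimal conversion sketch (assuming Flink 1.11 and the Java bridge module; the sample elements and column names are made up):

import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class StreamTableConversionDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        DataStream<Tuple2<String, Integer>> stream =
                env.fromElements(Tuple2.of("alice", 10), Tuple2.of("bob", 20));

        // DataStream -> Table, assigning column names
        Table table = tEnv.fromDataStream(stream, $("name"), $("amount"));

        // Table -> DataStream (the table is append-only here, so an append stream suffices)
        tEnv.toAppendStream(table, Row.class).print();

        env.execute("stream-table-conversion");
    }
}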
The top-level abstraction of the Flink API is SQL. This layer is similar to the Table API in semantics and expressiveness, but programs are written as SQL query expressions. The SQL abstraction is closely related to the Table API, and SQL queries can be executed over tables defined with the Table API.
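For example, the same aggregation written as a SQL query over a table that was defined with the Table API (sketch only; the table, columns, and rows are again illustrative assumptions):

import static org.apache.flink.table.api.Expressions.row;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SqlDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

        // Same illustrative "orders" table as before, registered so SQL can see it.
        Table orders = tEnv.fromValues(row("alice", 10), row("bob", 20), row("alice", 5))
                .as("name, amount");
        tEnv.createTemporaryView("orders", orders);

        // SQL queries run against tables known to the catalog,
        // including tables that were defined with the Table API.
        Table totals = tEnv.sqlQuery(
                "SELECT `name`, SUM(amount) AS total FROM orders GROUP BY `name`");

        totals.execute().print();
    }
}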
Table API and SQL
Apache Flink has two relational APIs for unified stream and batch processing: the Table API and SQL.
The Table API is a language-integrated query API for Scala and Java that combines selection, filtering, joins, and other relational operators in a very intuitive way. Flink SQL is standard SQL based on Apache Calcite. Queries in either API have the same semantics for DataSet and DataStream inputs and produce the same results.
The Table API and SQL are tightly integrated with each other and with the DataStream and DataSet APIs. You can easily switch between these APIs and the libraries built on top of them. For example, you can use the CEP library for pattern matching on a DataStream and then analyze the matches with the Table API, or you can use SQL to scan, filter, and aggregate a batch table and then run a Gelly graph algorithm on the preprocessed data.
Note: The Table API and SQL are still under active development and do not yet implement every feature; not all combinations of [Table API, SQL] and [stream, batch] are supported.
Official documentation
https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/dev/table/
TableAPI&SQL development
To demonstrate the Table API & SQL, add a new tableapi module to the existing project; this module will show simple Table API and SQL usage with the following components:
Elasticsearch
Kafka
JDBC (MySQL)
New tableapi module
In the current project, create a new Maven module named tableapi.
pom.xml

<artifactId>tableapi</artifactId>

<dependencies>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-jdbc_2.12</artifactId>
        <version>1.11.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table-common</artifactId>
        <version>1.11.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka_2.11</artifactId>
        <version>1.11.1</version>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.47</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-elasticsearch7_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
</dependencies>
Refresh the project's Maven configuration so the dependencies required for these features are downloaded.
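With these dependencies in place, here is a hedged sketch of how the Kafka and JDBC (MySQL) connectors could be declared with SQL DDL (the topic name, MySQL URL, credentials, schemas, and the extra flink-json format dependency are all assumptions, not part of the original project):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ConnectorDdlDemo {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Kafka source table (the 'json' format additionally needs flink-json on the classpath)
        tEnv.executeSql(
                "CREATE TABLE kafka_orders (" +
                "  user_name STRING," +
                "  amount INT" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'orders'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'properties.group.id' = 'tableapi-demo'," +
                "  'scan.startup.mode' = 'latest-offset'," +
                "  'format' = 'json'" +
                ")");

        // JDBC (MySQL) sink table; the primary key enables upsert writes
        tEnv.executeSql(
                "CREATE TABLE mysql_totals (" +
                "  user_name STRING," +
                "  total INT," +
                "  PRIMARY KEY (user_name) NOT ENFORCED" +
                ") WITH (" +
                "  'connector' = 'jdbc'," +
                "  'url' = 'jdbc:mysql://localhost:3306/flink_demo'," +
                "  'table-name' = 'totals'," +
                "  'username' = 'root'," +
                "  'password' = 'root'" +
                ")");

        // Continuously aggregate the Kafka data into MySQL
        tEnv.executeSql(
                "INSERT INTO mysql_totals " +
                "SELECT user_name, SUM(amount) FROM kafka_orders GROUP BY user_name");
    }
}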
Engineering module
This is the end of "how to use TableAPI&SQL in Flink". Thank you for reading.