Apache Calcite official documentation Chinese version-Overview-1. Background

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Part 1: Overview

1. Background

Apache Calcite is a dynamic data management framework. It contains many of the pieces that make up a typical data management system, but deliberately omits some key functions: data storage, algorithms for processing data, and a repository for storing metadata.

Calcite intentionally stays out of the business of storing and processing data. As we shall see, this makes it a strong choice as a middle tier between applications and one or more data storage locations and data processing engines. It is also a perfect foundation for building a database: you just need to add data on top of it.

To illustrate, let's create an empty Calcite instance and then point it at some data.

    // Plain Java object whose public array fields become tables.
    public static class HrSchema {
        public final Employee[] emps = {};     // placeholder: add sample rows here
        public final Department[] depts = {};  // placeholder: add sample rows here
    }
    Class.forName("org.apache.calcite.jdbc.Driver");
    Properties info = new Properties();
    info.setProperty("lex", "JAVA");
    Connection connection = DriverManager.getConnection("jdbc:calcite:", info);
    CalciteConnection calciteConnection = connection.unwrap(CalciteConnection.class);
    SchemaPlus rootSchema = calciteConnection.getRootSchema();
    // Register the Java object as the schema "hr".
    Schema schema = ReflectiveSchema.create(calciteConnection, rootSchema, "hr", new HrSchema());
    rootSchema.add("hr", schema);
    Statement statement = calciteConnection.createStatement();
    ResultSet resultSet = statement.executeQuery(
        "select d.deptno, min(e.empid)\n"
        + "from hr.emps as e\n"
        + "join hr.depts as d\n"
        + "on e.deptno = d.deptno\n"
        + "group by d.deptno\n"
        + "having count(*) > 1");
    print(resultSet);
    resultSet.close();
    statement.close();
    connection.close();

You may be wondering where the database is. There is no database here. The connection is completely empty until we call ReflectiveSchema.create, which registers a Java object as a schema and its collection fields emps and depts as tables.
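To make the query concrete, the following plain-Java sketch computes the same result by hand over two arrays: join on deptno, group by deptno, keep groups with more than one row, and take the minimum empid. It does not use Calcite or linq4j. The fields of Employee and Department and the sample rows are assumptions for illustration, since the article does not show them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

class HrQuerySketch {
    // Hypothetical row types: the article never shows the fields of Employee or Department.
    static final class Employee {
        final int empid;
        final int deptno;
        Employee(int empid, int deptno) { this.empid = empid; this.deptno = deptno; }
    }

    static final class Department {
        final int deptno;
        Department(int deptno) { this.deptno = deptno; }
    }

    /** Returns deptno -> min(empid) for joined departments with more than one employee. */
    static Map<Integer, Integer> minEmpidPerDept(Employee[] emps, Department[] depts) {
        // join ... on e.deptno = d.deptno (assumes deptno is unique within depts)
        Set<Integer> deptnos = Arrays.stream(depts)
            .map(d -> d.deptno)
            .collect(Collectors.toSet());
        // group by d.deptno
        Map<Integer, List<Employee>> groups = Arrays.stream(emps)
            .filter(e -> deptnos.contains(e.deptno))
            .collect(Collectors.groupingBy(e -> e.deptno, TreeMap::new, Collectors.toList()));
        Map<Integer, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Employee>> group : groups.entrySet()) {
            List<Employee> rows = group.getValue();
            if (rows.size() > 1) { // having count(*) > 1
                int minEmpid = rows.stream().mapToInt(e -> e.empid).min().getAsInt();
                result.put(group.getKey(), minEmpid);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Employee[] emps = {
            new Employee(100, 10), new Employee(110, 10),
            new Employee(150, 20),
            new Employee(200, 30), new Employee(210, 30),
        };
        Department[] depts = { new Department(10), new Department(20), new Department(30) };
        System.out.println(minEmpidPerDept(emps, depts)); // prints {10=100, 30=200}
    }
}
```

Department 20 is dropped because it has only one employee, which is what the HAVING clause asks for.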

Calcite does not want to own data; it does not even have a preferred data format. The example above uses in-memory data sets and processes them with the groupBy and join operators from the linq4j library. But Calcite can also process data in other formats, such as JDBC. In the example above, take this line:

    Schema schema = ReflectiveSchema.create(calciteConnection, rootSchema, "hr", new HrSchema());

and replace it with:

    Class.forName("com.mysql.jdbc.Driver");
    BasicDataSource dataSource = new BasicDataSource();
    dataSource.setUrl("jdbc:mysql://localhost");
    dataSource.setUsername("username");
    dataSource.setPassword("password");
    Schema schema = JdbcSchema.create(rootSchema, "hr", dataSource, null, "name");

Calcite then executes the same query against the JDBC source. To the application, the data and the API are unchanged, but the underlying implementation is very different. Calcite uses optimizer rules to push the JOIN and GROUP BY operations down to the source database for execution.

In-memory and JDBC are just two familiar examples. Calcite can handle any data source and data format. To add a data source, you write an adapter that tells Calcite which collections in the data source should be treated as "tables".
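The reflective schema from the first example is one way of deciding which collections count as tables. The sketch below illustrates that idea in plain Java: it scans an object's public, non-static fields and exposes every object array as a named table. It is a simplified illustration only, not Calcite's implementation and not its adapter interface; the class name ReflectiveTablesSketch, the method discoverTables and the sample fields are all hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

class ReflectiveTablesSketch {
    // Hypothetical schema object; rows are plain strings to keep the sketch short.
    static class HrSchema {
        public final String[] emps = {"Alice", "Bob"};
        public final String[] depts = {"Sales"};
        public final int version = 1; // not an array, so it is not treated as a table
    }

    /** Maps field name -> rows for every public, non-static object-array field. */
    static Map<String, Object[]> discoverTables(Object target) {
        Map<String, Object[]> tables = new TreeMap<>();
        for (Field field : target.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            Class<?> type = field.getType();
            if (type.isArray() && !type.getComponentType().isPrimitive()) {
                try {
                    tables.put(field.getName(), (Object[]) field.get(target));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("cannot read field " + field.getName(), e);
                }
            }
        }
        return tables;
    }

    public static void main(String[] args) {
        Map<String, Object[]> tables = discoverTables(new HrSchema());
        System.out.println(tables.keySet()); // prints [depts, emps]
    }
}
```

A real adapter would also describe column names and types and supply a way to scan the rows; this sketch covers only the table-discovery step.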

For more advanced integration, you can write your own optimizer rules. Optimizer rules allow Calcite to access data in new formats and let you register new operators (such as a better join algorithm), while also allowing Calcite to optimize how queries are translated into operators. Calcite combines the rules and operators you provide with its built-in rules and operators, applies cost-based optimization, and generates an efficient execution plan.
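As a rough picture of what "cost-based" means here, the toy sketch below treats two join algorithms as competing candidates, estimates a cost for each, and keeps the cheaper one. Everything in it is an assumption made for illustration: the Plan class, the operator names and the cost formulas are invented, and Calcite's real planner, rules and cost model are far richer.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CostBasedChoiceSketch {
    /** A candidate operator with an estimated cost in made-up units. */
    static final class Plan {
        final String name;
        final double cost;
        Plan(String name, double cost) { this.name = name; this.cost = cost; }
    }

    // Hypothetical cost formula: compare every left row with every right row.
    static Plan nestedLoopJoin(long leftRows, long rightRows) {
        return new Plan("NestedLoopJoin", (double) leftRows * rightRows);
    }

    // Hypothetical cost formula: one pass to build a hash table, one pass to probe it.
    static Plan hashJoin(long leftRows, long rightRows) {
        return new Plan("HashJoin", (double) leftRows + rightRows);
    }

    /** Picks the candidate with the lowest estimated cost. */
    static Plan cheapest(List<Plan> candidates) {
        return candidates.stream()
            .min(Comparator.comparingDouble(p -> p.cost))
            .orElseThrow(() -> new IllegalArgumentException("no candidate plans"));
    }

    public static void main(String[] args) {
        Plan best = cheapest(Arrays.asList(nestedLoopJoin(1000, 1000), hashJoin(1000, 1000)));
        System.out.println(best.name); // prints HashJoin
    }
}
```

In this picture, registering a new operator simply adds one more candidate to the list, and the cost comparison decides whether it is used.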

Writing an adapter

Calcite provides a CSV adapter in the example/csv subproject. It is fully functional for use in applications, and it is also simple enough to serve as a template if you are writing your own adapter.

For details on using the CSV adapter and writing other adapters, see the next chapter, 2. Tutorial.

The help (HOWTO) section provides more information about using other adapters, as well as common usage scenarios.

Functional status

Calcite provides the following features:

1) Query parser, validator, and optimizer

2) Support for reading models in JSON format

3) Many standard functions and aggregate functions

4) JDBC queries against Linq4j and JDBC back-ends

5) Linq4j front-end

6) SQL features: SELECT, FROM (including JOIN syntax), WHERE, GROUP BY (including GROUPING SETS), aggregate functions (including COUNT(DISTINCT ...) and FILTER), HAVING, ORDER BY (including NULLS FIRST/LAST), set operations (UNION, INTERSECT, MINUS), subqueries (including correlated subqueries), window aggregate functions, and LIMIT (Postgres syntax); see the SQL reference chapter for details

7) Local and remote JDBC drivers; see the Avatica section for details

8) Several adapters
