This article introduces the architecture of Orca, Pivotal's modular query optimizer, in some detail. Readers interested in query optimization should find it a useful reference.
The rise of big data systems has renewed interest in query optimization, as a new generation of data management systems has pushed scalability, availability, and processing power to unprecedented levels. Through SQL or SQL-like interfaces, hundreds of terabytes or even petabytes of data can be made easily accessible for analysis. The gap between a good optimizer and a mediocre one has always been known to be wide; as these systems process ever-growing volumes of data, optimization mistakes are amplified, making query optimization more important than ever before.
A good query optimizer needs to consider the following aspects in architectural design:
Modularity. By using a highly extensible abstraction of metadata and system description, the optimizer is no longer tied to a specific host system the way traditional optimizers are. Instead, it can be ported quickly to other data management systems through plug-ins.
Extensibility. Representing all elements of a query and its optimization as first-class citizens of equal standing avoids the trap of multi-phase optimization, in which some optimizations are handled as an afterthought. Multi-phase optimizers are notoriously difficult to extend, because new optimizations or query constructs often do not fit the previously established phase boundaries.
Support for multi-core architectures. The system needs an efficient multi-core-aware scheduler that distributes fine-grained optimization subtasks across multiple cores to speed up the optimization process.
Verifiability. The optimizer has built-in mechanisms for verifying correctness and performance at a foundational level. Besides improving engineering practice, such tools enable rapid development with high confidence and shorten the turnaround time for new features and bug fixes.
Performance. Ultimately, query performance is the result users care about most; the optimizer must produce plans that deliver it.
I. MPP architecture
Orca is a modular big data query optimizer developed by Pivotal and used in Greenplum Database (GPDB) and HAWQ. It was designed to meet the requirements above.
Greenplum handles the storage and processing of large amounts of data by distributing the load across multiple servers or hosts, forming a single database array in which all hosts cooperate to present one database interface. The master node is the entry point to GPDB: clients connect to it and submit SQL statements. The master coordinates with the other database instances, called segments, to handle data processing and storage. When a query is submitted to the master, it is optimized and broken into smaller pieces that are dispatched to the segments, which cooperate to deliver the final result. The interconnect, built on a standard Gigabit Ethernet switching fabric, provides the networking layer for inter-process communication between segments.
During query execution, data can be distributed to segments in several ways: hash distribution, where each tuple is assigned to a segment according to a hash function; replicated distribution, where a complete copy of a table is stored on every segment; and singleton distribution, where an entire distributed table is gathered from the segments onto a single host (usually the master).
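To make the three policies concrete, here is a minimal sketch assuming a toy MPP model with four segments; the function names and segment count are invented for illustration and are not GPDB's API:

```python
# A minimal sketch, assuming a simplified MPP model (not Greenplum's actual
# code), of how each distribution policy decides where a tuple or table lives.

NUM_SEGMENTS = 4

def hash_distribution_segment(row: dict, dist_key: str) -> int:
    # Hash distribution: the distribution key's hash picks the owning segment.
    return hash(row[dist_key]) % NUM_SEGMENTS

def replicated_segments() -> list:
    # Replicated distribution: every segment stores a full copy of the table.
    return list(range(NUM_SEGMENTS))

def singleton_segment(master: int = 0) -> list:
    # Singleton distribution: the whole table is gathered onto a single host.
    return [master]

print(hash_distribution_segment({"a": 42}, "a"))  # some segment in 0..3
```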
II. SQL on Hadoop architecture
Processing analytical queries on Hadoop has become increasingly popular. Initially, the appeal of expressing queries as MapReduce jobs lay in Hadoop's scalability and fault tolerance, but coding, manually optimizing, and maintaining complex queries in MapReduce is difficult. Declarative, SQL-like languages were therefore developed on top of Hadoop: a HiveQL query, for instance, is compiled into MapReduce jobs and executed by Hadoop. HiveQL sped up the coding of complex queries, but it also made clear that the Hadoop ecosystem needed an optimizer, because the compiled MapReduce jobs exhibited poor performance.
Pivotal responded to this challenge with HAWQ, a massively parallel, SQL-compliant engine on top of HDFS. HAWQ uses Orca at its core to devise efficient query plans, minimizing the cost of accessing data in Hadoop clusters. HAWQ's architecture combines a state-of-the-art cost-based optimizer with the scalability and fault tolerance of Hadoop to enable interactive processing of petabyte-scale data.
A number of other efforts, including Cloudera's Impala and Facebook's Presto, have also introduced new optimizers for SQL processing on Hadoop. At the time of the Orca paper, these efforts supported only a subset of the SQL standard, and their optimization was limited to rule-based techniques. By contrast, HAWQ has a fully standards-compliant SQL interface and a cost-based optimizer, both unprecedented features among Hadoop query engines.
III. The architecture of Orca
Orca is a new query optimizer for Pivotal's data management products, including GPDB and HAWQ. It is a modern, top-down query optimizer based on the Cascades optimization framework. While many Cascades optimizers are tightly coupled to their host systems, a distinctive feature of Orca is its ability to run outside the database system as a stand-alone optimizer. This capability is crucial for supporting products with different computing architectures, such as MPP and Hadoop, with a single optimizer.
1. DXL
To decouple the optimizer from the database system, a communication mechanism for processing queries is needed. Orca includes a framework for exchanging information between the optimizer and the database system called the Data eXchange Language (DXL). The framework uses an XML-based language to encode the information needed for communication, such as input queries, output plans, and metadata. Layered on top of DXL is a simple communication protocol that sends the initial query structure and retrieves the optimized plan. A major benefit of DXL is that it packages Orca as a stand-alone product.
The input to Orca is a DXL query, and the output is a DXL plan. During optimization, the database system can be queried for metadata (for example, table definitions). Orca abstracts metadata access by letting the database system register a metadata provider (MD Provider) that serializes metadata into DXL before sending it to Orca. Metadata can also be consumed from regular files containing metadata objects serialized in DXL format.
The database system needs converters that consume and emit data in DXL format. The Query2DXL converter translates a query parse tree into a DXL query, while the DXL2Plan converter translates a DXL plan into an executable plan. These converters are implemented entirely outside Orca, which allows multiple systems to use Orca by supplying their own converters. The resulting architecture is highly extensible: every component can be replaced or configured individually.
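The division of labor might be sketched as follows. Query2DXL and DXL2Plan are named in the text, but the class shapes, method signatures, and the orca object here are assumptions for illustration only:

```python
# A hypothetical sketch of the converter boundary around Orca; only the
# DXL-in/DXL-out contract is described in the text, the rest is assumed.

class Query2DXL:
    def convert(self, parse_tree) -> str:
        """Serialize the host system's parse tree into a DXL query string."""
        raise NotImplementedError  # host-system specific

class DXL2Plan:
    def convert(self, dxl_plan: str):
        """Translate Orca's DXL plan into the host's executable plan."""
        raise NotImplementedError  # host-system specific

def optimize_query(parse_tree, orca, to_dxl=Query2DXL(), to_plan=DXL2Plan()):
    dxl_query = to_dxl.convert(parse_tree)  # host side
    dxl_plan = orca.optimize(dxl_query)     # Orca sees only DXL, never the host
    return to_plan.convert(dxl_plan)        # host side
```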
2. The components of Orca
(1) Memo
The space of plan alternatives generated by the optimizer is encoded in a compact in-memory data structure called the Memo. The Memo consists of a set of containers called groups, where each group contains logically equivalent expressions. Memo groups capture the different sub-goals of a query (for example, a filter on a table, or a join of two tables). Group members, called group expressions, achieve the group's goal in different logical ways (for example, different join orders). Each group expression is an operator whose children are other groups. This recursive structure allows the Memo to encode a huge space of possible plans compactly, as sketched below.
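A minimal sketch of this recursive structure, using much-simplified types invented for illustration, might look like this:

```python
# A minimal sketch of the Memo: groups hold logically equivalent expressions
# whose children are other groups (referenced here by index into Memo.groups).

from dataclasses import dataclass, field

@dataclass
class GroupExpression:
    operator: str                                     # e.g. "InnerJoin", "Get(T1)"
    child_groups: list = field(default_factory=list)  # indices into Memo.groups

@dataclass
class Group:
    expressions: list = field(default_factory=list)   # equivalent alternatives

@dataclass
class Memo:
    groups: list = field(default_factory=list)

    def add_group(self, expr: GroupExpression) -> int:
        self.groups.append(Group(expressions=[expr]))
        return len(self.groups) - 1

memo = Memo()
g_t1 = memo.add_group(GroupExpression("Get(T1)"))
g_t2 = memo.add_group(GroupExpression("Get(T2)"))
g_join = memo.add_group(GroupExpression("InnerJoin", [g_t1, g_t2]))
```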
(2) Search and job scheduler
Orca uses a search mechanism to navigate the space of possible plan alternatives and identify the plan with the lowest estimated cost. The search is driven by a specialized job scheduler that creates dependent or parallel units of work to perform query optimization in three main steps: exploration, where equivalent logical expressions are generated; implementation, where physical plans are generated; and optimization, where required physical properties (for example, sort order) are enforced and plan alternatives are costed.
(3) Transformations
Plan alternatives are generated by applying transformation rules that produce either equivalent logical expressions (for example, InnerJoin(A, B) → InnerJoin(B, A)) or physical implementations of existing expressions (for example, Join(A, B) → HashJoin(A, B)). The results of applying a transformation rule are copied into the Memo, which may create new groups and/or add new group expressions to existing groups. Each transformation rule is a self-contained component that can be explicitly enabled or disabled in Orca's configuration; a sketch follows.
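Reusing the Memo sketch above, the two rule kinds named in the text could be sketched like this; the rule shapes are illustrative only, not Orca's actual rule interface:

```python
def join_commutativity(expr: GroupExpression):
    # Logical rule: InnerJoin(A, B) -> InnerJoin(B, A).
    if expr.operator == "InnerJoin" and len(expr.child_groups) == 2:
        a, b = expr.child_groups
        return GroupExpression("InnerJoin", [b, a])
    return None   # rule does not apply

def join_to_hash_join(expr: GroupExpression):
    # Implementation rule: InnerJoin(A, B) -> HashJoin(A, B).
    if expr.operator == "InnerJoin":
        return GroupExpression("HashJoin", list(expr.child_groups))
    return None

# Applying a rule copies its result back into the originating Memo group:
for rule in (join_commutativity, join_to_hash_join):
    alt = rule(memo.groups[g_join].expressions[0])
    if alt is not None:
        memo.groups[g_join].expressions.append(alt)
```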
(4) Property enforcement
Orca includes an extensible framework for describing query requirements and plan characteristics as formal property specifications. Properties come in different types, including logical properties (for example, output columns), physical properties (for example, sort order and data distribution), and scalar properties (for example, the columns used in join conditions). During query optimization, each operator may request specific properties from its children. An optimized subplan may satisfy a required property on its own (for example, an IndexScan plan delivers sorted data), or an enforcer (for example, a Sort operator) may need to be inserted into the plan to deliver the required property. The framework lets each operator control enforcer placement based on the properties of its child plans and its own local behavior, as sketched below.
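An illustrative-only sketch of enforcement, with plans reduced to plain dictionaries: if the subplan does not already deliver the requested sort order, it is wrapped in a Sort enforcer node.

```python
def enforce_sort_order(subplan: dict, required_key: str) -> dict:
    if subplan.get("sort_key") == required_key:
        return subplan   # property satisfied for free, e.g. by a sorted IndexScan
    return {"op": "Sort", "sort_key": required_key, "child": subplan}

index_scan = {"op": "IndexScan(T1)", "sort_key": "T1.a"}
table_scan = {"op": "TableScan(T2)"}
print(enforce_sort_order(index_scan, "T1.a"))  # returned unchanged
print(enforce_sort_order(table_scan, "T1.a"))  # wrapped in a Sort enforcer
```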
(5) Metadata cache
Because metadata (for example, table definitions) changes infrequently, shipping it with every query carries unnecessary overhead. Orca caches metadata on the optimizer side and retrieves it from the catalog only when it is not in the cache or has changed since it was last loaded. The metadata cache also abstracts database system details away from the optimizer, which is particularly useful during testing and debugging.
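An assumed, simplified sketch of such a cache: because the version is embedded in the metadata ID (see the Mdid format in the workflow section), a modified object carries a new ID and naturally misses the cache.

```python
class MetadataCache:
    """Hypothetical optimizer-side cache keyed by versioned metadata ids."""

    def __init__(self, md_provider):
        self.md_provider = md_provider   # hypothetical MD Provider interface
        self._cache = {}

    def get(self, mdid: str):
        if mdid not in self._cache:                           # missing or stale
            self._cache[mdid] = self.md_provider.fetch(mdid)  # catalog round trip
        return self._cache[mdid]
```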
(6) GPOS
To interact with operating systems that may expose different APIs, Orca uses an OS abstraction layer called GPOS. The GPOS layer provides Orca with a broad infrastructure, including a memory manager, primitives for concurrency control, exception handling, file I/O, and synchronized data structures.
IV. Query optimization
1. Workflow
We use the following query to explain the query optimization process:
SELECT T1.a FROM T1, T2 WHERE T1.a = T2.b ORDER BY T1.a
T1 is hash-distributed on T1.a, and T2 is hash-distributed on T2.a.
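One consequence worth noting (a hedged inference from the distributions above, not stated explicitly in the text): T1 is already distributed on its join key, but T2 is joined on T2.b while distributed on T2.a, so an MPP plan would have to redistribute T2 before a co-located join.

```python
# Hedged illustration: a join can run co-located only when each input's
# distribution key matches its join key; otherwise a redistribute motion
# (or a broadcast) must move the data first.

def needs_redistribution(dist_key: str, join_key: str) -> bool:
    return dist_key != join_key

print(needs_redistribution("T1.a", "T1.a"))  # False: T1 is already co-located
print(needs_redistribution("T2.a", "T2.b"))  # True: T2 must be redistributed
```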
The query's DXL representation carries the required output columns, sort order, data distribution, and the logical query tree. Metadata, such as table and operator definitions, is decorated with metadata IDs (Mdids) so that further information can be requested during optimization. An Mdid is a unique identifier made up of a database system identifier, an object identifier, and a version number; for example, "0.96.1.0" refers to GPDB's integer equality operator, version "1.0". Metadata versions are used to invalidate cached metadata objects that have been modified across queries.
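As a rough approximation, a DXL document for this query might look like the following sketch, parsed with Python for concreteness. The element and attribute names are illustrative and do not follow the exact DXL schema; only the overall shape (output columns, sort order, distribution, logical tree, Mdid annotations) follows the description above.

```python
import xml.etree.ElementTree as ET

# A hand-written approximation of a DXL query document; not the real schema.
dxl_query = """
<Query>
  <OutputColumns>
    <Ident ColId="0" Name="a" Mdid="0.23.1.0"/>
  </OutputColumns>
  <SortColumns>
    <SortColumn ColId="0" OpMdid="0.97.1.0"/>
  </SortColumns>
  <Distribution Type="Singleton"/>
  <LogicalJoin JoinType="Inner">
    <LogicalGet Table="T1"/>
    <LogicalGet Table="T2"/>
    <Comparison OpMdid="0.96.1.0" Left="T1.a" Right="T2.b"/>
  </LogicalJoin>
</Query>
"""

root = ET.fromstring(dxl_query)
print(root.find("OutputColumns/Ident").get("Name"))  # -> a
```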