Technical interpretation of POLARDB v2.0

Updated 2025-04-26. Source: SLTechnology News&Howtos (Shulou.com)

Review POLARDB 1.0

The main improvements of POLARDB 1.0 include a compute-storage separation architecture, full compatibility with MySQL, and up to six times the performance of native MySQL. A user cluster can be scaled out to 16 compute nodes within minutes. The compute and storage proxy layers are completely transparent to the application, and read replicas lag the primary by only milliseconds. Storage is distributed block storage that scales elastically up to 100 TB. The storage layer keeps multiple copies of the data, which gives the database an RPO of 0, so committed data is not lost.

POLARDB 1.0 addresses the following pain points of traditional databases:

1. Upgrading hardware requires data migration and the upgrade cycle is long, so a sudden business peak cannot be handled gracefully.

(POLARDB compute nodes can be added in minutes, so capacity can be expanded quickly whenever a sudden change in traffic is detected.)

2. Financial-grade reliability requires RPO=0. Traditional architectures synchronize multiple copies at the instance layer, which causes a large performance loss.

(POLARDB storage keeps multiple copies, and the underlying layer uses recent software and hardware technologies such as RDMA, Parallel Raft and 3D XPoint; performance is up to 6 times that of the traditional architecture.)

3. HA based on instance-layer replication has a long primary-replica switchover time, which cannot meet financial-grade continuity requirements.

(POLARDB uses shared storage, so primary-replica switchover completes in seconds. In addition, a proxy layer between the compute layer and the application detects compute node failures and switches over automatically. Most of the time the application does not notice the switchover, which preserves business continuity.)

4. Traditional HA architectures use asynchronous primary-replica replication. After a switchover the replica may need to be rebuilt, which consumes a lot of resources and takes a long time, leaving a single point of failure for an extended period.

(POLARDB uses a shared storage architecture, so primary-replica switchover does not require rebuilding data.)

5. Each read-only node needs a full copy of the primary's data, which is expensive.

(With shared storage, POLARDB can add compute nodes without adding storage copies, which makes the overall cost much lower than that of the traditional architecture.)

6. Read/write splitting relies on logical REDO replication, and primary-replica lag is high.

(POLARDB data lives in shared storage, so REDO data does not have to be shipped to replicas; only the REDO position needs to be synchronized, and primary-replica lag is at the millisecond level.)

7. Sharding does not work as well as expected: functionality is stripped down and the application must change substantially (SQL is subject to many restrictions).

(POLARDB is fully compatible with MySQL and requires no changes to the application; users do not need to modify a single line of code to use POLARDB.)

8. Backups of instances larger than a terabyte are slow and often take tens of hours.

(POLARDB uses snapshot backup technology and completes backups in seconds regardless of data volume.)
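The cost argument in pain point 5 can be checked with simple arithmetic. The sketch below is illustrative only: the number of storage copies (3) and the 10 TB data size are assumptions, since the article says only that the storage layer is "multi-copy", and the function names are hypothetical.

```python
def traditional_storage_tb(data_tb, read_replicas):
    """Traditional architecture: the primary and every read-only node
    each keep their own full copy of the data."""
    return data_tb * (1 + read_replicas)


def shared_storage_tb(data_tb, storage_copies=3):
    """Shared storage: a fixed number of copies (assumed to be 3 here),
    independent of how many compute nodes are attached."""
    return data_tb * storage_copies


if __name__ == "__main__":
    for replicas in (1, 4, 15):
        print(
            f"{replicas:>2} read-only nodes: "
            f"traditional={traditional_storage_tb(10, replicas)} TB, "
            f"shared={shared_storage_tb(10)} TB"
        )
```

Under these assumptions the shared-storage footprint stays constant, so the saving grows with the number of read-only nodes; with only one or two replicas the two approaches are comparable.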

In the two years since its release, POLARDB 1.0 has won the favor of many enterprise customers. If POLARDB 1.0 already works so well, why develop 2.0?

Why develop 2.0?

1. Users have a strong demand to move off Oracle ("de-O"), but their attempts fail again and again.

Why do so many users try to move off Oracle and fail repeatedly?

1. The enterprise carries a heavy historical burden.

1.1. The enterprise technology stack (and team) is usually built around Oracle; adapting to other products takes a long time, and changing direction is hard.

1.2. If migration involves a large amount of code modification, the cycle is long, the risk is high and the benefit is low.

Generally, the Oracle compatibility of the target database engine is poor, and users need to make many modifications.

2. Lack of effective migration methods and tools.

2.1. The migration and rework effort is hard to estimate, the schedule is hard to estimate, and the cycle is usually very long (other teams' successful experience cannot simply be replicated).

2.2. There are no effective tools for data migration, data verification and simulation. Decisions made on gut feeling carry very high risk.

3. There are many candidate target database engines, and choosing among them is difficult.

3.1. Some enterprises move off Oracle only for the sake of moving off Oracle; this produces no business value, so the enterprise has no motivation.

3.2. The reliability, security, scalability, compatibility, stability, performance and availability of the target engine may not meet the needs of users.

2. Enterprises want the database to do this, and that, and more

Enterprises want a database that offers the generality of SQL, the scalability of NoSQL and the convenience of multi-model data processing. They need both high concurrency and real-time complex analysis. Traditional databases cannot meet all of these needs at once. Traditional solutions often synchronize multiple copies of the data between systems (like a spider's web) and use a different product for each scenario. This causes many problems and makes users miserable:

1. High software and hardware costs, synchronization delay, and inconsistent synchronized data.

2. Headaches such as high development cost and complex troubleshooting hold back the enterprise's business.

3. Historical data weighs on the enterprise like a mountain.

The life cycle of an enterprise database is usually very long. Over that life cycle a lot of forgotten "temporary" data accumulates, such as business history databases and temporary data that developers or DBAs created or generated in the database. After several years it may no longer be possible to tell which business this data belongs to, whether it is still used, or whether it can be deleted. Gradually it becomes a "chicken rib": tasteless to eat, yet a pity to throw away. A large amount of such cold data takes up a lot of space and cannot be deleted, and it gradually becomes a heavy burden on the database.

(Database storage is expensive, backups consume a lot of resources and space, and recovery is slow.)

4. In professional GIS scenarios, the open source products fall short on performance and functionality.

With the development of the Internet of Things, intelligent terminals and the mobile Internet, more and more mobile data is being collected, and application demand for GIS data processing keeps growing. According to analysts, GIS is already a market worth hundreds of billions, but open source GIS products may not be able to meet this growing demand.

5. Senior DBAs are hard to find and expensive.

A senior DBA is a position that large enterprises can afford; such people are expensive and scarce. Their daily routine may look like drinking tea and chatting, because everything is under control and problems are prevented before they happen. This kind of DBA is usually hard to find.

In most companies, a system administrator (SA) or a developer acts as a part-time DBA on top of their daily work. When something goes wrong with the database, they have to handle it. But every trade has its specialists, and an SA or developer usually takes a long time to resolve database problems, whether they are performance or management issues.

New features released in 2.0

POLARDB 2.0 fully inherits the 1.0 architecture and adds compatibility with two other popular databases, Oracle and PostgreSQL.

POLARDB for PostgreSQL

Fully compatible with PostgreSQL. Supports compute-storage separation, independent scaling of each layer, and pay-by-storage billing. Suitable for the core business of medium and large enterprises.

[OLTP+OLAP mixed load]

Supports mixed workloads, high concurrency at the level of millions of requests, parallel computing, and session-level resource isolation.

One instance and one copy of the data support online services and real-time analysis at the same time.

Previously, users had to synchronize data from the online database to a data warehouse, which caused many problems. POLARDB v2.0 solves the problems of delay, consistency, cost and usage habits caused by cross-product data synchronization.

1. Technical indicators:

Supports up to 16 compute nodes, with 88 cores per node

Each compute node can provide millions of QPS

Supports parallel computing that is completely transparent to the application, with an average speedup of more than 20 times, so complex SQL is no longer a concern

[multi-model computing]

Multi-model computing covers GIS, spatio-temporal data, time series, full-text retrieval, image recognition, multi-dimensional queries, vector similarity and machine learning.

Previously, users needed many products to handle these different business scenarios, and data had to be synchronized among them. Heterogeneous synchronization brings problems of delay, consistency, cost and usage habits.

The new engine in POLARDB v2.0 solves the above problems.

1. Technical indicators:

The Ganos professional spatio-temporal component is compatible with GIS standards. The performance of its MOD model is 50 to 100 times higher than that of PostGIS.

Built-in multi-model components such as full-text retrieval, image recognition, multi-dimensional query, vector computing and industrial time series

Built-in NoSQL features such as schemaless storage and KV

Supports up to 8 index interfaces (btree, hash, GIN inverted index, GiST spatial index, SP-GiST space-partitioned index, BRIN block-range index for time series, RUM full-text index, bloom index) to meet the requirement of high-speed retrieval over all kinds of multi-model data

POLARDB for Oracle

Highly compatible with Oracle. Reduces the risk of migrating off Oracle, shortens the migration cycle, and helps enterprises replace Oracle quickly and enter the era of cloud intelligence.

[deep Oracle compatibility]

Greatly reduces the risk of de-O and shortens its cycle. The time users need to move off Oracle drops from years to weeks.

1. Technical indicators:

Comprehensive Oracle compatibility across SQL syntax, types, functions, PL/SQL, packages, system views, OCI, Pro*C and more; compatible with advanced Oracle features such as partitioned tables, heterogeneous queries and hints; supports 3155 functions, 26 packages, 317 in-package methods and 88 system views

[intelligent driving]

POLARDB v2.0 for Oracle has a built-in SQL firewall. It prevents SQL injection and SQL misoperation, which addresses the enterprise's database security problem.

POLARDB v2.0 for Oracle has built-in index recommendation. It is a useful assistant for enterprise database optimization and can resolve index optimization problems with one click.

POLARDB v2.0 supports AAS performance insight. Even without a professional DBA, you can gain insight into macro and micro business problems with one click, which helps enterprises find business problems in time.

1. Technical indicators:

SQL learning mode to prevent SQL injection and SQL misoperation; index recommendation for one-click index optimization; AAS performance insight for one-click insight into macro and micro business problems

[cloud native]

By replacing Oracle with POLARDB v2.0, you gain POLARDB's cloud native capabilities. Through the oss_fdw interface you can read and write OSS data, separate hot and cold data, and connect to massive computing power in the cloud (function computing, MaxCompute) for powerful data processing. This helps enterprises accelerate into the DT era.

1. Technical indicators:

OSS external tables: hot and cold data can be stored separately, and historical data can be kept as long as you want; seamless connection to massive computing power in the cloud (ADB, MaxCompute, OSS function computing, etc.)

Which business scenarios and customers is 2.0 suitable for?

1. Applicable scenarios

Replace Oracle database, enterprise core database, GIS spatio-temporal database

2. Suitable customers

Enterprise customers (party, government and military, medical, new retail, new manufacturing, scientific research institutions, finance, Internet of things, transportation, aviation, maps, meteorology, surveying and mapping, LBS, land, GIS and other professional fields)

Interpretation of the key technical points in 2.0

1. Intelligent driving

1. SQL firewall, to prevent SQL injection and misoperation.

The principle behind the SQL firewall is as follows. With SQL learning mode turned on, POLARDB v2.0 for Oracle learns the SQL requests issued by the application. The database parameterizes each SQL request (replacing literal values with variables), converts it to a SQL hash, and stores the hash in a SQL whitelist.

When learning is finished, you can switch to permissive mode, in which a SQL request that is not on the whitelist triggers a warning. The DBA can follow up on the warning and determine whether the request is abnormal.

Users can also switch to mandatory mode, in which SQL requests that are not on the whitelist are rejected, which blocks SQL injection and misoperation at the source.

In addition, POLARDB v2.0 for Oracle supports rule configuration, such as rejecting DML requests without a WHERE condition and DML requests whose WHERE condition is always TRUE, to prevent SQL injection attacks and human error.
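The whitelist mechanism described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not POLARDB's implementation: the class and function names, the regular expressions used to parameterize literals, the choice of SHA-256, and the mode names ("learn", "permissive", and "enforce" for the mandatory mode) are all hypothetical. Only the "no WHERE clause" rule is shown; detecting an always-TRUE condition would need a real SQL parser.

```python
import hashlib
import re


def normalize(sql):
    """Parameterize a statement: replace literals with '?' so that queries
    differing only in their values map to the same text."""
    text = re.sub(r"'(?:[^']|'')*'", "?", sql)       # string literals
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)   # numeric literals
    return re.sub(r"\s+", " ", text).strip().lower()


def sql_hash(sql):
    """Hash of the parameterized statement (SHA-256 is an assumption)."""
    return hashlib.sha256(normalize(sql).encode("utf-8")).hexdigest()


def violates_rules(sql):
    """Example rule: UPDATE or DELETE without a WHERE clause."""
    text = normalize(sql)
    return text.startswith(("update ", "delete ")) and " where " not in f" {text} "


class SqlFirewall:
    """Hypothetical firewall with 'learn', 'permissive' and 'enforce' modes."""

    def __init__(self):
        self.mode = "learn"
        self.whitelist = set()

    def check(self, sql):
        """Return 'allow', 'warn' or 'reject' for one statement."""
        if violates_rules(sql):
            return "reject"
        digest = sql_hash(sql)
        if self.mode == "learn":
            self.whitelist.add(digest)   # learning: remember every shape seen
            return "allow"
        if digest in self.whitelist:
            return "allow"
        return "warn" if self.mode == "permissive" else "reject"


if __name__ == "__main__":
    fw = SqlFirewall()
    fw.check("SELECT * FROM users WHERE id = 42")
    fw.mode = "enforce"
    print(fw.check("SELECT * FROM users WHERE id = 7"))
    print(fw.check("SELECT * FROM users WHERE id = 7 OR 1=1"))
```

Because the hash is taken over the parameterized text, a legitimate query with a different id still matches the whitelist, while an injected `OR 1=1` changes the statement's shape and therefore its hash.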

2. Index recommendation, so that even database novices can optimize the database with one click.

Users can enable the index recommendation module in a session. Once it is enabled, the SQL requests issued by that session are analyzed in the background. After a while, call the index recommendation function to see the indexes the database recommends for the SQL executed in the current session.

3. Performance insight. This feature is very powerful: by collecting and managing wait-time data, we can see whether the database hit a performance bottleneck at any point in the past, and what the bottleneck was. Even an enterprise without a professional DBA can find database performance problems easily.
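To make the wait-sampling idea concrete, the sketch below computes an AAS figure from sampled session activity. It assumes AAS means "average active sessions", as the term is commonly used in database performance tools; the article does not spell it out. The sample layout, the wait event names and the function name are hypothetical, not POLARDB's views or API.

```python
from collections import Counter


def average_active_sessions(samples, window_s, interval_s=1.0):
    """Compute AAS over a time window from activity samples.

    samples: iterable of (timestamp_s, session_id, wait_event) rows, one row
             for each session that was active when the sampler fired.
    window_s: length of the observed window in seconds.
    interval_s: sampling interval in seconds.

    Returns (overall_aas, aas_by_wait_event).
    """
    by_event = Counter(event for _, _, event in samples)
    scale = interval_s / window_s   # each sample stands for interval_s of activity
    per_event = {event: count * scale for event, count in by_event.items()}
    return sum(per_event.values()), per_event


if __name__ == "__main__":
    rows = [
        (0, 1, "CPU"), (0, 2, "IO"),
        (1, 1, "CPU"),
        (2, 1, "LockWait"), (2, 2, "IO"),
        (3, 2, "IO"),
    ]
    total, breakdown = average_active_sessions(rows, window_s=4)
    print(total, breakdown)
```

An overall AAS well above the number of CPU cores, or a breakdown dominated by one wait event, points to the kind of bottleneck the text says can be found without a professional DBA.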

2. Parallel computing, covering dozens of scenarios, with an average performance improvement of 20 times

Parallel computing solves the problem of slow complex queries. Enterprises usually need to analyze their data. In the past, because relational databases were weak at analytical computing, the data had to be synchronized from the relational database to a big data platform for analysis. Synchronization brings delay, cost and consistency problems, and users suffer. POLARDB v2.0 has built-in parallel computing, and the degree of parallelism is planned according to the cost of the SQL (a measure of its complexity). Complex SQL runs in parallel, and the degree of parallelism is calculated automatically, so users can get real-time analysis without synchronizing data to an external system. The parallel computing of POLARDB v2.0 covers dozens of scenarios, and the measured performance improvement averages more than 20 times.
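The text says that the degree of parallelism is planned from the cost of the SQL, but it does not give the rule. The sketch below shows one hypothetical rule of that kind and is not POLARDB's planner: the function name, the cost threshold, the logarithmic growth and the worker cap are all assumptions chosen for illustration.

```python
import math


def plan_parallel_degree(estimated_cost, cost_threshold=1000.0, max_workers=8):
    """Pick a number of parallel workers from a query's estimated cost.

    Cheap queries (below cost_threshold) run serially and get 0 workers.
    Above the threshold, the worker count grows by one each time the cost
    doubles, up to max_workers.
    """
    if estimated_cost < cost_threshold:
        return 0
    workers = 1 + int(math.log2(estimated_cost / cost_threshold))
    return min(workers, max_workers)


if __name__ == "__main__":
    for cost in (500, 1000, 9000, 1e9):
        print(cost, plan_parallel_degree(cost))
```

The point of a rule like this is that short OLTP statements never pay the startup overhead of parallel workers, while expensive analytical queries receive more workers automatically, without hints from the user.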

3. Session-level resource isolation

When OLTP and OLAP workloads are mixed, OLTP has high concurrency and requires low response time (RT), while OLAP has low concurrency but is compute-intensive, so running OLAP work takes up a lot of resources. POLARDB supports 16 compute nodes, so OLAP and OLTP workloads can be isolated on different compute nodes. If the TP and AP workloads share the same compute node, session-level resource isolation is the better option; it currently supports isolation of CPU and IO resources.
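One simple way to picture CPU isolation between TP and AP sessions on the same node is proportional sharing between resource groups. The sketch below is an assumption-laden illustration, not POLARDB's mechanism: the group names, the share weights and the function name are hypothetical, and the 88-core figure is taken from the technical indicators earlier in the article.

```python
def allocate_cpu(total_cores, group_shares):
    """Split a node's cores between resource groups in proportion to
    their configured shares.

    group_shares: mapping of group name -> positive share weight.
    Returns a mapping of group name -> cores that group may use.
    """
    total_shares = sum(group_shares.values())
    if total_shares <= 0:
        raise ValueError("at least one group needs a positive share")
    return {
        name: total_cores * share / total_shares
        for name, share in group_shares.items()
    }


if __name__ == "__main__":
    # Sessions would be mapped to a group (for example by user or application)
    # and then limited to that group's allocation.
    print(allocate_cpu(88, {"oltp": 3, "olap": 1}))
```

With weights like these, a heavy analytical query in the "olap" group cannot starve the latency-sensitive "oltp" sessions of CPU.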

4. Ganos spatio-temporal multi-model module

Ganos is a 3S engine developed by Alibaba. It is compatible with GIS standards and supports planar geometry, spherical geometry, raster (grid), spatio-temporal trajectory, point cloud and topological network models, among others.

The advantages of Ganos over open source GIS are also obvious.

5. Cloud native separation of hot and cold data

POLARDB v2.0 can use OSS as a data store. Users install the oss_fdw external table plug-in and create OSS external tables, which can be written to OSS or read from OSS through the standard SQL interface. Rarely accessed cold data can therefore be stored in OSS, which reduces the cost of the database's distributed block storage and removes the limit on storage space. In addition, because OSS is connected to MaxCompute, ADB and function computing in the cloud, oss_fdw is a very good way to share data when a very large enterprise needs big-data analysis across multiple database instances: the data of all instances can be analyzed through OSS with large-scale computing.
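Deciding which data counts as "cold" is left to the user. The sketch below shows one possible policy, classifying table partitions by the date they were last accessed; partitions older than a cutoff become candidates for moving into an OSS external table. The partition names, the one-year threshold and the function name are assumptions for illustration, and the actual move would be done with SQL against the oss_fdw external table, which is not shown here.

```python
from datetime import date, timedelta


def split_hot_cold(partitions, today, cold_after_days=365):
    """Classify partitions as hot or cold by last access date.

    partitions: mapping of partition name -> date of last access.
    Returns (hot, cold) as two lists of partition names, sorted by name.
    """
    cutoff = today - timedelta(days=cold_after_days)
    hot, cold = [], []
    for name, last_access in sorted(partitions.items()):
        (cold if last_access < cutoff else hot).append(name)
    return hot, cold


if __name__ == "__main__":
    parts = {
        "orders_2017": date(2017, 12, 31),
        "orders_2018": date(2018, 12, 31),
        "orders_2019": date(2019, 5, 1),
    }
    hot, cold = split_hot_cold(parts, today=date(2019, 6, 1))
    print("keep on block storage:", hot)
    print("candidates for OSS:", cold)
```

A policy like this keeps recent, frequently queried partitions on the faster distributed block storage while older history moves to cheaper object storage and remains queryable through SQL.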

6. Why 2.0 can support multi-model data

1. Traditional databases usually support only one index type, while POLARDB v2.0 supports eight.

btree, hash, gin, brin, gist, spgist, bloom, rum

2. Traditional databases usually support only a few data types, while POLARDB v2.0 supports a large number of data types.

Time, string, numeric, currency, byte stream, bit, enumeration, Boolean, geometry, network, full-text search, UUID, JSON, XML, array, composite, range, domain, image, tree, multidimensional cube, GIS, rb, HLL, K-V, plus support for extension types

3. POLARDB v2.0 also supports many multi-model plug-ins, which greatly improve development and production efficiency.

Summary

POLARDB v2.0 for Oracle is highly compatible with Oracle. It supports intelligent driving features such as the SQL firewall, automatic index recommendation, performance insight and resource isolation, as well as the cloud native capability of hot and cold separation. It solves the enterprise de-O problem and helps enterprises move off Oracle quickly.

POLARDB v2.0 for PostgreSQL is fully compatible with PostgreSQL, supports parallel computing, mixed load, and multi-model computing such as GIS and spatio-temporal data, and has the cloud native capability of hot and cold separation. It is a good choice for moving the core databases of enterprise customers to the cloud (party, government and military, medical, new retail, new manufacturing, scientific research institutions, finance, Internet of Things, transportation, aviation, maps, meteorology, surveying and mapping, LBS, land, GIS and other professional fields).

Apply for the public beta: https://page.aliyun.com/form/act977150651/index.htm

Cloud database POLARDB: https://www.aliyun.com/product/polardb

Release highlights, scenarios, benefits, access and more: https://promotion.aliyun.com/ntms/act/polardbfororacle.html

Aliyun new product launch channel: https://promotion.aliyun.com/ntms/act/cloud/product.html

Aliyun New Product launch Weekly: https://yq.aliyun.com/articles/705813

Original address: https://yq.aliyun.com/articles/705932
