In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-03-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly explains "what are the major new functions of MySQL". The explanation content in this article is simple and clear, and it is easy to learn and understand. Please follow the ideas of Xiaobian to study and learn "what are the major new functions of MySQL" together.
I. MySQL's natural short board "data analysis"
MySQL, as one of the most popular open source databases, has been widely used. As can be seen from the latest db-engines index, it occupies an important position in the database field.
However, as such a popular database product, there is an obvious shortcoming is data analysis. I believe MySQL users have this feeling, MySQL is somewhat inadequate in large-scale data situations. Although its kernel is also constantly improving its capabilities in this area (such as the latest 8.0 support hash join, histogram, etc.), it still has certain disadvantages compared to other databases. At its core, MySQL was designed natively for OLTP scenarios and did not consider OLAP scenarios.
Although some vendors have expanded their storage engines to enhance their data analytics capabilities, overall they are still unsatisfactory. Thus, a common option is to migrate data to another database/big data architecture when analyzing data. Through the latter, to complete the data analysis work. In this process, developers need to define their own ETL logic (possibly based on log parsing or logical data extraction) to complete the data migration. At the same time, it is necessary to pay attention to data synchronization at all times during operation to ensure data validity. In terms of usage, separate statements (statements from heterogeneous data sources) need to be written to complete the data analysis action.
The whole process undoubtedly has a certain threshold for users, and it takes extra effort to solve this problem. I believe that Oracle original factory also saw this problem, so it launched MySQL Analytics Engine.
2 What is MAE?
MySQL Analytics Engine (MAE) is a built-in analytics engine. Combined with MySQL Database, it enables database administrators and application developers to directly use MySQL databases as a unified service for OLTP and OLAP workloads. MySQL Database Service with Analytics Engine consists of a MySQL DB instance and multiple analytics nodes. When analytics is enabled, analytics services are installed on DB instances and are responsible for cluster management, data loading, query execution, etc. From the user's point of view, it can be connected using the standard MySQL JDBC/ODBC Connector. Let's take a closer look.
First of all, from the overall use point of view, to provide a unified MySQL Database Service. Users still use it in traditional ways, whether it's OLTP or OLAP scenarios. Inside this service, there is a traditional OLTP engine and a new Analytics engine. In terms of data, there are also two copies. The former is still stored in a storage engine such as InnoDB, while the latter is stored in the memory of a cluster of nodes. Changes to the data caused by normal transaction operations are transparently propagated to subsequent analysis clusters to accelerate analysis processing. This allows customers to run OLTP and OLAP workloads simultaneously on a single database platform. In short, two compute engines, two data stores.
The core working idea of MAE is to "reduce the size". Through the partition mechanism, the data is scattered and processed by independent CPU resources. The results of the processing are returned uniformly.
Within MAE, it consists of multiple analysis nodes. The exact number of nodes can be automatically obtained by the automatic configuration consultant provided by MySQL analysis engine. In nodes, data is stored in a format with mixed column compression. This helps vectorization, resulting in very good query performance. Data is encoded and compressed before it is run in memory. This compression and optimized memory usage, especially for numeric and string data, improves performance and reduces memory footprint, resulting in lower costs for customers. Parallel operation techniques are also used in each node, which provides high cache hit rates for analysis and good scalability between nodes. Each analysis node within a cluster and each core within a node can process partitioned data in parallel, including parallel scans, joins, grouping, aggregation, and top-k processing.
The MySQL analysis engine implements the latest algorithms for distributed memory analysis processing. By using vectorized build and probe connection kernels, connections within partitions can be processed quickly. Network communication between analysis nodes is optimized by using asynchronous batch I/O. The algorithm is designed to overlap computation time with data traffic across nodes, which helps achieve good scalability.
The integration of MySQL analytics engine with MySQL database services provides a single data management platform for all OLTP and analytics needs of an enterprise. The MySQL analysis engine is designed as a MySQL pluggable storage engine that completely masks all the underlying implementation details of the storage layer from the end user. Users and applications interact with MySQL analytics through MySQL database nodes in the cluster. Users connect to the MySQL analysis engine through standard tools and standards-based ODBC/JDBC connectors.
MySQL analysis engine supports the same ANSI SQL standard and ACID attributes as MySQL, and supports different data types. This allows existing applications to leverage the MySQL analysis engine without making any changes to their applications, enabling simple and fast integration. Once a user submits a query to the MySQL database, the MySQL query optimizer transparently decides whether to offload the query to an analysis cluster to speed up execution. This is based on whether the MySQL analysis engine supports all operators and functions referenced in the query and when it takes less estimated time to process the query using the analysis engine than in MySQL, the query will be pushed down to the analysis node for processing. After processing, the results are sent back to the MySQL database node and returned to the user.
Since MySQL analysis engine is an in-memory processing engine, data is persisted to MySQL InnoDB storage engine. Any updates to the table are automatically propagated to the memory of the analysis node in real time. This enables subsequent queries to always have access to the most recent data. This is done in the background by a lightweight change propagation algorithm that keeps up with MySQL's data update rate. Synchronization principle not explained?
What can MAE do for us?
From a product perspective, MySQL Analytics Engine is a cloud-native service delivered exclusively within Oracle Cloud Infrastructure that provides compelling performance and cost for analytics workloads. Organizations that use MySQL databases to manage enterprise data can now run analytical queries using MySQL Analytics Engine with significantly improved performance, lower cost, no ETL required, and support for real-time analytics. This service can be deployed only in the cloud or in hybrid environments, simplifying the management of transactional and analytical applications. From its launch products, it can be summarized into the following concepts:
ONE DB
The core value point of this product is unified processing of scenarios. Before MySQL was able to handle mixed business scenarios, it had to use the following methods (as shown below):
Through the introduction of MAE, MySQL capabilities are enhanced to achieve uniform results.
ONE SQL
The second value point is not only to unify the platform, but also to use the same way of interaction. The original logic does not need to be changed. No change is required for both front-end transaction scenarios and back-end data visualization analysis. This is undoubtedly attractive to users and can effectively protect users 'existing software assets.
NO ETL
For the user, there is no need to care about the details of data synchronization and no need to write additional ETL jobs. This will greatly reduce the burden on users.
NO TUNING
Before MAE, there were only two ways users could solve data analytics problems. Either solve it inside the library or solve it outside the library. Resolving it in-library requires a lot of optimization work on MySQL or a storage engine for analysis scenarios, which brings with it an optimized workload. The same is true for out-of-library scenarios, where users need to do the optimization themselves.
MAE does not need to worry about this. It leverages automated machine learning (AutoML) capabilities developed by Oracle Labs to automate all aspects of the service. Because this automation is based on machine learning, the system can intelligently predict various scenarios and take actions, including automatically estimating the number of analytics nodes required for the workload.
When the service starts, the database tables that run the analysis queries need to be loaded into MySQL analysis cluster memory. The size of the cluster required depends on the tables and columns required to load, as well as the compression of this data in memory. In a traditional configuration, the user needs to guess the size of the cluster. Because of space constraints, underestimation can cause data loads or query execution to fail. Overestimation results in additional costs for unnecessary resources. Therefore, users iterate until they determine the correct cluster size, which becomes inaccurate when the table is updated. MAE can automatically complete the above process.
HIGH PERFORMANCE
MAE is a distributed, scalable, in-memory, hybrid column-based query engine that speeds performance through vectorization in memory and massive inter-and intra-node parallel processing. Query processing has also been optimized for Oracle cloud infrastructure, including inter-node network bandwidth optimization. Through these capabilities, MAE provides powerful analytical capabilities. Below is a comparison of performance against its main competitors.
MySQL Analytics vs MySQL
MySQL Analytics vs Amazon Aurora
MySQL Analytics vs Amazon Redshift
LOW COST
The cost of using MySQL database services with an analysis engine depends on the number of analysis nodes configured. The size of the analytics cluster depends on the size of the dataset and the characteristics of the workload. An analysis node can hold approximately 400GB of data. When customers migrate to MySQL database services using Analytics Engine, their costs are expected to decrease significantly. Compared to Amazon Aurora and Redshift, MySQL analytics engines cost one-third as much.
EASY SCALE
MAE currently supports up to 24 analytics nodes (thousands of cores) in a cluster with a processing capacity of approximately 10 TB of analytics data. 10TB is the approximate amount of data that can fill the memory of an analysis node at a given time. There is no limit to the amount of data stored in the MySQL database, and customers can choose which tables or columns to load from the MySQL database schema into the memory of the analysis node. If the query no longer requires these tables, users can delete them from memory to make room for other data.
EASY USE
MAE hides a lot of details behind and is very simple to use for front-end customers. All you need to do is configure the analysis cluster according to the recommended size, configure the objects for pre-accelerated queries, and manually complete the first load. Later, you can enjoy the acceleration brought by the analysis cluster.
By looking at the execution plan, you can visually see if analysis clusters are used (Using secondary engine RAPID in the figure below)
In addition, for hybrid cloud scenarios (i.e. customers who cannot deploy data to the cloud), MySQL replication can be leveraged to replicate local MySQL data to MySQL analytics engines, again without ETL. MAE is currently only available on Oracle Cloud Infrastructure (OCI) Gen 2 hardware platforms.
written in the end
The appearance of MAE makes up for MySQL's shortcomings in data analysis scenarios. We believe this feature will expand MySQL's use even further, paving the way for larger, more important enterprise applications. Unfortunately, MAE can only be used in OCI at present, and the majority of offline users in China cannot use it.
Thank you for your reading. The above is the content of "What are the major new functions of MySQL?" After studying this article, I believe everyone has a deeper understanding of what major new functions MySQL has. The specific use situation still needs to be verified by practice. Here is, Xiaobian will push more articles related to knowledge points for everyone, welcome to pay attention!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.