Author: Han Feng
Source: DBAplus community sharing
Themis open source address: https://github.com/CreditEaseDBA
Further reading: rule analysis and deployment strategy of the database audit software Themis
I. Challenges faced
1. Scale and types of operations and maintenance
I believe many companies and many DBAs are facing, or will soon face, the problems described below. It is precisely these problems that prompted us to consider introducing a database audit platform.
The first is the contradiction between the scale of operations and limited manpower. We operate four types of databases, Oracle, MySQL, MongoDB, and Redis, numbering in the dozens, supporting more than 1,000 developers and hundreds of business systems. Some readers may point out that this is not a very large scale.
Indeed, compared with many Internet companies, dozens of databases is not large. But unlike Internet companies, a financial company such as Yixin depends far more heavily on its databases: a large number of applications center on the database, and their usage is far more complex. Besides daily operations (whose efficiency we are also raising through a self-developed platform), DBAs must devote a great deal of energy to database design, development, and optimization. The contradiction becomes even more acute when facing the many development teams that need our services.
2. Case
Structural design
The second challenge is the uneven quality of database design and development. The figure above shows a structure design problem: in the core table of a core system, 28% of the SQL that the system runs touches this one object. When we analyzed its structure, we found many problems:
The table is very large. Split logic (sharding by database or table, partition design) was not considered from the start, and there is no necessary data cleaning or archiving strategy.
The table has more than 100 fields. The field count is high and the fields' characteristics are inconsistent, yet the necessary vertical split was never considered.
The table has 13 indexes, which is too many; excessive indexing inevitably hurts DML efficiency. Continuous monitoring also found one index that has never been used, an obviously "redundant" index, and two fields carry duplicate indexes, which shows the initial indexing was rather arbitrary.
A single record is defined at more than 5,800 bytes, yet the actual average stored length is under 400 bytes, and even the maximum is not long.
Analyzing the field contents also revealed three fields with abnormal type definitions, that is, data stored in an improper type, such as dates kept in a numeric type.
To sum up, this table's design has many problems, yet the table is very important and a large number of statements access it.
SQL statement
The figure above shows the efficiency of one statement. Literally, the two tables are joined, but no join condition is specified. As the execution plan below shows, the database executes it as a Cartesian product. From the cost and estimated time you can see what a "huge" SQL this is; the impact of running it online can be imagined.
Some may say this is a human error that generally will not happen. I would answer: first, human error is unavoidable, and no one can guarantee the quality of hand-written SQL; second, developers understand databases to different depths, so efficient SQL cannot be taken for granted; third, developers face a flood of business requirements and are often rushing, with little energy left for optimization. Because of all this, the quality of statements running online has become one of the challenges DBAs routinely face.
3. A shift in focus
This classic diagram describes the division of database-related work. Beyond the challenges above, the evolution of database work and the DBA's own development both push a shift in focus: the traditional operations duties of the DBA are gradually weakening, as a wealth of tools and platforms and improving database self-management simplify the work; database architecture, structure design, and SQL quality optimization are becoming the new focus; and some companies attach growing importance to higher-level data governance and modeling. The center of the DBA's future work is thus moving up the stack. For the data logic layer in the middle, we also need tools and platforms to better support the DBA's work.
Beyond the situation above, our company also suffers several kinds of imbalance.
In the DBA's daily work, traditional operations still account for a large share, while architecture and optimization work is comparatively rare. Introducing this platform helps DBAs take on architecture and optimization work more easily.
The company uses more commercial products and fewer open source ones, yet the long-term strategy is to use open source more and more. Functionally, commercial products still hold an edge, and building on open source demands more of developers' skills. We hoped that introducing this product would ease that transition.
Before the platform existed, DBAs designed and optimized databases by hand, which is very inefficient; facing many product lines and many development teams, we often felt stretched thin.
Our own team consists mainly of junior and intermediate staff, with relatively few senior people. How to quickly raise our overall design and optimization capability and guarantee a uniform standard of optimization became a pressing question.
It is these imbalances that prompted us to consider introducing tools and platforms to solve our database quality problems.
When I first joined the company and saw these problems, I also tried to solve them with policies and standards. We began by drafting many standards and then trained and evangelized them across departments. After running this way for a while, several problems surfaced:
The overall improvement was not obvious; the results depended on each department's attention and each individual's ability.
Compliance with the standards could not be measured, so quantitative analysis was out of reach; usually it could only be sensed intuitively from how systems ran online.
There was no long-term, effective tracking mechanism; the running quality of a given system could not be tracked over time.
From the DBA's perspective, reviewing every system's structure design and SQL quality in detail against each standard is infeasible at scale.
Facing these challenges and existing problems, how should we solve them?
After discussion, we agreed that introducing a database audit platform could help solve the problems above.
II. Platform selection
1. Industry practice
At the start of the project, I surveyed how other enterprises in the industry conduct database audits. The approaches fall roughly into three camps:
The first camp is the Internet companies represented by BAT. Through self-developed SQL engines they achieve cost analysis, automatic auditing, traffic routing, rate limiting, and so on, and can audit both beforehand and automatically. But the technical difficulty is high, and our company's existing technical capability clearly falls short.
The second camp collects the database's runtime state with self-developed tools, audits it against predefined rules, and completes the process with manual review. This approach can only audit after the fact, but it is technically simpler and more flexible. Its core is the formulation of the rule set, which can be extended flexibly as circumstances require.
The third camp is commercial products. Their approach resembles the second, with some independent analysis capability and richer features, but they still require manual intervention and substantial investment. We evaluated several commercial products, and none fully met our functional requirements.
Weighing these practices, we finally decided to develop our own audit platform in the style of "tool + manual audit".
2. Our choice: self-development
When we launched the platform's development, the team reached some consensus.
DBAs need to break out of the traditional operations mindset; everyone takes part in developing the platform.
What we accumulated in the past (such as the standards drafted earlier) can be distilled into a knowledge base and standardized, paving the way for defining rules later.
When promoting the platform, start with the simplest parts: put whatever is finished into production, observe the effect, and keep revising the follow-up work based on the results.
Set goals that fit our own situation; for the more complex parts, be ready to postpone or even drop them decisively.
Boldly borrow design ideas from other companies and from commercial products.
III. Audit platform practice
Now let's look at the platform's basic functions and how they are implemented, which is the focus of this sharing.
1. Platform positioning
At the start of the project, we defined the platform's positioning:
The platform's core capability is to quickly discover database design and SQL quality problems.
The platform only performs after-the-fact audits; the standalone optimization part comes in phase two. Of course, bringing it into a project's design phase can also serve as a partial prior review.
All work is done through a web interface; the main users are DBAs and developers with some database background.
Audits can target a given user across several dimensions: data structure, SQL text, SQL execution characteristics, SQL execution plans, and so on.
Audit results are provided in the form of a Web page or an exported file.
The platform must support our mainstream Oracle and MySQL; other databases come in phase two.
Provide flexible customization wherever possible, to ease future extension of functionality.
2. Platform users
As the platform's two main user groups, both developers and DBAs benefit from it.
For developers, the platform makes it easy to locate problems and fix them promptly; moreover, mastering the rules also guides their own design and development work.
For DBAs, it offers a quick grasp of the overall state of many systems, screens out inefficient SQL in batches, and supports quick diagnosis of common problems from the information the platform provides.
3. Realization principle
The platform's basic principle is very simple: the audit objects (four kinds are currently supported) are filtered through the rule set, and whatever matches a rule is flagged as a suspected problem. The platform presents these findings together with related information for manual screening. Clearly, how powerful the platform is depends mainly on how rich its rule set is; the platform also provides some extensibility to make the rule set easy to expand. A minimal sketch of this filtering principle appears below.
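As a toy illustration of this principle (all names here are invented for the sketch, not the platform's actual API), a rule carries a status, a deduction weight, and a predicate over the collected data, and the audit simply keeps whatever matches:

```python
def audit(objects, rules):
    """Filter audit objects through the rule set; matches become findings."""
    findings = []
    for rule in rules:
        if rule["rule_status"] != "ON":
            continue  # disabled rules are ignored during an audit
        for obj in objects:
            if rule["check"](obj):  # the rule's predicate over collected data
                findings.append({
                    "rule_name": rule["rule_name"],
                    "object": obj["name"],
                    "deduction": rule["weight"],
                })
    return findings

# One toy rule: flag unpartitioned tables above ten million rows.
rules = [{
    "rule_name": "big_table_unpartitioned",
    "rule_status": "ON",
    "weight": 5,
    "check": lambda t: t["num_rows"] > 10_000_000 and not t["partitioned"],
}]
objects = [{"name": "ORDERS", "num_rows": 50_000_000, "partitioned": False}]
print(audit(objects, rules))  # one finding for ORDERS
```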
4. Platform design
Audit object
Before introducing the implementation, let's get familiar with the concept of the "audit object". Four types are currently supported; each is explained below.
Object level. The objects here are database objects: tables, partitions, indexes, views, triggers, and so on. A typical rule: a large table that is not partitioned.
Statement level. This refers to the SQL statement text itself. A typical rule: a join across too many tables.
Execution plan level. This refers to the SQL's execution plan inside the database. A typical rule: a full scan of a large table.
Execution feature level. This refers to how the statement actually executes on the database. A typical rule: a low ratio of returned records to scanned blocks.
Note that of these four types of audit objects, the latter three can only be captured after the system is online, while the first needs only the data structures (a few rules also need data).
Also, except for the second type, whose rules are generic, the rules are tied to a specific database; each kind of database has its own rules.
Architecture diagram
Above is a schematic of the system architecture; let me explain it briefly.
The boxes in the figure are the platform's main modules. Different background colors indicate different progress status. Dotted lines represent data flow and solid lines control flow. The core modules are:
Data acquisition module. Responsible for fetching the base data needed for audits from the data sources. Currently, fetching from Oracle and MySQL is supported.
OBJ/SQL repository. The system's common storage: the collected data, and the intermediate and result data produced during processing, all live here. Its core data divide into an object class and a SQL class. Physically it is MongoDB.
Core management module. The dashed box on the right of the figure contains two modules, SQL management and OBJ management, which handle the full life cycle of these objects. So far only a simple object filter has been built, hence the white background; the core functionality is not yet complete.
Audit rule and audit engine modules. These are the core components of the platform's first phase: the rule module defines and configures the rules, and the engine is the executive part that carries out the audits for specific rules.
Optimization rule and optimization engine modules. These are the core components of the platform's second phase. They have not been developed yet, hence the white background.
System management module. This provides the platform's base functions, such as task scheduling, space management, and audit report generation and export.
Flow chart
Let's look at the platform's overall workflow from a process perspective.
1) The "rule management" part mainly covers the following functions.
Rule initialization. The platform ships with many built-in rules, which this step imports into the configuration repository.
Adding rules. The platform provides some extensibility, and new rules can be added according to the specification.
Modifying rules. You can enable or disable rules to suit your own situation. Each rule also has built-in parameters that can be modified here. In addition, for violations you can set a deduction scheme (for example, how many points per violation and the cap on deductions).
The rules themselves and their parameters, configuration information, and so on are stored in the configuration repository.
2) The "task management" part belongs to background management and handles task-related work. Most interactions in the system are performed asynchronously as jobs; the backend is implemented with celery+flower (a hedged sketch of such a background task appears after this workflow description).
3) The "data acquisition" part is mostly completed by scheduled tasks, with a small portion querying the online databases in real time. The collected results are saved to the repository for the subsequent analysis stages.
4) The "rule parsing" part is triggered by the user through the interface; the task scheduler then starts a background asynchronous task to do the parsing. It is asynchronous mainly because an audit can take a long time (especially when there are more audit categories, more audit objects, and more rules enabled). The audit results are saved to the repository.
5) In the "task view and export" part, after launching an audit task the user can check its progress (running or finished). Once the audit finishes, the user can select the task and browse the results or export them; choosing export spawns an asynchronous background job that generates a file and places it on the download server.
That is the general audit workflow; the details of each part come later.
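A hedged sketch of such a background job, assuming Celery as named above (the broker URL, task name, and helper functions are placeholders, not the platform's real code):

```python
from celery import Celery

app = Celery("themis", broker="redis://localhost:6379/0")  # broker URL is an assumption

def collect(host, port, schema, audit_date):
    """Placeholder for the data acquisition step described above."""
    return []

def apply_rules(data, audit_type):
    """Placeholder for the rule parsing step."""
    return []

@app.task
def run_audit(host, port, schema, audit_type, audit_date):
    # One audit job: collect, parse, and return a summary; the web layer
    # polls the task state to show "running" vs. "finished".
    data = collect(host, port, schema, audit_date)
    findings = apply_rules(data, audit_type)
    return {"status": "done", "findings": len(findings)}

# The interface only enqueues the job and returns immediately, e.g.:
# run_audit.delay("10.0.0.1", 3306, "appdb", "OBJ", "2019-05-01")
```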
Module breakdown
To sum up, the platform consists of four main modules: data collection, rule parsing, system management, and result presentation. The implementation of each module is described in detail below.
5. Data acquisition
Collected content
Let's start with the data acquisition module. As the table shows, the content collected differs between the two database types.
Oracle exposes a wealth of information, essentially all of which can be collected; MySQL exposes less, so relatively less can be collected.
A "check mark + asterisk" in the table means the item is not collected by a scheduled job but fetched from the source library in real time. The collection of each part is briefly described below.
At the object level, we collect object statistics, storage characteristics, structure information, and access characteristics.
At the SQL level, we collect SQL text, execution plans, cached cursors, bind variables, execution characteristics, and so on.
This information serves as the basis for the later audits.
Collection principle
Here is a brief description of how collection works:
For Oracle, scheduled jobs collect AWR data and dump it into a MongoDB cluster. Unlike some similar products, we do not collect directly from memory but from this offline data, so as to minimize the impact on the production systems. Oracle's functionality is rich: by reading AWR and the data dictionary, essentially all the needed data can be obtained. A hedged sketch of this collection path follows.
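A minimal sketch of that path, assuming the AWR view dba_hist_sql_plan as the offline source and a themis.sqlplan collection as the target (connection strings and the collection name are invented for illustration):

```python
import cx_Oracle
from pymongo import MongoClient

# Read plan lines from AWR (offline data) instead of live memory.
ora = cx_Oracle.connect("perfstat", "secret", "oradb:1521/orcl")
coll = MongoClient("mongodb://localhost:27017")["themis"]["sqlplan"]

cur = ora.cursor()
cur.execute("""
    SELECT sql_id, plan_hash_value, id, parent_id,
           operation, options, object_owner, object_name, cost
      FROM dba_hist_sql_plan
""")
cols = [d[0] for d in cur.description]
docs = [dict(zip(cols, row)) for row in cur]
if docs:
    coll.insert_many(docs)  # schemaless storage; see the storage notes below
```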
For MySQL the situation is a bit more complicated because its functionality is not as rich, so different kinds of data come from different sources. SQL text and execution characteristics come from the slow query log: the pt tool analyzes the log and loads it into the Anemometer platform's library on a schedule, and from that library the data is imported into MongoDB. Other kinds of information (data dictionary, execution plans, etc.) are queried from the source library in real time when needed; to keep the master unaffected, these queries are generally routed to a slave.
6. Rule parsing
Summary description
Now for the core of the whole system, the rule parsing module. It audits the collected data against the defined rules and screens out the data that violates them. The screened-out findings are scored and recorded for the subsequent audit report; additional information is also recorded to assist later judgment.
The core concept here is the "rule". The definition of a built-in rule is shown later, which will make this concrete. Rules can be classified along several axes.
By database type, rules divide into Oracle and MySQL ones. Not all rules distinguish between databases; text-class rules do not.
By complexity, rules divide into simple and complex. The simplicity or complexity refers to how the rule's audit is implemented: a simple rule can be described as a query against MongoDB or a relational database, while a complex rule must be implemented externally as a program.
By audit object, rules divide into object, text, execution plan, and execution feature classes. Each type is explained separately below.
Rule definition
This is the rule body declaring a rule; explaining each field gives a clear picture of what a rule is.
Db_type: the database type the rule applies to; Oracle and MySQL are supported.
Input_parms: input parameters. A rule can define multiple input parameters; this is a list, and each parameter is itself a dictionary describing the parameter's details.
Output_parms: output parameters. Like the input parameters, a list of dictionary objects; it describes the structure of the information the rule returns.
Rule_complexity: whether the rule is simple or complex. For a simple rule, the rule_cmd content is taken directly as the audit implementation; for a complex rule, the implementation is an externally defined command script looked up by rule_name.
Rule_cmd: the implementation part of the rule. Depending on rule_type, it may be a mongodb query statement or a regular expression.
Rule_desc: the rule's description, for display only.
Rule_name: the rule's name, its globally unique identifier.
Rule_status: the rule's status, ON or OFF. A disabled rule is ignored during audits.
Rule_summary: a deprecated field with the same meaning as rule_desc.
Rule_type: the rule type, one of four classes: object, text, execution plan, and execution feature. The example in the figure is a text-type rule, and its rule_cmd is a regular expression.
Solution: the optimization suggestion for when this rule fires.
Weight: the points deducted per single violation of the rule.
Max_score: the deduction cap, set so that violations of a single rule cannot affect the score too much.
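Under those field definitions, a rule document might look like the following (the values are illustrative only; the field names come from the text above):

```python
rule = {
    "db_type": "mysql",
    "rule_name": "bad_join",                 # globally unique identifier
    "rule_desc": "Table join written without a join condition",
    "rule_summary": "Table join written without a join condition",  # deprecated
    "rule_complexity": "simple",             # simple: rule_cmd is used as-is
    "rule_type": "TEXT",                     # object / text / plan / feature class
    "rule_cmd": r"from\s+\S+\s*,\s*\S+",     # a regex, since this is a text rule
    "rule_status": "ON",
    "input_parms": [],                       # list of parameter dictionaries
    "output_parms": [{"parm_desc": "sql text"}],
    "weight": 2,                             # points deducted per violation
    "max_score": 10,                         # cap on this rule's total deduction
    "solution": "Specify an explicit join condition for every table join.",
}
```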
Rule definition (object level)
Let's look at the first kind of rule, object rules: a set of rules for database objects. The table above shows some examples. Common objects such as tables, partitions, indexes, fields, functions, stored procedures, triggers, constraints, and sequences are all audit objects. Taking tables as an example, there are many built-in rules.
For example, the first is "too many large tables": the number of large tables in a database exceeds the threshold the rule defines. What counts as large is determined by the rule's input parameters, including the table's record count and physical size. The rule as a whole reads: "if the number of tables exceeding the specified size or the specified record count is above the specified threshold, the audit rule is triggered." Rules for other objects are similar.
Rule implementation (object level)
Object rules are relatively simple to implement: apart from a few exceptions, they query data dictionary information and judge it against the rule's definition. The example above is an index rule implemented by querying data dictionary information; a hedged sketch in the same spirit follows.
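For instance, a simple object rule such as "large table not partitioned" can, once the dictionary data has been collected, reduce to a single query. This sketch assumes the collected table statistics sit in a MongoDB collection with the field names shown (all of them illustrative):

```python
from pymongo import MongoClient

tables = MongoClient("mongodb://localhost:27017")["themis"]["obj_tables"]

def big_table_unpartitioned(num_rows_limit, size_mb_limit):
    # Flag tables exceeding either input parameter that are not partitioned.
    return list(tables.find({
        "partitioned": "NO",
        "$or": [
            {"num_rows": {"$gt": num_rows_limit}},
            {"size_mb": {"$gt": size_mb_limit}},
        ],
    }))
```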
Rule definition (execution plan level)
The second kind is execution plan rules, which again fall into several categories, such as access path, table join, type conversion, and bind variable rules.
Take the most common category, access paths, and its most common rule, "full scan of a large table". It fires when, during the execution of a SQL statement, a large table is accessed via a full table scan. The rule's input parameters contain the definition of a large table (physical size or record count), while the output includes the table name, table size, and additional information (the whole execution plan, statistics for the large table in question, etc.).
These rules run against data captured from the online databases: the Oracle part is extracted directly from AWR by time window, and the MySQL part is obtained by running the explain command against the database.
Information storage format
A note on storage: execution plans are saved in MongoDB, a document database, precisely to exploit its schemaless nature, so that plans from different databases and different versions can all live in one collection, and rule audits can then be implemented as mongo queries. This was the original motivation for introducing mongo, and other kinds of information have since moved into it as well; today the whole audit platform, apart from the part that connects to the pt tool over MySQL, lives in MongoDB. Conveniently, MySQL can output execution plans directly in json format, which loads straight into the library; the Oracle plans are likewise shaped into json before loading.
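A hedged sketch of the MySQL side of this: EXPLAIN FORMAT=JSON already yields a document, so it can go into MongoDB essentially as-is (pymysql is one possible driver; host, credentials, and collection names are invented, and the query is aimed at a slave as described above):

```python
import json
import pymysql
from pymongo import MongoClient

conn = pymysql.connect(host="mysql-slave", user="audit",
                       password="secret", db="appdb")
plans = MongoClient("mongodb://localhost:27017")["themis"]["mysql_plan"]

with conn.cursor() as cur:
    cur.execute("EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE user_id = 1")
    doc = json.loads(cur.fetchone()[0])  # the JSON plan comes back as one column
    plans.insert_one(doc)                # schemaless: no fixed plan shape needed
```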
Rule implementation (execution plan)
On the left is what an Oracle execution plan looks like in MongoDB; it is simply the sqlplan dictionary data inserted into mongo. On the right is an example rule implementation, a query statement against mongo. A detailed example follows shortly.
7. Implementation of the platform
Rule realization
Take the "full scan of a large table" rule as an example. Above is the execution plan as saved in Oracle's data dictionary; below is the same plan stored in Mongo. As you can see, it is copied over verbatim.
Given this structure, how is rule filtering implemented? Through a find statement in mongo. Here is a step-by-step interpretation of that statement (a runnable sketch follows the list).
The find() at the top filters execution plans, keeping those that match the specified user, time range, and access path ("TABLE ACCESS" + "FULL").
The filtered plans are then joined against the object data to keep only those that hit a "large table": record count above the specified parameter, or physical size above the specified parameter.
For the qualifying results, the sql_id, plan_hash_value, and object_name are returned; these three pieces of information are used to pull the SQL statement, the execution plan, and the associated object information, respectively.
Every result in the set is then deducted points according to the deduction principle established earlier.
The three parts of extracted information plus the deduction information are returned as the result and displayed in the front end.
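A hedged reconstruction of those steps with pymongo (the field names mirror Oracle's sqlplan columns, but the collection layout, the date field, and the helper parameters are assumptions):

```python
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["themis"]

def full_scan_of_big_table(username, start, end, num_rows_limit, size_mb_limit):
    # Step 1: plans with a TABLE ACCESS FULL step in the user/time window.
    plans = db["sqlplan"].find(
        {
            "USERNAME": username,
            "ETL_DATE": {"$gte": start, "$lte": end},
            "OPERATION": "TABLE ACCESS",
            "OPTIONS": "FULL",
        },
        {"SQL_ID": 1, "PLAN_HASH_VALUE": 1, "OBJECT_NAME": 1, "_id": 0},
    )
    findings = []
    for p in plans:
        # Step 2: keep only scans that hit a "large" table.
        big = db["obj_tables"].find_one({
            "table_name": p["OBJECT_NAME"],
            "$or": [{"num_rows": {"$gt": num_rows_limit}},
                    {"size_mb": {"$gt": size_mb_limit}}],
        })
        if big:
            findings.append(p)  # sql_id / plan_hash_value / object_name
    return findings
```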
Rule implementation (execution plan)
This section is an example of hierarchical result storage for MySQL.
The first figure shows the original execution plan.
The second figure summarizes the code implementation.
The third figure shows what it actually looks like in the repository; the core is the generation of item_level.
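A sketch of what generating item_level can look like: walk the nested EXPLAIN FORMAT=JSON document and emit one flat row per access step, each tagged with its depth (key names vary across MySQL versions; this is illustrative, not the platform's actual code):

```python
def flatten_plan(node, level=0, out=None):
    """Flatten a nested EXPLAIN FORMAT=JSON document into leveled rows."""
    if out is None:
        out = []
    if isinstance(node, dict):
        if "table" in node:  # a concrete table-access step in the JSON plan
            out.append({
                "item_level": level,
                "table": node["table"].get("table_name"),
                "access_type": node["table"].get("access_type"),
            })
        for child in node.values():
            flatten_plan(child, level + 1, out)
    elif isinstance(node, list):
        for child in node:
            flatten_plan(child, level, out)
    return out
```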
Rule definition (text level)
The third kind is text rules: rules that are independent of database type and describe the textual features of a SQL statement. They are implemented by regular-expression matching over the text or in program code. Their main purpose is to standardize developers' SQL writing and head off complex, slow, or non-standard constructions.
Rule implementation (text level)
This part describes the implementation of text rules. The first example, bad_join, is a simple rule implemented by regular-expression matching over the text. The second example, sub_query, detects subqueries (including multi-level ones) by having the program track the nesting of parentheses. Hedged sketches of both follow.
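Both ideas in miniature (the regex and the way the threshold is applied are simplified for illustration):

```python
import re

# bad_join: a comma join with no WHERE clause at all (crude approximation).
BAD_JOIN = re.compile(r"\bfrom\s+\S+\s*,\s*\S+", re.IGNORECASE)

def is_bad_join(sql):
    return bool(BAD_JOIN.search(sql)) and "where" not in sql.lower()

# sub_query: judge subquery nesting by tracking parenthesis depth.
def subquery_depth(sql):
    depth = max_depth = 0
    for ch in sql:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth  # flag the statement when this exceeds the rule's threshold

print(is_bad_join("SELECT * FROM a, b"))             # True
print(subquery_depth("SELECT * FROM (SELECT 1) t"))  # 1
```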
Rule definition (execution feature level)
The last kind is execution feature rules. These are closely tied to the database and screen out statements matching certain runtime characteristics. Such statements are not necessarily inefficient; they may simply be the future focus of optimization, or the statements where optimization yields the greatest benefit. This part mainly concerns resource consumption and the like.
8. System management
Rule management
The following interface screenshots introduce the platform's functions.
First, the rule management part of the system management module. Here you can add your own rules. The core is the rule implementation part, where the rule body is defined as a SQL statement, a Mongo query statement, or a custom Python file. Custom rules work over the data sources already being collected, so the rule author must be familiar with the existing data structures and their meaning; custom collection sources are not yet supported.
Defined rules can also be modified here, chiefly their status, thresholds, deduction points, and so on.
Task management
Once the rules are configured, tasks can be published here.
Above is the interface for publishing a rule task. After selecting the data source (ip, port, schema), choose the audit type and audit date. Since the data sources are currently collected on a daily schedule, only whole dates can be selected.
Once the task is published, you can watch its execution in the task results view. Depending on the audit type and the data source's number of objects and statements, the duration varies, but an audit usually finishes within 5 minutes. When the job status shows success, the audit is complete and the results can be viewed or exported.
9. Result presentation
Overview of object audit results
The figure above is an example object audit report. It opens with an overview page that highlights the rules violated and the points deducted in this report, with a pie chart showing their proportions. This makes it easier to concentrate on the core problems first.
At the top you can also see the rule total score: the rule deductions converted onto a 100-point scale. The higher the score, the fewer the violations and the higher the quality of the audited object. Introducing the "rule total score" was somewhat controversial at design time, for fear the indicator would dampen developers' enthusiasm and hinder the platform's adoption. A few points of explanation:
The total score exists to digitize the database's design, development, and running quality. In many past optimizations it was hard to quantify the effect before and after; this provides a way to compare. The approach may not be very scientific, but it is at least a quantifiable means.
Business systems differ greatly, so there is no need for horizontal comparison: system A scoring 60 and system B scoring 50 does not mean A's quality is higher than B's.
A single system is better served by vertical comparison, that is, comparing the total score before and after a change or optimization; this reflects the change in the system's quality to some degree.
The total score depends heavily on rule configuration: disabling rules or lowering violation thresholds yields higher scores. Configuration should follow the system's own situation, and the same rule may carry different thresholds for different systems. In a data warehouse application, for example, full scans of large tables are fairly normal behavior, so you might disable that rule or lower the single-violation deduction and the cap on total deductions.
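The text does not give the conversion formula, but one plausible scheme consistent with the weight/max_score fields (an assumption, not the platform's documented formula) is to cap each rule's deduction and subtract the total from 100:

```python
def total_score(violations, rules):
    """violations: {rule_name: count}; rules: {rule_name: {weight, max_score}}."""
    deducted = 0
    for name, count in violations.items():
        r = rules[name]
        deducted += min(count * r["weight"], r["max_score"])  # per-rule cap
    return max(0, 100 - deducted)

# Example: 8 full scans at 5 points each, capped at 20 -> score 80.
print(total_score({"full_scan": 8}, {"full_scan": {"weight": 5, "max_score": 20}}))
```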
Details of object audit results
This section shows the object audit details corresponding to each rule; the links on the left drill further into the object information. Space is limited, so no more is shown here.
Overview of execution plan audit results
The overview of the execution plan audit is similar to the object one: points deducted per rule.
Details of execution plan audit results
This part shows the execution plan details.
Expanding an entry reveals the details of each rule violation; the image above is the detail section for violations of the full table scan rule.
Above that are some general solution descriptions: the situations that can trigger this kind of rule and how to address them. It amounts to a small knowledge base that makes it easy for developers to optimize. Later, in the platform's second phase, a more precise optimization engine will carry this further.
Below that is each violating statement: the statement text, the execution plan, the associated information (for this kind of rule, the large table's name), and so on. You can click a statement to expand further information.
That expanded section holds the information for each SQL: statement text, execution plan, execution characteristics, statistics on the associated objects, and so on. From this information a DBA can make some preliminary optimization judgments.
In addition, the platform provides export: results can be exported as an excel file for users to download and view, as shown here.
10. Pitfalls we encountered
We ran into many problems during development. Here are two of them briefly, both exposed by MySQL when parsing json-format execution plans.
[The session enters the sleep state and plays dead]
Solution: set wait_timeout=3 in the session before executing; adjust the value to the actual situation.
[The data volume is too large and no result comes for a long time]
The session stays in the query state, but because the data volume is very large, or because the database's support for format=json is imperfect and parsing stalls, it produces no result for a long time and affects other sessions.
Solution: kill the session with the pt-kill tool. To prevent killing the wrong sessions, tag the statement with the marker "eXplAin" and have pt-kill match on that keyword.
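A sketch of both workarounds together, under the same assumptions as the earlier MySQL examples (connection details invented; the pt-kill line shows one way to match the marker using its --match-info option):

```python
import pymysql

conn = pymysql.connect(host="mysql-slave", user="audit",
                       password="secret", db="appdb")
with conn.cursor() as cur:
    cur.execute("SET wait_timeout = 3")  # pit 1: don't leave sleeping sessions behind
    # Pit 2: the odd casing "eXplAin" tags our sessions for pt-kill to find.
    cur.execute("eXplAin format=json SELECT * FROM orders WHERE user_id = 1")
    plan_json = cur.fetchone()[0]

# Server side, reap only the tagged sessions that have run too long, e.g.:
#   pt-kill --match-info 'eXplAin' --busy-time 60 --kill --victims all
```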
11. Rollout process
Since going into operation at Yixin, the platform has produced audit reports for many systems, greatly accelerating database structure and SQL optimization and easing the DBAs' daily workload. In the course of rolling it out we also worked out a set of implementation methods; now that the platform is open source, friends who adopt it can use this as a reference.
Information gathering stage
Broadly collect the running state of the company's database systems to get first-hand information, quickly understand the quality of each business system, and choose the pilots well.
Manual analysis stage
For key systems, intervene with manual analysis: based on the core problems the rule audit exposes, produce targeted analysis and optimization reports.
Communication training stage
Visit the development teams proactively, communicate, and report the findings. Use the analysis report as an opportunity to carry out the necessary training; combined with cases from their own systems, it is far more persuasive.
Feedback improvement stage
Follow up on what was agreed in those communications and push for improvement, raising quality through regular feedback from the audit platform. Teams with the capability can open the platform up for developers to use themselves, so that SQL quality is no longer just the DBA's problem but a concern of everyone on the project.
Content source: Yixin Institute of Technology