A detailed explanation of MySQL interview questions from front-line big companies


1. The replication principle and process of MySQL

Basic principle, process, the three threads involved, and how they relate.

Master: the binlog thread records every statement that changes data and writes it to the binlog on the master.

Slave: the I/O thread, after start slave is issued, pulls binlog content from the master and writes it into the slave's own relay log.

Slave: the SQL thread executes the statements in the relay log.
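A minimal sketch of wiring these threads together on the slave (the host, account and binlog coordinates below are hypothetical):

CHANGE MASTER TO
  MASTER_HOST = '192.0.2.10',          -- hypothetical master address
  MASTER_USER = 'repl',                -- hypothetical replication account
  MASTER_PASSWORD = 'repl_pass',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 4;

START SLAVE;           -- starts the slave I/O thread and the SQL thread

SHOW SLAVE STATUS\G    -- Slave_IO_Running / Slave_SQL_Running show the two slave threads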

2. The differences between MyISAM and InnoDB in MySQL, at least 5 points.

(1) Five differences

1> InnoDB supports transactions, while MyISAM does not.

2> InnoDB supports row-level locks, while MyISAM only supports table-level locks.

3> InnoDB supports MVCC, while MyISAM does not.

4> InnoDB supports foreign keys, while MyISAM does not.

5> InnoDB does not support full-text indexing, while MyISAM does. (InnoDB added full-text index support in MySQL 5.6.)

(2) Four major features of the InnoDB engine

Insert buffer (change buffer), double write, adaptive hash index (AHI), read ahead

(3) Which is faster for select count(*), and why?

MyISAM is faster, because MyISAM internally maintains a row counter that can be read directly (this only holds for count(*) without a where clause).

3. The difference between varchar and char in MySQL and the meaning of 50 in varchar (50).

(1) the difference between varchar and char

Char is a fixed length type, and varchar is a variable length type.

(2) The meaning of 50 in varchar(50)

It means at most 50 characters. varchar(50) and varchar(200) take the same space to store 'hello', but the latter consumes more memory when sorting, because order by col uses a fixed length to compute the column length (the same is true of the MEMORY engine).

(3) The meaning of 20 in int(20)

It refers to the display width, not the storage size or value range.

It only takes effect when the ZEROFILL attribute is also specified, and the maximum display width is 255. For example, for an id column that numbers the rows, declared with display width 11 and ZEROFILL, inserting 10 rows displays 00000000001 through 00000000010; once a value has more than 11 digits, all of its digits are still shown. Without the ZEROFILL attribute that pads shorter values with leading zeros, no zeros are added.

So 20 indicates a maximum display width of 20, but the column still occupies 4 bytes of storage and the storable range is unchanged.

(4) Why is MySQL designed this way?

For most applications it makes no real difference; it only tells some client tools how many characters to use when displaying the value. int(1) and int(20) are identical in storage and calculation.
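A small sketch (the table and columns are made up) illustrating display width versus storage:

CREATE TABLE t_width_demo (
  id INT(11) ZEROFILL,    -- display width 11, padded with leading zeros when shown
  n  INT(20),             -- display width 20, still stored in 4 bytes
  s1 VARCHAR(50),
  s2 VARCHAR(200)         -- same storage as s1 for 'hello', but more sort memory
);

INSERT INTO t_width_demo VALUES (1, 1, 'hello', 'hello');

SELECT id, n FROM t_width_demo;   -- id prints as 00000000001 because of ZEROFILL; n prints as 1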

4. How are transactions and logging implemented in InnoDB?

(1) How many kinds of logs are there?

Error log: records error messages, as well as warnings and some normal informational messages.

Query log: records all requests to the database, regardless of whether they are executed correctly or not.

Slow log: set a threshold to log all SQL statements that run longer than this value in the slow log file.

Binary log: records all actions that make changes to the database.

Relay log: on a replication slave, stores the binlog events pulled from the master by the I/O thread, waiting to be replayed by the SQL thread.

Transaction log: InnoDB's redo log (and undo log), which records changes at the storage-engine level.

(2) The four transaction isolation levels

Isolation level

Read uncommitted (RU)

Read committed (RC)

Repeatable read (RR)

Serializable

(3) How transactions are implemented through the log; the deeper the better.

The transaction log is implemented through the redo log and the InnoDB storage engine's log buffer (InnoDB log buffer). When a transaction starts, its LSN (log sequence number) is recorded; while the transaction executes, transaction log records are inserted into the InnoDB log buffer; when the transaction commits, the log buffer must be written to disk (controlled by innodb_flush_log_at_trx_commit), that is, the log is written before the data. This approach is called write-ahead logging.
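A quick sketch of the variables involved in this write-ahead behaviour (the meanings below are standard MySQL semantics, not recommendations):

SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- 1 = write and fsync the redo log at every commit (safest, the default)
-- 0 = write and fsync roughly once per second
-- 2 = write at commit, fsync roughly once per second

SHOW VARIABLES LIKE 'innodb_log_buffer_size';   -- size of the InnoDB log buffer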

5. Several log entry formats of MySQL binlog and their differences

1. Statement: every SQL statement that modifies data is recorded in the binlog.

Advantages: there is no need to record the change to every row, which reduces the binlog volume, saves IO, and improves performance. (How much performance and log volume is saved compared with row format depends on the application's SQL. Normally, modifying or inserting a single record generates less log in row format than in statement format, but for conditional UPDATEs, whole-table DELETEs, ALTER TABLE and similar operations, row format generates a huge amount of log, so the choice of row format should be based on the application's actual workload and the resulting log volume and IO cost.)

Disadvantages: since only the executed statements are recorded, for those statements to run correctly on the slave, some context about the execution of each statement must also be recorded, to guarantee that every statement produces the same result on the slave as it did on the master. In addition, statement-based replication has many problems keeping the slave consistent with the master for certain specific functions (for example sleep(), last_insert_id(), and user-defined functions (UDFs) can all cause trouble).

Statements that use the following functions cannot be replicated either:

LOAD_FILE()

UUID()

USER()

FOUND_ROWS()

SYSDATE() (unless the --sysdate-is-now option is enabled at startup)

Also, INSERT ... SELECT produces more row-level locks than RBR does.

2. Row: no context about the SQL statement is recorded; only how each record was modified.

Advantages: the binlog does not need to record the context of the executed SQL statement; it only records what each record was changed to. So row-level logging clearly records the details of every modified row, and there are no cases where stored procedures, functions, or trigger calls fail to replicate correctly.

Disadvantages: every executed statement is recorded as changes to each affected row, which can produce a very large amount of log content. For example, if an UPDATE statement modifies many records, each change is written to the binlog, producing a large binlog; in particular, when a statement such as ALTER TABLE is executed, every row is changed because the table structure changes, so every record in the table ends up in the log.

3. Mixed: a mixture of the two levels above. Ordinary statement modifications use the statement format to record the binlog; for operations that the statement format cannot replicate correctly, such as certain functions, the row format is used. MySQL decides the log format according to each specific SQL statement executed, that is, it chooses between Statement and Row. Newer versions of MySQL also optimize the row-level mode: not every change is recorded in row format; for example, statement mode is used when the event is a table-structure change, while statements that modify data, such as UPDATE or DELETE, still record the changes to all affected rows.
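A small sketch of inspecting and switching the binlog format (the binlog file name is hypothetical):

SHOW VARIABLES LIKE 'binlog_format';               -- STATEMENT, ROW or MIXED
SET SESSION binlog_format = 'ROW';                 -- switch for the current session only
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 10; -- see what actually gets written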

6. What would you do if the MySQL database's CPU soars to 500%?

1. List all sessions with show processlist, watch them, and kill any that stay in the same state for many seconds.

2. Check the slow query log or the error log. (In years of practice, it is usually bad queries and massive inserts that drive CPU and I/O up. It also cannot be ruled out that the network was suddenly cut off so that the server received only half of a request, for example a query whose where clause or paging clause never arrived; that has been a painful lesson before.)
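A minimal sketch of the first step (the thread id is hypothetical):

SHOW FULL PROCESSLIST;    -- look for queries stuck in the same state for many seconds
KILL 12345;               -- kill a runaway query by the Id column of the processlist output
SHOW VARIABLES LIKE 'slow_query_log_file';   -- where the slow log lives
SHOW VARIABLES LIKE 'long_query_time';       -- the slow-log threshold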

7. Various methods of SQL optimization

(1) The meaning of the items in explain output

Select_type

Represents the type of each select clause in the query

Type

Indicates how MySQL finds the required rows in the table, also known as "access type"

Possible_keys

Indicates which index MySQL can use to find rows in the table. If there is an index on the field involved in the query, the index will be listed, but not necessarily used by the query.

Key

Displays the index actually used by MySQL in the query. If no index is used, it is displayed as NULL

Key_len

Represents the number of bytes used in the index, which can be used to calculate the length of the index used in the query

Ref

Indicates the join matching condition of the above table, that is, which columns or constants are used to find values on index columns

Extra

Contains additional information that is not suitable for display in other columns but is very important

(2) The meaning and usage scenarios of profile

It shows how long a SQL statement takes to execute, and how much CPU/memory it used, how much time was spent on System lock and Table lock during execution, and so on.
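A small sketch combining the two (the table is hypothetical; note that SHOW PROFILE is deprecated in newer MySQL versions in favour of the performance_schema):

EXPLAIN SELECT * FROM orders WHERE customer_id = 42;   -- shows type, possible_keys, key, key_len, ref, Extra

SET profiling = 1;
SELECT * FROM orders WHERE customer_id = 42;
SHOW PROFILES;                             -- lists recent queries with their Query_ID
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;    -- time breakdown for Query_ID 1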

8. Backup plan, mysqldump and the working principle of xtrabackup

(1) Backup plan

Every company is different here; just don't answer with something like "a full backup every hour".

(2) backup recovery time

This depends on the machine, especially the speed of the disks. A few figures for reference only:

20 GB: 2 minutes (mysqldump)

80 GB: 30 minutes (mysqldump)

111 GB: 30 minutes (mysqldump)

288 GB: 3 hours (xtrabackup)

3 TB: 4 hours (xtrabackup)

The logical import time is generally more than 5 times the backup time.

(3) The working principle of xtrabackup

InnoDB internally maintains a redo log file, which we can also call the transaction log file. The transaction log stores a record of every change to InnoDB table data. When InnoDB starts, it examines the data files and the transaction log and performs two steps: it applies (rolls forward) committed transactions from the log to the data files, and it rolls back data that was modified but not committed. xtrabackup relies on this mechanism: while the backup runs, it copies the InnoDB data files (which may be inconsistent at that moment) and at the same time records the redo log generated during the copy; the prepare phase later applies that log to make the backup consistent.

9. For the SQL backed up by mysqldump, what if I want each row to be a separate insert ... values() statement on its own line in the SQL file? And what if the backup needs to carry the master's replication point information?

--skip-extended-insert (and, for the replication point, add --master-data so the dump records the master's binlog file and position)

[root@helei-zhuanshu ~]# mysqldump -uroot -p helei --skip-extended-insert
Enter password:
  ...
  KEY `idx_c1` (`c1`),
  KEY `idx_c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=51 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `helei`
--

LOCK TABLES `helei` WRITE;
/*!40000 ALTER TABLE `helei` DISABLE KEYS */;
INSERT INTO `helei` VALUES (1, ...);
INSERT INTO `helei` VALUES (2, ...);
INSERT INTO `helei` VALUES (3, ...);
...
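For the second half of the question, a rough sketch of also carrying the master's replication point in the dump (the host and database name reuse the example above; both flags are standard mysqldump options):

[root@helei-zhuanshu ~]# mysqldump -uroot -p --master-data=2 --single-transaction helei > helei.sql
# --master-data=2 writes the binlog file and position as a commented-out CHANGE MASTER TO line near the top of the dump
# --single-transaction keeps the dump consistent for InnoDB tables without locking them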

10. With 500 DBs, how do you restart them all in the shortest time?

Use batch automation tools such as Puppet or dsh.

11. Tuning InnoDB read and write parameters

(1) Read parameters

Global buffer pool and local buffer

(2) Write parameters

Innodb_flush_log_at_trx_commit

Innodb_buffer_pool_size

(3) IO-related parameters

Innodb_write_io_threads = 8

Innodb_read_io_threads = 8

Innodb_thread_concurrency = 0
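A minimal my.cnf sketch of the parameters above (the values are illustrative starting points, not recommendations):

[mysqld]
innodb_buffer_pool_size = 4G            # read: the global buffer pool caches data and index pages
innodb_flush_log_at_trx_commit = 1      # write: flush and sync the redo log at every commit
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 0           # 0 lets InnoDB decide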

(4) Query cache parameters and applicable scenarios

query_cache / query_cache_type

Not all tables are suitable for the query cache. The main cause of query cache invalidation is that the underlying table has changed.

First: when reads dominate, look at the read/write ratio. Simply put, for a user list table, or data whose content is relatively fixed, such as a product list, the query cache can be turned on, as long as those tables are queried heavily and written to rarely.

Second: when we want to "cheat" a little, for example in a flash-sale or bidding scenario, turning on the query cache can still deliver a surge in QPS (with the front-end connection pool configured the same). In most cases, if writes are heavy and reads are few, don't turn it on. For example, on a social site where 10% of users generate content and the other 90% consume it, turning it on works well; but for something like QQ messages or chat, it would be fatal.

Third: for small sites or sites without high concurrency it does not matter much. Under high concurrency you will see a lot of waiting on qcache locks, so in high-concurrency scenarios the query cache is generally not recommended.
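A small sketch of inspecting the query cache (note that the query cache was removed entirely in MySQL 8.0, so this only applies to older versions):

SHOW VARIABLES LIKE 'query_cache_type';   -- OFF / ON / DEMAND
SHOW VARIABLES LIKE 'query_cache_size';
SHOW STATUS LIKE 'Qcache%';               -- Qcache_hits, Qcache_inserts, Qcache_lowmem_prunes, etc.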

12. How do you monitor your database? How do you query your slow logs?

There are many monitoring tools, such as Zabbix and Lepus; I use Lepus here.

13. Have you ever done a master-slave consistency check? If so, how did you do it? If not, how would you do it?

There are many tools for master-slave consistency verification, such as checksum, mysqldiff, pt-table-checksum, etc.
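A rough sketch of using pt-table-checksum from the Percona Toolkit, assuming it is installed; the host, user, password and schema names below are made up for illustration:

pt-table-checksum --replicate=percona.checksums --databases=helei h=master-host,u=checksum_user,p=secret
# writes per-chunk checksums on the master into percona.checksums; the statements replicate
# to the slaves, so comparing the slave-side checksums reveals inconsistent tables
pt-table-sync --replicate=percona.checksums h=master-host,u=checksum_user,p=secret --print
# prints the statements that would repair any differences found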

14. Does your database support emoji? If not, how do you fix it?

If the character set is utf8, you need to upgrade it to utf8mb4 to support emoji.
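A rough sketch of the upgrade (the database, table and collation names are illustrative):

ALTER DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE messages CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- the client connection must also use utf8mb4 (character_set_client / connection / results)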

15. How do you maintain the database's data dictionary?

Everyone maintains it differently. I usually put comments directly on the tables and columns in the production library, and use a tool to export them to Excel for easy circulation.

16. There is a large field X in the table (for example, of text type), and X is not updated frequently; it is mostly read. Would you split it into a separate table, or keep it where it is, and why?

Problems caused by splitting: extra join/connection cost + extra storage for the split table. Problems that not splitting may cause: query performance.

1. If the space cost of splitting can be tolerated, it is best to place the split table physically together with (partitioned alongside) the primary key of the frequently queried table, so IO stays sequential and join cost is reduced; finally, a full-text index on the text column helps offset the join cost as much as possible.

2. If the query performance loss of not splitting can be tolerated: the solution above will certainly run into problems under certain extreme conditions, so in that case not splitting is the better choice.

17. What does the row lock of the InnoDB engine in MySQL actually lock (how is it implemented)? Why is it done this way?

InnoDB implements row locks by locking index entries.

Example: select * from tab_with_index where id = 1 for update;

for update acquires row locks according to the condition, provided id is an indexed column.

If id is not indexed, InnoDB has to lock the whole table (effectively a table lock), and concurrency is lost.

18. Open-ended question, said to be from Tencent:

Table a has 600 million rows and table b has 300 million rows; they are related through the foreign key tid. How do you quickly query the 200 matching records from position 50000 to 50200?

1. If table a's tid is auto-incrementing and contiguous, and table b's id is indexed:

select * from a, b where a.tid = b.id and a.tid > 50000 limit 200;

2. If table a's tid is not contiguous, then a covering index is needed: tid must be either the primary key or a secondary index, and table b's id must also be indexed:

select * from b, (select tid from a limit 50000, 200) a where b.id = a.tid;

19. What is a stored procedure? What are its advantages and disadvantages?

Stored procedures are precompiled SQL statements.

1. A more intuitive understanding: a stored procedure is a code block made up of a number of T-SQL statements. Like a method, that code implements some piece of functionality (queries and modifications against one or more tables); the code block is given a name, and you simply call that name whenever you need the functionality.

2. A stored procedure is a precompiled code block, so it executes efficiently. Replacing a large batch of T-SQL statements with a single stored procedure call reduces network traffic, improves communication efficiency, and helps guarantee data security to a certain extent.
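A minimal MySQL sketch (the table and procedure names are invented for illustration):

DELIMITER //
CREATE PROCEDURE get_user_orders(IN p_user_id INT)
BEGIN
  -- the logic lives on the server and is called by name
  SELECT id, amount, created_at FROM orders WHERE user_id = p_user_id;
END //
DELIMITER ;

CALL get_user_orders(42);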

20. What is an index? What are its functions, advantages and disadvantages?

1. An index is a structure that sorts the values of one or more columns in a database table, and is a data structure that helps MySQL obtain data efficiently.

2. An index is a way to speed up the retrieval of data in a table. The index of a database is similar to the index of a book. In books, the index allows users to quickly find the information they need without flipping through the whole book. In a database, indexes also allow database programs to quickly find data in tables without having to scan the entire database.

Several basic index types of MySQL database: general index, unique index, primary key index, full-text index

1. The index speeds up the retrieval of the database.

2. The index reduces the speed of maintenance tasks such as insertion, deletion, modification, etc.

3. A unique index ensures the uniqueness of each row of data.

4. By using indexes, the query optimizer can be used during query processing to improve system performance.

5. An index occupies physical storage space.
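A short sketch of the basic index types listed above (the table and columns are hypothetical; the InnoDB full-text index requires MySQL 5.6 or later):

CREATE TABLE articles (
  id     INT AUTO_INCREMENT PRIMARY KEY,   -- primary key index
  slug   VARCHAR(100),
  author VARCHAR(50),
  body   TEXT,
  UNIQUE KEY uk_slug (slug),               -- unique index
  KEY idx_author (author),                 -- normal (secondary) index
  FULLTEXT KEY ft_body (body)              -- full-text index
) ENGINE=InnoDB;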

21. What is a transaction?

A transaction is the basic unit of concurrency control. A transaction is a sequence of operations that are either all performed or none performed; it is an indivisible unit of work. A transaction is also the unit by which the database maintains data consistency: consistency is preserved at the end of each transaction.

22. What are optimistic locks and pessimistic locks in a database?

The task of concurrency control in a database management system (DBMS) is to ensure that when multiple transactions access the same data at the same time, neither the isolation and consistency of the transactions nor the consistency of the database is violated. Optimistic concurrency control (optimistic locking) and pessimistic concurrency control (pessimistic locking) are the main techniques used for concurrency control.

Pessimistic locking: assume concurrency conflicts will happen, and block every operation that might violate data integrity.

Optimistic locking: assume there are no concurrency conflicts, and only check for data integrity violations when the operation is committed.
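A common sketch of the two approaches (the table, column and version value are hypothetical):

-- Pessimistic: take the lock up front.
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- Optimistic: no lock; detect conflicts at write time via a version column.
UPDATE accounts
SET balance = balance - 100, version = version + 1
WHERE id = 1 AND version = 7;   -- 7 is the version read earlier
-- if 0 rows were affected, another transaction got there first: re-read and retry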

Does using an index for a query necessarily improve query performance? Why?

In general, querying data through an index is faster than a full table scan. But we must also pay attention to its cost:

1. An index needs space to store and requires regular maintenance: whenever a record is added to or removed from the table, or an indexed column is modified, the index itself must be updated. This means every INSERT, DELETE and UPDATE of a record pays an extra 4 to 5 disk I/Os. Because indexes need additional storage space and processing, unnecessary indexes actually slow down query response time. Using an index does not always improve query performance; an index range scan (INDEX RANGE SCAN) is appropriate in two situations:

2. Range-based retrieval, where the query generally returns a result set smaller than 30% of the rows in the table.

3. Retrieval based on a non-unique index.

23. Briefly describe the differences between drop, delete and truncate.

Drop, delete and truncate in SQL all indicate deletion, but there are some differences among them.

1. Delete and truncate only delete the data of the table, not the structure of the table.

2. Speed, generally speaking: drop > truncate > delete

3. The delete statement is DML; the operation only takes effect when the transaction is committed, and the deleted rows are written to the rollback segment so the operation can be rolled back.

4. If there is a corresponding trigger, delete fires it during execution. Truncate and drop are DDL: the operation takes effect immediately, the original data is not written to the rollback segment and cannot be rolled back, and no triggers are fired.

24. In what scenarios are drop, delete and truncate used respectively?

1. When you no longer need a table, use drop

2. When you want to delete some rows of data, use delete with the where clause

3. Use truncate when you want to keep the table but delete all of its data.
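A small sketch of the three on a hypothetical table:

DELETE FROM logs WHERE created_at < '2016-01-01';  -- DML: row by row, transactional, fires triggers
TRUNCATE TABLE logs;                               -- DDL: removes all rows, keeps the table, no rollback
DROP TABLE logs;                                   -- DDL: removes the data and the table definition itself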

25. What are superkeys, candidate keys, primary keys and foreign keys respectively?

1. Superkey: in a relational schema, a set of attributes that uniquely identifies a tuple in the relation is called a superkey. A single attribute can serve as a superkey, and so can a combination of attributes. Superkeys include the candidate keys and the primary key.

2. Candidate key: a minimal superkey, that is, a superkey with no redundant attributes.

3. Primary key: a column or combination of columns in a database table that uniquely and completely identifies the stored data object. A table can have only one primary key, and the primary key value cannot be missing, that is, it cannot be null (Null).

4. Foreign key: a column in one table that holds the primary key of another table is called a foreign key of this table.

26. What is a view? And what are its usage scenarios?

1. A view is a virtual table that serves the same purpose as a physical table. A view can be queried, and in some cases inserted into or updated; it is usually a subset of rows or columns from one or more tables. Changing the view's definition does not affect the underlying tables. It makes getting data easier for us than writing multi-table queries.

2. When only some of the fields should be exposed to visitors, create a virtual table, that is, a view.

3. When the data being queried comes from different tables and the querier wants to query it in a unified way, a view can be created that combines the query results of multiple tables; the querier then just reads from the view and does not have to deal with the differences that arise from the data coming from different tables.
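A minimal sketch of both scenarios (the tables and columns are hypothetical):

CREATE VIEW v_public_users AS
SELECT id, nickname FROM users;                    -- exposes only some fields, hides email, password hash, etc.

CREATE VIEW v_all_orders AS
SELECT id, amount, 'online'  AS source FROM online_orders
UNION ALL
SELECT id, amount, 'offline' AS source FROM offline_orders;

SELECT * FROM v_all_orders WHERE amount > 100;     -- the querier reads one view, not several tables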

27. Talk about the three normal forms.

The first normal form (1NF): every field in a database table is atomic and cannot be divided further. These atomic attributes have basic types: integer, real, character, logical, date, and so on.

The second normal form (2NF): no non-key field in the table has a partial functional dependency on any candidate key (a partial functional dependency means that some subset of a composite key is already enough to determine the non-key field); in other words, every non-key field is fully functionally dependent on every candidate key.

The third normal form (3NF): on top of the second normal form, a table is in 3NF if no non-key field has a transitive functional dependency on any candidate key. A transitive functional dependency means that if there is a chain of determination "A → B → C", then C transitively depends on A. Therefore, a table in third normal form should not contain dependencies like: key field → non-key field x → non-key field y.
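A tiny illustration (hypothetical columns) of the transitive dependency that 3NF forbids:

-- Violates 3NF: order_id → customer_id → customer_city
CREATE TABLE orders_bad (
  order_id      INT PRIMARY KEY,
  customer_id   INT,
  customer_city VARCHAR(50)      -- depends on customer_id, not directly on the key
);

-- In 3NF: move the transitively dependent column into its own table
CREATE TABLE orders    (order_id INT PRIMARY KEY, customer_id INT);
CREATE TABLE customers (customer_id INT PRIMARY KEY, customer_city VARCHAR(50));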

