In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
1. Common bottlenecks
1) SQL is inefficient
(2) Improper configuration of options
(3) Interview questions soar
(4) Low hardware performance
(5) Other processes compete for resources
2. How to confirm that MySQL has a bottleneck
top/free/vmstat/sar/mpstat Confirm
- Confirm CPU consumption ratio of mysqld process
- Confirm if CPU consumption of mysqld process is %user or %sys high
- Confirm if physical memory is insufficient
- Confirm if swap occurs
- Confirm if there are a lot of interrupts (or uneven interrupts) on the CPU
2.1 View CPU-top
2.2 View Memory-Free
free -m
Free related commands
2.3, View IO, CPU, Memory, Swap Partition, Interrupt-vmstat
vmstat -S m 1
As can be seen from the above, CPU and I/O pressure is not small.
Output Description:
Interpretation of the vmstat command:
1)procs
r This column shows how many processes are waiting on the CPU
Column b shows how many processes are uninterruptedly dormant (usually meaning they are waiting for I/O, such as disk, network, user input, and so on).
2)memory
swpd The size of virtual memory used (showing how many blocks have been swapped out to disk (page swap)). If it is greater than 0, it means that your machine is running low on physical memory. If it is not the cause of the program memory leak, then you should upgrade the memory or migrate the memory-consuming task to another machine.
free Size of free physical memory
buff How many blocks are being used as buffers
cache How much is being used as a cache for the operating system
swap shows page swap activity: how many blocks are being swapped in (from disk) and out (to disk) per second
si reads the size of virtual memory from disk per second. If this value is greater than 0, it means that physical memory is not enough or memory leaks. It is necessary to find a memory-consuming process to solve it.
so The size of virtual memory written to disk per second. If this value is greater than 0, the same as above.
In general, the values of si and so are 0. If the values of si and so are not 0 for a long time, it means that the system memory is insufficient, and it is necessary to consider whether to increase the system memory.
4) IO shows how many blocks are read (bi) and written (bo) from the block device
bi The number of blocks received per second by a block device. Block devices here refer to all disks and other block devices on the system. The default block size is 1024 bytes. I don't have any IO operations on my own machine, so it's always 0. However, I've seen it on machines that handle large amounts of copied data (2-3T). The disk write speed is almost 140M per second.
bo The number of blocks a device sends per second, e.g. we read a file, bo must be greater than 0. bi and bo should be close to 0, otherwise IO is too frequent and needs to be adjusted.
The bi+bo reference value set here is 1000. If it exceeds 1000 and the wa value is relatively large, it indicates the system disk IO performance bottleneck.
5) system shows the number of interrupts (in) and context switches (cs) per second
in Number of CPU interrupts per second, including time interrupts
cs Number of context switches per second, for example, we call system functions, context switches, thread switches, process context switches, this value should be as small as possible, too large, to consider reducing the number of threads or processes. System calls are also, every time we call system functions, our code will enter the kernel space, resulting in context switching, this is very resource-intensive, but also try to avoid frequent calls to system functions. Too many context switches means that most of your CPU is wasted on context switches, resulting in less time for CPU to do serious work, CPU is not fully utilized, it is not desirable.
The higher these two values are, the more CPU time you see consumed by the kernel.
6)CPU
us user CPU time. When the value of us is relatively high, it means that the user process consumes more CPU time, but if it is used for more than 50% for a long time, then we should consider optimizing the program algorithm or other measures.
sy system CPU time, if too high, means that the system call time is long, such as IO operations frequently.
If sys is too high, it means that the system kernel consumes too many cpu resources. This is not a benign performance. We should check the reason.
id idle CPU time, in general, id + us + sy = 100, generally I think id is idle CPU usage, us is user CPU usage, sy is system CPU usage.
wa Wait IO CPU time.
When Wa is too high, it indicates that io wait is serious, which may be caused by a large number of random accesses to the disk, or it may be a bottleneck in the bandwidth of the disk.
st column is generally not concerned, percentage of time virtual machines consume
2.4 View CPU and IO-sar
view CPU
sar -u 1
Output Item Description:
CPU: all indicates that the statistics are the average of all CPUs.
%user: Displays the percentage of total CPU time running at the user level (application).
%nice: Percentage of CPU time consumed in user mode by processes whose scheduling priorities were changed by nice
%system: Percentage of total CPU time spent running at kernel level.
%iowait: Displays the percentage of total CPU time spent waiting for I/O operations.
%steal: Percentage of hypervisor waiting for virtual CPU to service another virtual process.
%idle: Displays the percentage of CPU idle time consumed by the total CPU time.
1. If %iowait is too high, the hard disk has an I/O bottleneck
2. If the %idle value is high but the system response is slow, it is possible that the CPU is waiting to allocate memory, and the memory capacity should be increased.
3. If the %idle value is consistently lower than 1, the CPU processing power of the system is relatively low, indicating that the CPU is the resource that needs to be solved most in the system.
View IO status
tps: Number of I/Os per second from physical disks. Multiple logical requests are combined into a single I/O disk request, the size of a single transfer is indeterminate.
rd_sec/s: Number of sector reads per second.
wr_sec/s: Number of sectors written per second.
avgrq-sz: Average data size (sectors) per device I/O operation.
avgqu-sz: Average length of disk request queue.
await: Average elapsed time per request from request to disk operation to system completion, including queue wait time, in milliseconds (1 second =1000 milliseconds).
svctm: The average time the system takes to process each request, excluding time spent in the request queue.
%util:I/O requests as a percentage of CPU, the higher the ratio, the more saturated.
1. Low avgqu-sz values result in higher utilization of equipment.
2. When the %util value is close to 1%, the device bandwidth is full.
2.5 View interruptions
mpstat -P ALL -I SUM 1 100
3.1 Show which threads are running
show processlist; or show full processlist
Status 1: Sending data
mysql> show processlist\G
As can be seen from the above, it is a long-term sending data
Sending data: Indicates the status of data read from the engine layer and returned to the Server side
Reasons for long-term existence:
(1) No proper index, low query efficiency
(2) Read a lot of data, read slowly
(3) High system load, slow reading
Solution:
(1) Add appropriate index
(2) Or rewrite SQL to improve efficiency
(3) Increase LIMIT to limit the amount of data read at a time
(4) Check & upgrade I/O device performance
Waiting for table metadata lock
show processlist; or show full processlist
As can be seen from the above: long wait for MDL lock
Reason:
(1) DDL is blocked, which in turn blocks his subsequent SQL
(2) SQL before DDL is not finished for a long time
Solution:
(1) Improve the efficiency of each SQL
(2) Kill long-running SQL
(3) Place DDL at low points such as midnight
(4) Execute DDL with pt-osc
State 3: Sleep
As can be seen from the above: Sleep
It looks harmless, but it could be a big pest.
(1) Number of connections occupied
(2) Memory consumption is not released
(3) There may be row locks (or even table locks) that are not released
Solution:
(1) Reduce timeout appropriately
(2) Active Kill timeout inactive connection
(3) Regularly check lock and lock waiting
(4) The pt-kill tool can be used
State 4: Other States
(1)Status: Copy to tmp table
Reason:
1) Executing alter table to modify table structure, temporary table needs to be generated
2) It is recommended to perform at night trough, or use pt-osc
(2)Copying to tmp table [on disk]
Creating tmp table
Common in cases where group by has no index
Need to copy data to temporary table [memory/disk]
The Using temporary keyword appears in the execution plan
It is recommended to create appropriate indexes and eliminate temporary tables
(3) Creating sort index
Common in case of order by without index
Filesort sorting is required
The Using filesort keyword appears in the execution plan
It is recommended that you create a sort index
(4) Other states
Waiting for global read lock
Waiting for query cache lock
Waiting for table level lock
Waiting for table metadata lock
3.2 Check lock- mysql lock troubleshooting process
mysql> select * from information_schema.innodb_trx;
mysql> select * from information_schema.innodb_locks;
View lock waiting
mysql> select * from information_schema.innodb_lock_waits;
mysql> select * from sys.innodb_lock_waits;
mysql lock checking process
1) Check the current lock waiting situation
INNODB_TRX lock situation:
mysql> SELECT * FROM INNODB_TRX\G;
2) Look at the correlation between lock waiting and lock holding
mysql> SELECT * FROM INNODB_LOCK_WAITS\G;
3) See why the lock is waiting
mysql> SELECT * FROM INNODB_LOCKS\G;
3.3 Check the status of Innodb.
show engine innodb status\G
View MySQL thread status
3.4 View slow logs 4. How to prevent 4.1. Before business goes online
(1) Eliminate garbage SQL in advance,
(2) In development or pressure testing environments
Lower the value of long_query_time, even to 0
Open log_queries_not_using_indexes
Analyze slow query logs and eliminate potential SQL hazards
4.2 with better facilities.
(1) CPU is faster and more cores
(2) Memory is faster and larger
(3) Use faster I/O devices
(4) Use better network equipment
4.3 Disk File System and Scheduling Algorithm
(1)xfs/ext4 file system
(2) Use noop/deading io scheduler
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.