In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Ref: http://www.javaeye.com/topic/144033
Author: andyao
Original link: http://andyao.javaeye.com/admin/show/144033
Reprint, please leave your name.
1. Brief introduction
In the Web application architecture, the data persistence layer (usually a relational database) is the key core part, which has a very important impact on the performance of the system. MySQL is currently the most widely used open source database, but the default performance of the MySQL database is very poor, just a toy database. Therefore, the use of MySQL database in the product must be optimized as necessary.
Optimization is a complex task. This paper describes MySQL-related database design and query optimization, server-side optimization, storage engine optimization.
two。 Database design and query optimization
In MySQL Server performance tuning, the first consideration is Database Schema design, which is very important. A poor Schema design will show poor performance even if it runs on a performance-tuned MySQL Server; like Schema, the design of query statements can affect the performance of MySQL, and you should avoid writing inefficient SQL queries. This section will discuss the optimization of these two aspects in detail.
2.1 Schema Design
The optimization of Schema depends on what kind of query will be run, and different query will have different Schema optimization schemes. Section 2.2 describes the optimization of Query Design. Schema design is also affected by the expected dataset size. The main considerations of Schema design: standardization, data type, index.
2.1.1 Standardization
Standardization is the process of organizing data in a database. These include creating tables based on design rules and establishing relationships between them, which can protect data and improve data flexibility at the same time by eliminating redundancy and inconsistent dependencies. Usually database standardization is to make the database design conform to a certain level of paradigm, usually to meet the third paradigm. There are also the fourth normal form (also known as Boyce Codd normal form, BCNF) and the fifth normal form, but it is rarely considered in the actual design. Ignoring these rules may make the design of the database imperfect, but this should not affect functionality.
Characteristics of standardization:
1) all "objects" are in its own table and have no redundancy.
2) the database is usually generated by Emurr graph.
3) succinct, updating attributes usually requires only a few records to be updated.
4) Join operation is time-consuming.
5) there are few measures to optimize Select,sort.
6) suitable for OLTP applications.
Non-standardized features:
1) storing a lot of data in a table, the data is redundant.
2) updating data is very expensive, updating an attribute may update many tables and records.
3) it is possible to lose data when deleting data.
4) Select,order has many optimization options.
5) suitable for DSS applications.
Both standardized and non-standardized have their own advantages and disadvantages, usually they can be mixed in a database design, some tables are standardized, and some tables retain some redundant data:
1) use standardization for OLTP and non-standardization for DSS
2) use materialized views. MySQL does not directly support this database feature, but you can use MyISAM tables instead.
3) redundant data in the table, such as storing ref_id and name in the same table. But pay attention to the update problem.
4) for some simple objects, use value directly as a build. Such as IP address, etc.
5) Reference by PRIMARY/UNIQUE KEY. MySQL can optimize this operation, such as:
Java code select city_namefrom city,statewhere state_id=state.id and state.code='CA' "converted to" select city_namefrom city where state_id=12
2.1.2 data types
One of the most basic optimizations is to make the table occupy as little space on disk as possible. This can lead to a significant performance improvement because the data is small, the disk reads faster, and the table contents are processed with less memory during the query process. At the same time, indexing on smaller columns takes up less resources.
You can use the following techniques to improve table performance and minimize storage space:
1) use the correct and appropriate type, and do not store numbers as strings.
2) use the most efficient (smallest) data types as possible. MySQL has many specialization types that save disk space and memory.
3) use smaller integer types to make the table smaller as much as possible. For example, MEDIUMINT is often better than INT because the MEDIUMINT column uses 25% less space.
4) if possible, declare as NOT NULL. It makes everything faster and saves one person per column. Note that if you do need NULL in your application, you should no doubt use it, but avoid having it on all columns by default.
5) for the MyISAM table, if you do not have any variable length columns (VARCHAR, TEXT, or BLOB columns), use a fixed size record format. This is faster, but unfortunately it may waste some space. Even if you have asked VARCHARcolumn row _ FORMAT=fixed with the CREATE option, you can also be prompted to use fixed-length lines.
6) use sample character set, such as latin1. Use utf-8 as little as possible, because utf-8 takes up three times as much space as latin1. You can use latin1 on fields that do not need to use utf-8, such as mail,url, and so on.
2.1.3 Index
All MySQL column types can be indexed. Using indexes on related columns is the best way to improve the performance of SELECT operations. You should pay attention to the following points when using an index:
1) MySQL will only use prefixes, such as key (a, b)... Where bread5 will not use the index.
2) use the index selectively. It is not good to use indexes on columns that have little change, such as gender columns.
3) define Unique index on the Unique column.
4) avoid building indexes that are not used.
5) in Btree index (InnoDB uses Btree), you can index the columns that need to be sorted.
6) avoid duplicate indexes.
7) avoid building an index on the prefix of an existing index. For example, remove index (a) if index (a) exists.
8) controls the length of a single index. Use key (name (8)) to index the first few characters of the data.
9) the shorter the key value, the better, it is best to use integer.
10) to use the index in the query (using explain to view), you can reduce the number of times to read the disk and speed up the reading of data.
11) similar bond values are better than random ones. Auto_increment is better than uuid.
12) Optimize table can compress and sort index, so be careful not to run it frequently.
13) Analyze table can update the data.
2.2 Designing queries
Query optimization is a Case by case problem, different sql has different optimization solutions, here I only list some general techniques.
1) in the case of index, try to make sure that the query uses the correct index. You can use EXPLAIN select... View the results and analyze the query.
2) use matching types when querying. For example, select * from a where id=5, if id is a character type and has index, this query will not use index and will do a full table scan, which will be very slow. The right thing should be... Where id= "5", plus quotation marks, indicates that the type is a character.
3) use-- log-slow-queries-long-query-time=2 to view statements with slow queries. Then use explain to analyze the query and optimize it.
3. Server-side Optimization 3.1 MySQL installation
There are many distributions of MySQL, and it is best to use the binary version released by MySQL AB. You can also download the source code for compilation and installation, but some bug of compilers and class libraries may make the compiled MySQL potentially problematic.
If the server where MySQL is installed is using an Intel processor, you can use the version compiled by intel C++. It is mentioned in a PPT article in Linux World2005 that MySQL queries compiled with the intel C++ compiler are about 30% faster than the normal version. The compiled version of Intel C++ can be downloaded from the official website of MySQL.
3.2 Optimization of server settings
The default settings of MySQL perform poorly, so you need to adjust some parameters. This section introduces some general parameter tuning and does not involve specific storage engines (mainly MyISAM,InnoDB, related optimizations are described in 4).
-- character-set: if it is a single language, use a simple character set such as latin1. Use as little Utf-8,utf-8 as possible and take up more space.
-- memlock: locking MySQL can only run in memory, avoiding swapping, but errors may occur if there is not enough memory.
-- max_allowed_packet: be large enough to accommodate large SQL queries with little impact on performance, mainly to avoid packet errors.
-- the maximum connection allowed by max_connections:server. If it is too big, there will be out of memory.
-the number of table that table_cache:MySQL keeps open at the same time. It is expensive to open table. It is generally set to 512.
-- query_cache_size: the amount of memory used to cache queries.
-- the root directory where datadir:mysql stores data and separate installation files on different disks can improve performance a little bit.
4. Storage engine optimization
MySQL supports different storage engines, mainly MyISAM and InnoDB.
4.1 MyISAM
MyISAM manages non-transactional tables. It provides high-speed storage and retrieval, as well as full-text search capabilities. MyISAM is supported in all MySQL configurations and is the default storage engine unless MySQL is configured to use another engine by default.
4.1.1 MyISAM featur
4.1.1.1 MyISAM Properties
1) do not support transactions, the opportunity to destroy the table
2) use less memory and disk space
3) based on table locks, there will be serious performance problems when updating data concurrently.
4) MySQL caches only Index, and data is cached by OS
4.1.1.2 Typical MyISAM usages
1) Log system
2) applications that are read-only or mostly read-only
3) full table scan
4) batch import data
5) low concurrency read / write without transaction
4.1.2 key points of MyISAM optimization
1) declare it as NOT NULL, which can reduce disk storage.
2) use optimize table to defragment and reclaim free space. Note that it only runs after a very large data change.
3) the use of index is prohibited when Deleting/updating/adding has a large amount of data. Use ALTER TABLE t DISABLE KEYS.
4) setting myisam_max_ [requests] _ sort_file_size is large enough to significantly improve the speed of repair table.
4.1.3 MyISAM Table Locks
1) avoid concurrent insert,update.
2) insert delayed can be used, but it is possible to lose data.
3) optimize the query statement.
4) horizontal zoning.
5) Vertical partition.
6) if nothing works, use InnoDB.
4.1.4 MyISAM Key Cache
1) set key_buffer_size variable. The primary cache setting of MyISAN, which is used to cache index data for MyISAM tables, and this parameter only affects MyISAM. Typically set a memory size of 25-33% in a Server that uses only MyISAM.
2) you can use several different Key Caches (for some hot data).
A) SET GLOBAL test.key_buffer_size=512*1024
B) CACHE INDEX t1.i1, t2.i1, T3 IN test;2) Preload index to Cache can improve query speed. Because preloading index is sequential, it is very fast.
A) LOAD INDEX INTO CACHE T1, T2 IGNORE LEAVES
InnoDBInnoDB provides MySQL with a transaction secure (ACID compatible) storage engine with commit, rollback and crash recovery capabilities. InnoDB provides row level lock, and also provides an Oracle-consistent unlocked read in the SELECT statement. These features increase multi-user deployment and performance. There is no need to expand locking in InnoDB, because row level lock is suitable for very small space in InnoDB. InnoDB also supports FOREIGN KEY constraints. In SQL queries, you are free to mix InnoDB tables with other MySQL table types, even in the same query.
InnoDB is designed for maximum performance when dealing with large amounts of data. Its CPU usage efficiency is very high.
The InnoDB storage engine is fully integrated with the MySQL server, and the InnoDB storage engine maintains its own buffer pool to cache data and indexes in memory. InnoDB stores its tables & indexes in one tablespace, which can contain several files (or raw disk partitions). This is different from the MyISAM table, such as where each table in the MyISAM table is stored in a separate file. The InnoDB table can be of any size, even on operating systems where the file size is limited to 2GB.
The InnoDB engine is used on many large database sites that require high performance. The famous Internet news site Slashdot.org runs on InnoDB. Mytrix, Inc. More than 1TB data is stored on InnoDB, and there are other sites that handle an average of 800 inserts / updates per second on InnoDB.
4.2.1 InnoDB featur
4.2.1.1 InnoDB Properties
1) support transactions, ACID, and foreign keys.
2) Row level locks.
3) different isolation levels are supported.
4) more memory and disk space are required than MyISAM.
5) there is no key compression.
6) data and indexes are cached in the in-memory hash table.
4.2.1.2 InnoDB Good For
1) applications that require transactions.
2) applications with high concurrency.
3) automatic recovery.
4) faster primary key-based operations.
4.2.2 key points of InnoDB optimization
1) try to use the primary key of short,integer.
2) Load/Insert data in primary key order. If the data is not sorted by the primary key, sort it first and then perform database operations.
3) when setting SET UNIQUE_CHECKS=0,SET FOREIGN_KEY_CHECKS=0 for Load data, the overhead of foreign key and uniqueness constraint checking can be avoided.
4) use prefix keys. Because InnoDB does not have key compression.
4.2.3 InnoDB server-side settings
Innodb_buffer_pool_size: this is the most important setting for InnoDB and has a decisive impact on InnoDB performance. The default setting is only 8m, so InnoDB performance is poor under the default database settings. On a database server with only the InnoDB storage engine, you can set up 60-80% memory. To be more precise, set the memory size by 10% larger than InnoDB tablespaces if the memory capacity allows.
Innodb_data_file_path: specifies the space for table data and index storage, which can be one or more files. The last data file must be automatically expanded, and only the last file allows automatic expansion. In this way, when the space is used up, the auto-extension data file will automatically grow (in 8MB) to accommodate the additional data. For example: innodb_data_file_path=/disk1/ibdata1:900M;/disk2/ibdata2:50M:autoextend two data files on different disks. The data is first placed in ibdata1, and when it reaches 900m, the data is placed in ibdata2. Once 50 MB of 8MB is reached, ibdata 2 will automatically grow in units. If the disk is full, you need to add a data file to the other disk. Innodb_autoextend_increment: the default is 8 M, if the amount of data in insert is large, it can be increased appropriately.
Innodb_data_home_dir: the directory where tablespace data is placed, by default in the mysql data directory, and setting to a different partition from the MySQL installation file can improve performance.
Innodb_log_file_size: this parameter determines the recovery speed. If the recovery is too large, it will be slow, and if it is too small, it will affect the query performance. Generally, 256m can strike a balance between performance and recovery speed.
.
Innodb_log_buffer_size: disk speed is very slow. Writing log directly to disk will affect the performance of InnoDB. This parameter sets the size of log buffer, usually 4m. If you have a large blob operation, you can increase it appropriately.
Innodb_flush_logs_at_trx_commit=2: this parameter sets the processing of log information in memory when the transaction is committed.
1) = 1, when each transaction commits, the log buffer is written to the log file, and the log file is refreshed on disk. Truly ACID . Slow.
2) at 2, when each transaction commits, the log buffer is written to the file, but the log file is not refreshed to disk. Only when the operating system crashes or power goes down will the last second transaction be deleted, otherwise the transaction will not be lost.
3) when 0, the log buffer is written to the log file once a second, and the log file is refreshed on disk. The crash of any mysqld process deletes the last second transaction innodb_file_per_table before the crash: each InnoDB table and its index can be stored in its own file.
Transaction-isolation=READ-COMITTED: if the application can run at the READ-COMMITED isolation level, making this setting will result in a performance improvement.
Innodb_flush_method: sets the way in which InnoDB synchronizes IO:
1) Default-use fsync ().
2) O_SYNC opens files in sync mode, which is usually slow.
3) O_DIRECT, using Direct IO on Linux. Can significantly improve speed, especially on RAID systems. Avoid additional data replication and double buffering (mysql buffering and OS buffering). Innodb_thread_concurrency: the maximum number of threads in InnoDB kernel.
1) set to at least (num_disks+num_cpus) * 2.
2) this restriction can be prohibited by setting it to 1000
5. Caching
There are many kinds of caching, and adding appropriate caching strategies to the application can significantly improve the performance of the application. As the application of caching is a relatively large topic, so this part needs further research.
6. Reference
1) http://www.mysqlperformanceblog.com/
2) Advanced MySQL Performance Optimization, Peter Zaitsev, Tobias Asplund, MySQL Users Conference 2005
3) Improving MySQL Server Performance with Intel C++ Compiler,Peter Zaitsev,Linux World 2005
4) MySQL Performance Optimization, Peter Zaitsev, Percona Ltd, OPEN SOURCE DATABASE CONFERENCE 2006
5) MySQL Server Settings Tuning, Peter Zaitsev, co-founder, Percona Ltd, 2007
6) MySQL Reference Manual
[@ more@]
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.