2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article covers some fundamentals of databases, as a way to supplement and refresh what you already know.
Hard disk
Chapter 2 of "Database System Implementation" is devoted entirely to the principles of disk storage. This is because the hard disk is the only built-in component of a computer with persistent storage, and software has to yield to hardware. Understanding the storage characteristics of the disk therefore helps us understand the logic behind the software design. Disk storage has the following characteristics:
Feature A: disk latency is enormous compared with CPU latency. "Systems Performance" gives a comparison: for a 3.3 GHz CPU, one instruction cycle takes 0.3 ns, while a single I/O on a mechanical hard disk takes 1~10 ms. How big is this gap? If one CPU cycle lasted 1 s, a single mechanical-disk I/O would last 1~12 months. By the time it finishes, the flowers have long since wilted.
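The scaling in Feature A can be checked with a few lines of arithmetic. This is a minimal sketch: the 0.3 ns cycle and the 1~10 ms disk latency come from the text above, while the function name scaled_days is only an illustrative choice.

```python
# Rescale real latencies so that one CPU cycle lasts one "human" second.
CPU_CYCLE_S = 0.3e-9        # 0.3 ns per cycle at about 3.3 GHz (figure from the text)
SECONDS_PER_DAY = 86_400


def scaled_days(latency_s: float, cycle_s: float = CPU_CYCLE_S) -> float:
    """Return how many days a latency would last if one CPU cycle took 1 second."""
    scaled_seconds = latency_s / cycle_s
    return scaled_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for label, latency in [("1 ms disk I/O", 1e-3), ("10 ms disk I/O", 10e-3)]:
        print(f"{label}: about {scaled_days(latency):.0f} days")
```

A 1 ms I/O scales to a bit more than one month and a 10 ms I/O to a bit more than twelve, which is where the "1~12 months" range comes from.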
Feature B: the disk is a block device, and every read or write moves a whole block. A block is usually 512 bytes. Since each I/O costs a full block anyway, avoid tiny writes: do not make a shipment to the United States that carries only one potato.
Feature C: sequential I/O performs much better than random I/O, because sequential I/O avoids seek time and rotational delay.
These features affect not only the design of databases but also the design of operating systems, for example the page cache.
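Features B and C can be illustrated with a rough cost model. This is only a sketch: the 512-byte block comes from the text, but the seek time, rotation speed and transfer rate below are assumed round numbers for a mechanical disk, not measurements.

```python
import math

BLOCK_SIZE = 512               # bytes per block, as stated in the text
SEEK_S = 4e-3                  # assumed average seek time
ROTATION_S = 60 / 7200 / 2     # assumed half rotation at 7200 rpm
TRANSFER_BPS = 150e6           # assumed sustained transfer rate, bytes per second


def blocks_needed(n_bytes: int) -> int:
    """A block device moves whole blocks, so round the size up."""
    return math.ceil(n_bytes / BLOCK_SIZE)


def random_io_seconds(n_blocks: int) -> float:
    """Every block pays seek + rotational delay before its transfer."""
    return n_blocks * (SEEK_S + ROTATION_S + BLOCK_SIZE / TRANSFER_BPS)


def sequential_io_seconds(n_blocks: int) -> float:
    """Only the first block pays seek + rotational delay."""
    return SEEK_S + ROTATION_S + n_blocks * BLOCK_SIZE / TRANSFER_BPS
```

Under these assumptions, writing a single byte still costs one full block, and reading 1000 blocks at random positions takes on the order of seconds while reading them sequentially takes on the order of milliseconds.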
Read and write
Database operations are essentially higher-level reads and writes: select (read) and delete/update (write) are the operations we run against a database most often. The core problem a database solves is therefore how to organize data so that reads and writes are fast.
Transactions
Concurrency cannot be ignored when pursuing high performance, and concurrent reads and writes raise the problem of data consistency. Transactions exist to solve that problem.
By default, each SQL statement is its own transaction. You can change this by explicitly marking where a transaction begins and where it commits.
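The difference between one-statement transactions and an explicit commit point can be sketched as follows. Note the substitution: this uses SQLite through Python's standard sqlite3 module rather than MySQL, simply because it runs without a server. The account table and its values are made up for illustration.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode:
# every statement is committed on its own unless we issue BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")  # one statement, one transaction


def balances(connection):
    """Return {id: balance} for every row in the account table."""
    return dict(connection.execute("SELECT id, balance FROM account"))


# Explicit transaction: both updates become visible together at COMMIT ...
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
conn.execute("COMMIT")
after_commit = balances(conn)

# ... or are undone together at ROLLBACK.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
conn.execute("ROLLBACK")
after_rollback = balances(conn)
```

After the rollback the balances are the same as after the commit, because the second transaction never reached its commit point.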
MySQL has four transaction isolation levels. There is no need to memorize them: all four can be derived from application scenarios.
Read uncommitted
Transaction A modifies record a and does not commit; transaction B reads the table and sees A's uncommitted change. This is read uncommitted. If we designed our own database and modified fields in place on the original data, this is what would happen under concurrency without any other means of control. The result is that dirty data gets read, which is why this is also called a dirty read. It can help to replace "transaction" with a more familiar concept, "thread", when thinking about this.
Read committed
To fix the dirty read above, keep a transaction's changes private to that transaction until it commits, so that uncommitted data cannot affect other transactions. This isolation level is read committed. It is also known as non-repeatable read, because the same query executed twice inside one transaction may return different results if another transaction commits in between.
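The first two levels can be contrasted with a toy in-memory store. This is a sketch for intuition only, not how MySQL is implemented; the class ToyStore and its method names are hypothetical.

```python
class ToyStore:
    """Toy key-value store that keeps each transaction's writes pending until commit."""

    def __init__(self, initial):
        self.committed = dict(initial)
        self.pending = {}            # transaction id -> {key: uncommitted value}

    def write(self, txn, key, value):
        self.pending.setdefault(txn, {})[key] = value

    def commit(self, txn):
        self.committed.update(self.pending.pop(txn, {}))

    def rollback(self, txn):
        self.pending.pop(txn, None)

    def read_uncommitted(self, txn, key):
        # Any transaction's pending write is visible: a dirty read is possible.
        for writes in self.pending.values():
            if key in writes:
                return writes[key]
        return self.committed.get(key)

    def read_committed(self, txn, key):
        # Only the reader's own pending writes and committed data are visible.
        own = self.pending.get(txn, {})
        return own[key] if key in own else self.committed.get(key)


store = ToyStore({"a": 1})
store.write("A", "a", 2)                       # transaction A modifies a, no commit yet
dirty = store.read_uncommitted("B", "a")       # B sees A's uncommitted value
clean = store.read_committed("B", "a")         # B sees only committed data
store.rollback("A")                            # A gives up; the dirty value never existed
after = store.read_uncommitted("B", "a")
```

Here B's dirty read returns a value that is later rolled back, while the read-committed path never exposes it.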
Repeatable read
The non-repeatable read problem of read committed is avoided at the repeatable read isolation level. It guarantees that when a transaction reads the same record several times, the result does not change. Changes made by the transaction itself are, of course, a different matter. The remaining problem at this level is the phantom read. It is easy to understand with two transactions: transaction A reads and finds that a record does not exist; transaction B then writes that record. When transaction A goes on to write the record itself, the write may fail.
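The behaviour described above can be imitated by giving a transaction a private snapshot taken when it starts. This is a toy sketch under that assumption; real engines do not copy the whole data set, and the class names here are hypothetical.

```python
class ToyDatabase:
    def __init__(self, initial):
        self.committed = dict(initial)


class ReadCommittedTxn:
    """Always reads the latest committed value."""

    def __init__(self, db):
        self.db = db

    def read(self, key):
        return self.db.committed.get(key)


class RepeatableReadTxn:
    """Reads from a snapshot frozen at start; writes hit the live data."""

    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db.committed)

    def read(self, key):
        return self.snapshot.get(key)

    def insert(self, key, value):
        if key in self.db.committed:          # the row exists, although we never saw it
            raise KeyError(f"duplicate key: {key}")
        self.db.committed[key] = value


db = ToyDatabase({"a": 1})
rc, rr = ReadCommittedTxn(db), RepeatableReadTxn(db)
first_reads = (rc.read("a"), rr.read("a"), rr.read("b"))

# Another transaction commits an update to a and inserts b.
db.committed.update({"a": 2, "b": 10})

second_reads = (rc.read("a"), rr.read("a"), rr.read("b"))
try:
    rr.insert("b", 99)                        # A still believes b does not exist
    insert_failed = False
except KeyError:
    insert_failed = True
```

The read-committed transaction sees a change between its two reads, the repeatable-read transaction does not, and the latter's insert fails on a row it could never see: the phantom.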
Serializable
Transactions are executed one after another. This is the approach with the lowest performance.
Indexes
There are usually two typical ways of querying data: equality queries and range queries, that is, select * from table where field = a, or select * from table where field between a and b. Without an index, the only option is a full table scan, which is like looking for a needle in a haystack. Programmers generally care about two things: where the problem is and how to optimize it. For equality queries, the best optimization is hashing. For range queries, hashing cannot show its strength, because a range query carries a hidden requirement: ordering. The data structures that keep data ordered are typically: sorted array, sorted linked list, skip list, AVL tree, red-black tree, B tree and B+ tree.
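The two query types map naturally onto two kinds of in-memory index. The sketch below uses a Python dict as the hash index and a sorted list with binary search as the ordered index; the sample rows are made up, and field values are assumed to be unique.

```python
from bisect import bisect_left, bisect_right

rows = [(7, "g"), (3, "c"), (9, "i"), (1, "a"), (5, "e")]   # (field, payload)

# Hash index: field value -> payload, suited to "field = a".
hash_index = {field: payload for field, payload in rows}

# Ordered index: rows sorted by field, suited to "field between a and b".
sorted_rows = sorted(rows)
sorted_keys = [field for field, _ in sorted_rows]


def equality_lookup(a):
    """Average O(1) lookup; returns None when no row matches."""
    return hash_index.get(a)


def range_lookup(a, b):
    """O(log n) to find both ends, then one contiguous slice (bounds inclusive)."""
    lo = bisect_left(sorted_keys, a)
    hi = bisect_right(sorted_keys, b)
    return sorted_rows[lo:hi]
```

The hash index cannot answer range_lookup without scanning every entry, which is the "hidden requirement" of ordering mentioned above.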
Why choose a B+ tree?
The hard drive is a block device. By packing N elements into a single block, a B+ tree can typically keep its height to no more than 3 levels, which reduces the number of I/Os. The leaf nodes of a B+ tree also form a linked list, which suits sequential disk I/O.
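The effect of packing many elements into one block can be estimated with a short calculation. This is a simplified sketch: it assumes every node holds the same number of entries (the fan-out), and the fan-out of 1000 used in the note below is a hypothetical round number.

```python
def tree_levels(n_keys: int, fanout: int) -> int:
    """Smallest number of levels whose capacity (fanout ** levels) covers n_keys."""
    if n_keys < 1 or fanout < 2:
        raise ValueError("need n_keys >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels
```

With a fan-out of 1000, tree_levels(1_000_000_000, 1000) gives 3, whereas a binary tree over the same billion keys, tree_levels(1_000_000_000, 2), needs 30 levels. If each level costs one disk I/O, that is the difference between 3 and 30 seeks per lookup.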
Performance
Performance is response time. Performance problems are best investigated top-down:
Are the CPU, memory, network, I/O and hard disk OK? There are only three ways for them not to be OK: insufficient headroom, unreasonable configuration, or failure.
Index optimization, table structure optimization and query optimization go hand in hand.
Troubleshooting performance problems requires a good deal of operating system knowledge.
Replication
This is MySQL's trump card. Without replication, MySQL could not have become so popular. Features derived from replication include read-write separation, load balancing, high availability, failover, backup, and testing upgrades. These are all concepts that everyone has heard of.
Replication is implemented in two ways: row-based replication and statement-based replication.
Scalability
Scalability is the ability to increase a system's capacity by adding resources. For example, MySQL uses read-write separation: each new Slave node added raises the read concurrency of the database layer. Of course, because a system is layered, the system as a whole is scalable only when every layer is scalable.
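Read-write separation can be sketched as a small router that sends writes to the master and spreads reads across the slaves. This is a minimal illustration, assuming statements can be classified by their first keyword; the class and the node names used with it are hypothetical, and real middleware must also handle transactions and replication lag.

```python
from itertools import cycle


class ReadWriteRouter:
    """Send writes to the master and spread reads over the slaves (round robin)."""

    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE"}

    def __init__(self, master: str, slaves: list[str]):
        self.master = master
        self._slaves = cycle(slaves) if slaves else None

    def route(self, sql: str) -> str:
        """Return the name of the node that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS or self._slaves is None:
            return self.master
        return next(self._slaves)
```

For example, ReadWriteRouter("master", ["slave-1", "slave-2"]) routes an update to "master" and alternates successive selects between the two slaves, so adding a third slave spreads the same read load over three nodes.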
The common strategies for scaling at the database layer are splitting databases and splitting tables (sharding).
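One common way to decide which database and table a row belongs to is to hash its key. The sketch below assumes hash-modulo routing, which is only one of several possible strategies; the function name and the key format are hypothetical.

```python
import zlib


def shard_for(key: str, n_databases: int, tables_per_database: int) -> tuple[int, int]:
    """Map a key to (database index, table index) with a stable hash."""
    # crc32 is used because Python's built-in hash() of str changes between processes.
    slot = zlib.crc32(key.encode("utf-8")) % (n_databases * tables_per_database)
    return divmod(slot, tables_per_database)
```

The same key always maps to the same database and table, so a lookup touches only one shard. The cost of this simple scheme is that changing the number of databases or tables remaps most keys.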
In addition, scalability and performance are two different things. Hive, for example, has low performance, but that does not affect its scalability.
High availability
High availability essentially means less downtime. If you look at people the way you look at computers, a person's availability is generally only about 50%, and counted on an 8-hour working day it is only 33%.
Thinking top-down, there are only two ways to achieve high availability. The first is to increase the mean time between failures (MTBF), that is, to make the interval between two outages as long as possible. The second is to reduce the mean time to recovery (MTTR), that is, the shorter the repair time, the better. High availability is therefore mostly a matter of architecture, for example MySQL master/slave setups and load balancing.
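The two levers can be put into one formula. The sketch below uses the usual steady-state definition, availability = MTBF / (MTBF + MTTR); the function name and the sample numbers are illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("need non-negative durations with a positive sum")
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

A person who is "up" for 8 hours and "down" for 16 gets availability(8, 16), about 0.33, matching the 33% above. Either a longer MTBF or a shorter MTTR pushes the value toward 1: availability(999, 1) is 0.999, and halving the repair time to availability(999, 0.5) raises it further.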
Backup
Backup is a topic that is easily overlooked. It is the kind of thing that gets no attention in ordinary times but has to take the hit at critical moments. Backup has three basic functions: disaster recovery, auditing, and testing.
© 2024 shulou.com SLNews company. All rights reserved.