2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
An index provides quick access to specific information in a database table. It is a structure that sorts the values of one or more columns of the table, such as the name column of an employee table. If you want to find a specific employee by last name, the index gets you that information faster than searching every row in the table.
Advantages of the index:
There is no need for a full table scan; scanning the index is enough. The index stores only a small part of the table's data, so only that small part is scanned, and if it is loaded into memory, lookups are faster still.
An index greatly reduces the amount of data that the server needs to scan
An index can help the server avoid sorting or using temporary tables
An index can convert random I/O to sequential I/O
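To make the first advantage concrete, here is a minimal sketch in plain JavaScript. It is an in-memory simulation, not how any particular database implements its indexes; the 99-row table, the field names, and the helper functions `fullScan` and `indexLookup` are assumptions made for illustration.

```javascript
// Hypothetical in-memory "table": 99 rows named User1..User99.
const table = [];
for (let i = 1; i <= 99; i++) {
  table.push({ Name: "User" + i, Age: i });
}

// Full table scan: inspect rows one by one until a match is found.
function fullScan(rows, name) {
  let inspected = 0;
  for (const row of rows) {
    inspected++;
    if (row.Name === name) return { row, inspected };
  }
  return { row: null, inspected };
}

// The index stores only the key and the row position, sorted by key.
const nameIndex = table
  .map((row, pos) => ({ key: row.Name, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Index lookup: binary search over the sorted keys, then one fetch from the table.
function indexLookup(index, rows, name) {
  let lo = 0;
  let hi = index.length - 1;
  let inspected = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    inspected++;
    if (index[mid].key === name) {
      return { row: rows[index[mid].pos], inspected };
    }
    if (index[mid].key < name) lo = mid + 1;
    else hi = mid - 1;
  }
  return { row: null, inspected };
}

const scan = fullScan(table, "User99");
const viaIndex = indexLookup(nameIndex, table, "User99");
console.log("rows inspected by full scan:", scan.inspected);
console.log("index entries inspected:", viaIndex.inspected);
```

The full scan has to walk through every row to reach the last one, while the binary search over the sorted keys needs only a handful of probes. A real sequential index is usually a B+ tree rather than a flat sorted array, but the idea is the same.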
Disadvantages of the index:
The index keeps a copy of a small part of the table's data, so it needs extra storage. Whenever data in the table is updated, the corresponding index data must be updated as well. An index therefore speeds up lookups but slows down writes, and whether the faster lookups are worth the slower writes has to be evaluated. For example, if we create an index on age in a table but most operations look up by name, the index has no effect. An index only makes sense when it matches the search conditions. Most searches are not performed on a single field, so an index often has to contain multiple fields: a composite index that serves lookups on several conditions. Designing indexes well takes considerable skill.
An index does not necessarily bring an advantage. If a table has too many indexes, the performance of the whole system may suffer badly. If a table is very small, with only a dozen or so rows, an index may even slow things down, because scanning the whole table takes almost no time.
If the table is very large, an index is very useful. But if the amount of data is too large, a single index may no longer be meaningful. For example, if a table holds terabytes of data, it is hard to imagine an index that could serve it, so the only option is to cut the large table into small tables and distribute them across different physical nodes. Splitting a table this way is called sharding in MongoDB, while MySQL calls its table-splitting feature partitioning.
Index level:
The highest level is a 3-star index
1 star: the index places related records next to each other, greatly reducing I/O
2 stars: the data in the index is stored in the same order that the search needs (which only holds if the index is well designed)
3 stars: the index contains all the data needed by the query (a covering index)
Category of index:
Sequential index
Hash index: keys are mapped to hash buckets by a hash function
Criteria for evaluating an index:
1. Access type (for equality comparisons a hash index is better; for range searches a sequential index is better)
2. Access time (the time to complete one access may differ depending on the index type)
3. Insert time (when the table is updated, maintaining the index has a cost. A hash index only has to re-run the hash algorithm, while a sequential index may have to move the index entries that come after the insertion point)
4. Delete time
5. Space overhead
Index type:
Sequential index: the most common type of index, built over a file of records. If the records are stored in the order of the index, the file is called an index-sequential file; otherwise it is a heap file.
Clustered index: the records in the file are stored in the order of the index's search key. This is also called the primary index.
Nonclustered index: the order given by the search key is not the order in which the records are stored in the file.
Based on whether an index entry is created for every record:
Dense index (every search key value has a corresponding index entry)
Sparse index (not every record has an index entry)
Multi-level index (an index points to another index, and so on, until the final index points to the data)
Indexes other than the primary index are called secondary indexes. Only the primary index can be sparse; secondary indexes must be dense.
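The difference between a dense and a sparse index can be sketched as follows. This is a toy model under stated assumptions: the records, the block size of 4, and the function name `sparseLookup` are all invented for illustration. It also shows why a sparse index only works when the records are stored in key order, which is the reason only the primary index can be sparse.

```javascript
// Records sorted by key and stored in fixed-size blocks (hypothetical block size of 4).
const BLOCK_SIZE = 4;
const records = [];
for (let k = 10; k <= 120; k += 10) {
  records.push({ key: k, value: "row-" + k });
}

const blocks = [];
for (let i = 0; i < records.length; i += BLOCK_SIZE) {
  blocks.push(records.slice(i, i + BLOCK_SIZE));
}

// Dense index: one entry per record.
const denseIndex = records.map((r, i) => ({
  key: r.key,
  block: Math.floor(i / BLOCK_SIZE),
}));

// Sparse index: one entry per block, holding the first key stored in that block.
const sparseIndex = blocks.map((b, i) => ({ key: b[0].key, block: i }));

// Sparse lookup: find the last index entry whose key is <= target,
// then scan that one block. This relies on the records being sorted by key.
function sparseLookup(target) {
  let blockNo = -1;
  for (const entry of sparseIndex) {
    if (entry.key <= target) blockNo = entry.block;
    else break;
  }
  if (blockNo < 0) return null;
  return blocks[blockNo].find((r) => r.key === target) || null;
}

console.log("dense entries:", denseIndex.length, "sparse entries:", sparseIndex.length);
console.log(sparseLookup(70));
```

The sparse index is much smaller, at the price of a short scan inside the block it points to.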
B+ tree index:
A balanced tree index
Every leaf node is the same distance from the root, which is why it is called a balanced tree.
Levels are created dynamically as the amount of data grows.
A B+ tree is a sequential index.
Hash index:
The hash function computes the bucket for a key directly, so finding the pointer costs one I/O and loading the data costs a second I/O.
A hash index is faster for exact matching because it needs far fewer I/Os: it lets us avoid walking through an index structure level by level.
Disadvantages of the hash index: a hash index may become skewed. Over time some hash buckets fill up while others stay empty, so the load is spread unevenly. This happens when the hash function is not random enough.
So the hash function needs to provide the following:
Random distribution
Uniform distribution
Scenarios in which hash indexes are applicable: exact value matching, that is, equality comparisons such as =, IN (), etc.
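The bucket mapping and the skew problem can be illustrated with a small simulation. Everything here is an assumption for illustration: the bucket count of 8, the simple multiply-by-31 string hash, and the deliberately poor `skewedFn`. Real engines use their own hash functions.

```javascript
const BUCKETS = 8;

// A simple string hash, chosen only for illustration.
function hash(key) {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep h an unsigned 32-bit value
  }
  return h % BUCKETS;
}

// Build the index: each key goes into the bucket chosen by the hash function.
function buildHashIndex(rows, field, hashFn) {
  const buckets = Array.from({ length: BUCKETS }, () => []);
  rows.forEach((row, pos) => {
    buckets[hashFn(row[field])].push({ key: row[field], pos });
  });
  return buckets;
}

// Equality lookup: hash the key, then search only that one bucket.
function lookup(buckets, rows, key, hashFn) {
  const hit = buckets[hashFn(key)].find((e) => e.key === key);
  return hit ? rows[hit.pos] : null;
}

const rows = [];
for (let i = 1; i <= 99; i++) rows.push({ Name: "User" + i, Age: i });

const good = buildHashIndex(rows, "Name", hash);

// A poor hash function that looks only at the first character:
// every "User..." key lands in the same bucket.
const skewedFn = (key) => key.charCodeAt(0) % BUCKETS;
const skewed = buildHashIndex(rows, "Name", skewedFn);

console.log("bucket sizes:", good.map((b) => b.length));
console.log("skewed bucket sizes:", skewed.map((b) => b.length));
console.log(lookup(good, rows, "User19", hash));
```

Note that nothing in this structure helps with a range condition such as Age > 80: the hash scatters neighbouring keys across buckets, which is why hash indexes suit only equality comparisons.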
Full-text index:
By default, a sequential index on a long field can hold only a limited number of leading bytes of that field. If the field is a TEXT type that stores a large amount of text, the index cannot hold all of the data and keeps only a prefix, so the search condition must be a leftmost prefix and cannot match anywhere inside the field. To match keywords anywhere in the text, a full-text index is needed. In MySQL, only the MyISAM engine supported full-text indexes before version 5.6; InnoDB has supported them since 5.6, and external indexing tools such as Sphinx can also be used.
If full-text indexing is necessary, Sphinx is a good choice.
Spatial index:
Ordinary comparisons cannot search the data in a spatial index; the dedicated spatial functions must be used to get the corresponding search results.
Properties of the index:
Full value match: match the whole value, for example Name = "User12"
Match the leftmost prefix:
Valid: Name LIKE "User1%"
Invalid: Name LIKE "%User1%"
Match column prefix: the same leftmost rule applies to the columns of a composite index. If a composite index is created on two fields (Name, Age), it is usable only starting from the leftmost field, so the condition Age > 80 on its own cannot use it. An index created in the reverse order, (Age, Name), would serve that query well.
Match range value: exactly match one column and range match another, for example Name = "User12" AND Age > 80
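The matching rules above can be written down as a small checker. This is a simplified model of the leftmost-prefix rule, not the planner of any real database; the function names `usablePrefix` and `likeCanUseIndex` and the "eq"/"range" labels are assumptions made for this sketch.

```javascript
// Returns how many leading fields of a composite index a query can use.
// indexFields: the fields of the index, in order.
// query: maps a field name to "eq" (equality) or "range" (range condition).
function usablePrefix(indexFields, query) {
  let used = 0;
  for (const field of indexFields) {
    const kind = query[field];
    if (!kind) break; // no condition on this field: later fields cannot be used
    used++;
    if (kind === "range") break; // a range condition ends the usable prefix
  }
  return used;
}

// A LIKE pattern can use a sequential index only if it does not start with a wildcard.
function likeCanUseIndex(pattern) {
  return pattern.length > 0 && pattern[0] !== "%" && pattern[0] !== "_";
}

console.log(usablePrefix(["Name", "Age"], { Age: "range" }));             // Age > 80 on (Name, Age)
console.log(usablePrefix(["Age", "Name"], { Age: "range" }));             // Age > 80 on (Age, Name)
console.log(usablePrefix(["Name", "Age"], { Name: "eq", Age: "range" })); // Name = ... AND Age > 80
console.log(likeCanUseIndex("User1%"), likeCanUseIndex("%User1%"));
```

A result of 0 means the index cannot help with that query at all, which is the Age > 80 case on a (Name, Age) index.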
Queries that access only the index:
Assume the sequential index has 3 levels. Without a covering index, finding a row means reading the root index block, then the next-level index block, then the leaf. Every index block that is not already in memory has to be loaded from disk, which costs one I/O each, and loading the data block from disk costs one more. If the root index is not loaded in advance, it takes at least 4 I/Os to find the data. A query that needs only the fields stored in the index can skip the final read of the data block.
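The saving from a query that touches only the index can be shown with a counter. This is a simulation under assumptions: the three documents, the (Name, Age) index, and the helper names `fetchDoc` and `findByName` are invented, and the index is searched linearly for brevity where a real one would be a B+ tree.

```javascript
// Hypothetical collection.
const docs = [
  { _id: 1, Name: "User1", Age: 1, Gender: "M" },
  { _id: 2, Name: "User2", Age: 2, Gender: "M" },
  { _id: 3, Name: "User3", Age: 3, Gender: "M" },
];

// Count how often the "data block" has to be read.
let docFetches = 0;
function fetchDoc(pos) {
  docFetches++;
  return docs[pos];
}

// Index on (Name, Age): each entry holds the indexed fields plus the document position.
const index = docs.map((d, pos) => ({ Name: d.Name, Age: d.Age, pos }));
const INDEXED_FIELDS = ["Name", "Age"];

// fields: the fields the query wants returned.
function findByName(name, fields) {
  const entry = index.find((e) => e.Name === name);
  if (!entry) return null;
  // The query is covered if every requested field is stored in the index.
  const covered = fields.every((f) => INDEXED_FIELDS.includes(f));
  const source = covered ? entry : fetchDoc(entry.pos);
  const out = {};
  for (const f of fields) out[f] = source[f];
  return out;
}

const coveredResult = findByName("User2", ["Name", "Age"]);
const fetchesAfterCovered = docFetches;
const uncoveredResult = findByName("User2", ["Name", "Gender"]);
const fetchesAfterUncovered = docFetches;

console.log(coveredResult, "document reads:", fetchesAfterCovered);
console.log(uncoveredResult, "document reads:", fetchesAfterUncovered);
```

The first query is answered from the index entry alone, so no document is read. The second asks for Gender, which the index does not store, so the document has to be fetched.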
The primary key and the unique key are both sequential indexes. The difference is that a primary key cannot contain duplicates and cannot be null, while a unique key cannot contain duplicates either but may be null.
Create an index:
> db.testcoll.find()
{ "_id" : ObjectId("531fbe8d020f14309ee1410a"), "Name" : "User1", "Age" : 1, "Gender" : "M", "preferbook" : [ "blue book", "yellow book" ] }
{ "_id" : ObjectId("531fbe8d020f14309ee1410b"), "Name" : "User2", "Age" : 2, "Gender" : "M", "preferbook" : [ "blue book", "yellow book" ] }
{ "_id" : ObjectId("531fbe8d020f14309ee1410c"), "Name" : "User3", "Age" : 3, "Gender" : "M", "preferbook" : [ "blue book", "yellow book" ] }
As shown above, we want to create an index on the user field Name. Note that the _id field is indexed by default, and that index is the primary key index. Any index created besides the primary key index is a secondary index. Because most lookups on this collection are by user name, we want an index on the user name.
Use the ensureIndex command to create an index on the Name field (ensureIndex is the older name of this command; current MongoDB versions use createIndex with the same arguments):
> db.testcoll.ensureIndex({Name: 1})
View the index:
> db.testcoll.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "testdb.testcoll",
        "name" : "_id_"
    },
    # the second index is the one we created on Name:
    {
        "v" : 1,
        "key" : {
            "Name" : 1
        },
        "ns" : "testdb.testcoll",
        "name" : "Name_1"
    }
]
Delete the index:
You can use dropIndex to delete the index on the Name field
> db.testcoll.dropIndex({Name: 1})
{ "nIndexesWas" : 2, "ok" : 1 }
View the indexes again
> db.testcoll.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "testdb.testcoll",
        "name" : "_id_"
    }
]
Delete all indexes of the collection (the default index on _id is always kept)
> db.testcoll.dropIndexes()
MongoDB also supports unique indexes. We can create a unique index on the Name field, which means that no two users may have the same name.
# unique index
> db.testcoll.ensureIndex({Name: 1}, {unique: true})
# sparse index (only documents that contain the Name field get an index entry)
> db.testcoll.ensureIndex({Name: 1}, {sparse: true})
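What the unique and sparse options mean can be sketched with a small in-memory class. This is a simulation, not MongoDB's implementation; the class name `FieldIndex` and its methods are assumptions for illustration. It follows the documented behaviour that a non-sparse index stores a null key for a document that lacks the field, while a sparse index skips such documents. Note that sparse here refers to documents missing the field, which is a different idea from the sparse index over sorted records described earlier.

```javascript
class FieldIndex {
  constructor(field, { unique = false, sparse = false } = {}) {
    this.field = field;
    this.unique = unique;
    this.sparse = sparse;
    this.entries = new Map(); // key -> array of document positions
  }

  // Returns true if an index entry was added, false if the document was skipped.
  insert(doc, pos) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, this.field);
    if (!hasField && this.sparse) return false; // sparse: no entry for this document
    const key = hasField ? doc[this.field] : null; // missing field is indexed as null
    if (this.unique && this.entries.has(key)) {
      throw new Error("duplicate key: " + key);
    }
    if (!this.entries.has(key)) this.entries.set(key, []);
    this.entries.get(key).push(pos);
    return true;
  }
}

const uniqueIdx = new FieldIndex("Name", { unique: true });
uniqueIdx.insert({ Name: "User1" }, 0);
let duplicateRejected = false;
try {
  uniqueIdx.insert({ Name: "User1" }, 1); // same name again
} catch (e) {
  duplicateRejected = true;
}

const sparseIdx = new FieldIndex("Name", { sparse: true });
const indexedWithName = sparseIdx.insert({ Name: "User2" }, 0);
const indexedWithoutName = sparseIdx.insert({ Age: 5 }, 1); // no Name field

console.log(duplicateRejected, indexedWithName, indexedWithoutName);
```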
Index types supported in MongoDB
In MongoDB, an index is created on a collection, either on a top-level field or on a sub-field of an embedded document
Create indexes according to your own query needs; an index can convert random I/O to sequential I/O
Index type:
1. Single-key index (an index created on one field)
2. Composite index (an index on multiple fields, as described above)
3. Multi-key index (the value of a field in a document can be an array. An index created on such a field, where one field holds multiple values, is a multi-key index)
4. Spatial index (it can only be searched with the spatial index functions, the same as in MySQL)
5. Text index (full-text index)
6. Hash index
To create a hash index, specify "hashed" as the value for the field, as shown below:
> db.testcoll.ensureIndex({Name: "hashed"})
> db.testcoll.dropIndex({Name: "hashed"})
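The multi-key index from item 3 of the list can be sketched as follows, using the preferbook array from the sample documents. This is an in-memory illustration only; the function name `buildMultiKeyIndex` and the third document with a non-array value are assumptions.

```javascript
const docs = [
  { Name: "User1", preferbook: ["blue book", "yellow book"] },
  { Name: "User2", preferbook: ["blue book"] },
  { Name: "User3", preferbook: "red book" }, // a plain value, not an array
];

// Build an index on a field whose value may be an array:
// one index entry per array element, all pointing at the same document.
function buildMultiKeyIndex(rows, field) {
  const index = new Map(); // key -> array of document positions
  let isMultiKey = false;
  rows.forEach((row, pos) => {
    const value = row[field];
    if (Array.isArray(value)) isMultiKey = true;
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(pos);
    }
  });
  return { index, isMultiKey };
}

const result = buildMultiKeyIndex(docs, "preferbook");
console.log(result.isMultiKey);
console.log(result.index.get("blue book").map((pos) => docs[pos].Name));
```

A lookup for "blue book" finds both User1 and User2, because each element of the array has its own index entry.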
Check whether a query statement actually uses the index that was created:
> db.testcoll.find({Name: "User19"}).explain()
{
    "cursor" : "BtreeCursor Name_1", # the Name_1 index was used
    "isMultiKey" : false, # whether a multi-key index was used
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1, # how many index entries (or documents, when there is no index) were scanned
    "nscannedObjectsAllPlans" : 1,
    "nscannedAllPlans" : 1,
    "scanAndOrder" : false, # whether the results had to be sorted after the scan
    "indexOnly" : false, # whether the query was answered from the index alone
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "Name" : [
            [
                "User19",
                "User19"
            ]
        ]
    },
    "server" : "localhost:27017"
}
After the index is deleted, the same query falls back to a full scan:
> db.testcoll.find({Name: "User19"}).explain()
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 99, # every object was scanned, meaning a full table scan
    "nscanned" : 99,
    "nscannedObjectsAllPlans" : 99,
    "nscannedAllPlans" : 99,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
    },
    "server" : "localhost:27017"
}
You can use hint to specify which index a query should use
> db.testcoll.find({Name: "User19"}).hint({Name: 1}).explain()
Create a composite index
> db.testcoll.ensureIndex({Name: 1, Age: 1}, {unique: true})
> db.testcoll.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "testdb.testcoll",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : {
            "Name" : 1,
            "Age" : 1
        },
        "unique" : true,
        "ns" : "testdb.testcoll",
        "name" : "Name_1_Age_1"
    }
]
If no index is specified with hint, a query on Name alone still uses the composite index, because Name is its leftmost field. As follows:
> db.testcoll.find({Name: "User19"}).explain()
{
    "cursor" : "BtreeCursor Name_1_Age_1",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1,
    "nscannedObjectsAllPlans" : 1,
    "nscannedAllPlans" : 1,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "Name" : [
            [
                "User19",
                "User19"
            ]
        ],
        "Age" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "localhost:27017"
}