What is the structure of HBase ROOT and META tables

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains the structure of the HBase -ROOT- and .META. tables and how a client uses them to find the RegionServer that holds a given row.

Building on this, we introduce two special concepts: -ROOT- and .META. What are they? They are two built-in HBase tables. In storage structure and in how they are operated on, they are no different from other HBase tables; you can think of them as two ordinary tables, and any operation that works on an ordinary table works on them too. What makes them special is that HBase uses them to store important system information: the distribution of Regions and the details of each Region.

Since -ROOT- and .META. can be treated as two ordinary tables, they must have a table structure like any other table. They do, and the two tables share the same structure. After analyzing the source code, I sketched it roughly as follows:

[Figure: -ROOT- and .META. table structure]

Let's take a closer look at this structure. Each Row records the information for one Region.

First, the RowKey. It consists of three parts: TableName, StartKey, and TimeStamp. The content stored in the RowKey is also called the Region's name. As mentioned in the previous article, the folder that holds a Region is named with a hash of the RegionName, because the RegionName may contain illegal characters. Now you can see why: StartKey is allowed to contain any value. The full RowKey is formed by joining the three parts with commas, with the TimeStamp written as a decimal numeric string. Here is an example RowKey:

Java code

Table1,RK10000,12345678
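
The naming rule above can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical and are not part of the HBase API. Because StartKey may itself contain commas, the sketch takes the table name up to the first comma and the TimeStamp after the last one, assuming that table names never contain a comma.

```java
class RegionNameSketch {
    // Joins TableName, StartKey and TimeStamp with commas, as described above.
    static String build(String tableName, String startKey, long timestamp) {
        return tableName + "," + startKey + "," + timestamp;
    }

    // Splits a region name back into its three parts.
    // Assumption: the table name contains no comma; StartKey may contain any.
    static String[] parse(String regionName) {
        int first = regionName.indexOf(',');
        int last = regionName.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("not a region name: " + regionName);
        }
        return new String[] {
            regionName.substring(0, first),        // TableName
            regionName.substring(first + 1, last), // StartKey (may contain commas)
            regionName.substring(last + 1)         // TimeStamp as a decimal string
        };
    }

    public static void main(String[] args) {
        System.out.println(build("Table1", "RK10000", 12345678L));
        System.out.println(String.join(" | ", parse("Table1,a,b,12345678")));
    }
}
```

The second call in main shows a StartKey that contains a comma, which is the case that makes a plain split on every comma unsafe.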

Next is the table's main Family, info. It contains three Columns: regioninfo, server, and serverstartcode. regioninfo holds the details of the Region, including its StartKey, EndKey, and information about each Family. server stores the address of the RegionServer that manages the Region.

So whenever a Region is split, merged, or reassigned, the contents of this table must be updated.
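
To make the row layout concrete, here is a minimal in-memory model of one catalog Row and of what a reassignment changes. It is a sketch under assumptions, not HBase code: the class name, the server addresses, and the regioninfo text are all hypothetical, and serverstartcode is treated as an opaque number.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CatalogRowSketch {
    final String rowKey;                                     // the Region's name
    final Map<String, String> info = new LinkedHashMap<>();  // columns of Family "info"

    CatalogRowSketch(String rowKey, String regionInfo, String server, long serverStartCode) {
        this.rowKey = rowKey;
        info.put("regioninfo", regionInfo);
        info.put("server", server);
        info.put("serverstartcode", Long.toString(serverStartCode));
    }

    // Reassignment moves the Region to another RegionServer, so only the
    // server-related columns change; the RowKey and regioninfo stay the same.
    void reassign(String newServer, long newServerStartCode) {
        info.put("server", newServer);
        info.put("serverstartcode", Long.toString(newServerStartCode));
    }

    public static void main(String[] args) {
        CatalogRowSketch row = new CatalogRowSketch(
            "Table1,RK10000,12345678",
            "startKey=RK10000, endKey=RK20000, families=[cf]", // hypothetical summary
            "rs1.example.com:60020", 1000L);                   // hypothetical address
        row.reassign("rs2.example.com:60020", 2000L);
        System.out.println(row.rowKey + " -> " + row.info);
    }
}
```

A split or merge would go further than this sketch, since it changes the set of Rows and their StartKey and EndKey values rather than just the server columns.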

We now have the necessary background, so we can walk through the whole process by which the Client finds a RegionServer. I will use a hypothetical example, so I first constructed a hypothetical -ROOT- table and .META. table.

Let's look at the .META. table first. Suppose there are only two user tables in HBase: Table1 and Table2. Table1 is very large and is divided into many Regions, so the .META. table has many Rows recording those Regions. Table2 is small and is divided into only two Regions, so only two Rows in .META. record it. The contents of the table look like this:

[Figure: .META. Row record structure]

Because .META. is itself a table whose Regions are recorded in -ROOT-, the Client needs to access the -ROOT- table first, so it needs to know the address of the RegionServer that manages -ROOT-. This address is stored in ZooKeeper. The default path is:

Java code

/hbase/root-region-server

But wait: what if the -ROOT- table grows too large and needs to be split into multiple Regions? HBase assumes that -ROOT- will never get that big, so -ROOT- always has exactly one Region, and the information about this Region is kept inside HBase itself.

Now let's start from scratch and query the data in Table2 whose RowKey is RK10000. The main code for the whole routing process is in org.apache.hadoop.hbase.client.HConnectionManager.TableServers:

Java code

private HRegionLocation locateRegion(final byte[] tableName,
    final byte[] row, boolean useCache) throws IOException {
  if (tableName == null || tableName.length == 0) {
    throw new IllegalArgumentException("table name cannot be null or zero length");
  }
  if (Bytes.equals(tableName, ROOT_TABLE_NAME)) {
    synchronized (rootRegionLock) {
      // This block guards against two threads trying to find the root
      // region at the same time. One will go do the find while the
      // second waits. The second thread will not do find.
      if (!useCache || rootRegionLocation == null) {
        this.rootRegionLocation = locateRootRegion();
      }
      return this.rootRegionLocation;
    }
  } else if (Bytes.equals(tableName, META_TABLE_NAME)) {
    return locateRegionInMeta(ROOT_TABLE_NAME, tableName, row, useCache, metaRegionLock);
  } else {
    // Region not in the cache - have to go to the meta RS
    return locateRegionInMeta(META_TABLE_NAME, tableName, row, useCache, userRegionLock);
  }
}
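
The -ROOT- branch of the method above combines a lock with a cached value, so that only one thread performs the lookup while the others wait and then reuse the result. A self-contained sketch of that pattern follows. The class name, the Supplier standing in for locateRootRegion(), and the lookups counter are hypothetical additions for illustration; only the check inside the synchronized block mirrors the code above.

```java
import java.util.function.Supplier;

class CachedRootLocator {
    private final Object rootRegionLock = new Object();
    private final Supplier<String> locateRootRegion; // stands in for the real lookup
    private String rootRegionLocation;               // cached result, null until first lookup
    int lookups;                                     // counts real lookups, for illustration

    CachedRootLocator(Supplier<String> locateRootRegion) {
        this.locateRootRegion = locateRootRegion;
    }

    String locate(boolean useCache) {
        synchronized (rootRegionLock) {
            // Look up again only if the caller bypasses the cache
            // or nothing has been cached yet.
            if (!useCache || rootRegionLocation == null) {
                lookups++;
                rootRegionLocation = locateRootRegion.get();
            }
            return rootRegionLocation;
        }
    }

    public static void main(String[] args) {
        CachedRootLocator locator = new CachedRootLocator(() -> "rs1.example.com:60020");
        locator.locate(true);
        locator.locate(true);   // served from the cache
        locator.locate(false);  // forces a fresh lookup
        System.out.println("real lookups: " + locator.lookups);
    }
}
```

Passing useCache = false is how a caller would force a refresh, for example after the cached address turns out to be stale.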

The lookup is a chain of recursive calls:

Java code

Get the RegionServer for Table2, RowKey RK10000
=> get the RegionServer for .META., RowKey Table2,RK10000,99999999999999
=> get the RegionServer for -ROOT-, RowKey .META.,Table2,RK10000,99999999999999,99999999999999
=> get the RegionServer of -ROOT-
=> read the -ROOT- RegionServer address from ZooKeeper
=> in the -ROOT- table, find the Row whose RowKey is closest to (and less than) .META.,Table2,RK10000,99999999999999,99999999999999, and get the RegionServer of .META.
=> in the .META. table, find the Row whose RowKey is closest to (and less than) Table2,RK10000,99999999999999, and get the RegionServer of Table2
=> read the Row RK10000 from Table2

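At this point the Client has completed the whole RegionServer routing process, using the technique of appending a "99999999999999" suffix and finding the closest (smaller) RowKey. Since the suffix is larger than any real TimeStamp, the probe key sorts just after the Row of the Region whose StartKey is at or before the requested row. The technique is worth thinking through carefully, but it is not hard to understand.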
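
The suffix technique can be tried out with sorted maps standing in for the catalog tables. The following is a simulation under assumptions, not HBase code: every RowKey, server name, and map below is hypothetical, the maps compare keys as Java strings (equivalent to byte order only for ASCII keys like these), and TreeMap.floorEntry plays the role of the "closest (less than)" scan.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

class RoutingSketch {
    static final String SUFFIX = "99999999999999";
    static final String ROOT_ZK_PATH = "/hbase/root-region-server";

    // Hypothetical contents: RowKey (Region name) -> RegionServer.
    static final Map<String, String> ZOOKEEPER = new HashMap<>();
    static final TreeMap<String, String> ROOT = new TreeMap<>();
    // One sorted map per .META. Region, keyed by the server that hosts it.
    static final Map<String, TreeMap<String, String>> META_REGION_ON = new HashMap<>();

    static {
        ZOOKEEPER.put(ROOT_ZK_PATH, "RS0");
        ROOT.put(".META.,Table1,RK0,12345678,12345657", "RS1");
        ROOT.put(".META.,Table2,RK30000,12345678,12345657", "RS2");

        TreeMap<String, String> metaOnRs1 = new TreeMap<>();
        metaOnRs1.put("Table1,RK0,12345678", "RS1");
        metaOnRs1.put("Table1,RK10000,12345678", "RS2");
        metaOnRs1.put("Table1,RK20000,12345678", "RS3");
        metaOnRs1.put("Table2,RK0,12345678", "RS1");
        META_REGION_ON.put("RS1", metaOnRs1);

        TreeMap<String, String> metaOnRs2 = new TreeMap<>();
        metaOnRs2.put("Table2,RK30000,12345678", "RS2");
        META_REGION_ON.put("RS2", metaOnRs2);
    }

    // Appends the suffix and returns the server of the closest RowKey that is
    // less than or equal to the probe, checking it belongs to the same table.
    static String closest(TreeMap<String, String> catalog, String tableName, String row) {
        String probe = tableName + "," + row + "," + SUFFIX;
        Map.Entry<String, String> hit = catalog.floorEntry(probe);
        if (hit == null || !hit.getKey().startsWith(tableName + ",")) {
            throw new IllegalStateException("no Region found for " + probe);
        }
        return hit.getValue();
    }

    static String locate(String tableName, String row) {
        // Step 1: ZooKeeper says where -ROOT- lives (this sketch has one ROOT map).
        Objects.requireNonNull(ZOOKEEPER.get(ROOT_ZK_PATH), "-ROOT- location missing");
        // Step 2: -ROOT- says which server hosts the relevant .META. Region.
        String metaServer = closest(ROOT, ".META.", tableName + "," + row + "," + SUFFIX);
        // Step 3: that .META. Region says which server hosts the user Region.
        return closest(META_REGION_ON.get(metaServer), tableName, row);
    }

    public static void main(String[] args) {
        System.out.println(locate("Table2", "RK10000"));
    }
}
```

With this data, the probe Table2,RK10000,99999999999999 falls between the Rows Table2,RK0,12345678 and Table2,RK30000,12345678, so the floor lookup returns the first of the two, which is the Region that contains RK10000.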

Finally, I would like to remind you of two things:

1. The MasterServer is not involved in the routing process at all. In other words, everyday data operations in HBase do not need the MasterServer, so they place no load on it.

2. The Client does not go through the whole routing process for every data operation; much of the location data is cached. How the caching works is beyond the scope of this article.

