How to repair hbase

Updated: 2025-02-02. From: SLTechnology News&Howtos (Shulou.com)

This article shares several ways to repair HBase, collected here for reference.

Meta table repair 1

Check the state of the hbase meta table:

hbase hbck

1. Repair the hbase meta table (rebuild the meta entries from the .regioninfo files on HDFS):

hbase hbck -fixMeta

2. Reassign the regions recorded in the meta table to the RegionServers:

hbase hbck -fixAssignments

Meta table repair 2

When there is a hole in the region chain:

hbase hbck -fixHdfsHoles (create a new region directory)

hbase hbck -fixMeta (rebuild the meta table from the .regioninfo files)

hbase hbck -fixAssignments (assign the regions to RegionServers)
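The order of the three steps matters, so it can help to script them. The following is a minimal sketch, assuming the `hbase` launcher is on the PATH; the names `HOLE_REPAIR_STEPS`, `run_steps`, and the `dry_run` switch are hypothetical helpers, not part of HBase. It only prints the commands unless `dry_run` is turned off.

```python
import subprocess

# Ordered steps for repairing a region hole, as listed above.
HOLE_REPAIR_STEPS = [
    ["hbase", "hbck", "-fixHdfsHoles"],    # create an empty region directory for the gap
    ["hbase", "hbck", "-fixMeta"],         # rebuild meta entries from .regioninfo files
    ["hbase", "hbck", "-fixAssignments"],  # assign the regions to RegionServers
]


def run_steps(steps, dry_run=True):
    """Print each command in order; execute it only when dry_run is False.

    Returns a list of (command line, return code) pairs. The return code is
    None for commands that were only printed. Return codes are recorded, not
    treated as fatal, because how hbck signals remaining inconsistencies may
    differ between versions (an assumption to check on your cluster).
    """
    results = []
    for cmd in steps:
        line = " ".join(cmd)
        print(line)
        code = None
        if not dry_run:
            code = subprocess.run(cmd).returncode
        results.append((line, code))
    return results
```

Calling `run_steps(HOLE_REPAIR_STEPS)` prints the three commands without touching the cluster; review the output of a plain `hbase hbck` between steps before running with `dry_run=False`.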

Meta table repair 3

When a .regioninfo file is missing:

hbase hbck -fixHdfsOrphans

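HBase region assignment problem

If a region cannot be assigned, you can delete the region-in-transition node (rmr here appears to be the ZooKeeper CLI command, and /hbase/region-in-transition a znode):

rmr /hbase/region-in-transition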

HBase region reference file errors

When hbck reports an error that begins with "Found lingering reference file hdfs:", run:

hbase hbck -fixReferenceFiles

Duplicate region problem

View the regions in meta:

scan 'hbase:meta', {LIMIT => 10, FILTER => "PrefixFilter('INDEX_11')"}

Two duplicate regions were found during a data migration:

b0c8f08ffd7a96219f748ef14d7ad4f8, 73ab00eaa7bab7bc83f440549b9749a3

Delete the two duplicate regions from meta:

delete 'hbase:meta','INDEX_11,4380_2431,1429757926776.b0c8f08ffd7a96219f748ef14d7ad4f8.','info:regioninfo'

delete 'hbase:meta','INDEX_11,5479_0041431700000000040100004815E9,1429757926776.73ab00eaa7bab7bc83f440549b9749a3.','info:regioninfo'

Delete the two duplicate region directories on HDFS:

/hbase/data/default/INDEX_11/b0c8f08ffd7a96219f748ef14d7ad4f8

/hbase/data/default/INDEX_11/73ab00eaa7bab7bc83f440549b9749a3

Restart the corresponding RegionServers (only to refresh the status of the RIS reported on the HMaster).

Data will certainly be lost: whatever was stored in the duplicate regions that were not online is gone.
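The meta row key and the HDFS directory of a region are tied together by the encoded name at the end of the region name. As a small sketch of that relationship, the helpers below derive both from a full region name. The function names are hypothetical, and the `/hbase/data/default` root assumes the default namespace and the HBase root directory used in this example.

```python
def encoded_name(region_name):
    """Return the encoded (hash) part of 'table,startkey,timestamp.encoded.'."""
    if not region_name.endswith("."):
        raise ValueError("expected a trailing '.' in the region name")
    head, sep, encoded = region_name[:-1].rpartition(".")
    if not sep or not encoded:
        raise ValueError("no encoded name found in the region name")
    return encoded


def table_name(region_name):
    """Return the table part, which is everything before the first comma."""
    return region_name.split(",", 1)[0]


def region_dir(region_name, root="/hbase/data/default"):
    """Build the HDFS directory that holds the region's files."""
    return f"{root}/{table_name(region_name)}/{encoded_name(region_name)}"


def meta_delete(region_name, column="info:regioninfo"):
    """Build the HBase shell command that removes one column of the meta row."""
    return f"delete 'hbase:meta','{region_name}','{column}'"


name = "INDEX_11,4380_2431,1429757926776.b0c8f08ffd7a96219f748ef14d7ad4f8."
print(meta_delete(name))
print(region_dir(name))
```

The sketch only builds strings. Check the printed command and path against the scan output before running anything, since both deletions are irreversible.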

New hbase hbck

The new version of hbck can fix a variety of errors with the following options:

(1) -fix: kept for backward compatibility; superseded by -fixAssignments.

(2) -fixAssignments: fixes region assignment errors.

(3) -fixMeta: fixes problems in the meta table, provided that the region info on HDFS is present and correct.

(4) -fixHdfsHoles: fixes region holes (a key range that no region covers).

(5) -fixHdfsOrphans: fixes orphan regions (region directories on HDFS that have no .regioninfo file).

(6) -fixHdfsOverlaps: fixes region overlaps (regions whose key ranges overlap).

(7) -fixVersionFile: fixes a missing hbase.version file.

(8) -maxMerge <n> (n defaults to 5): overlapping regions have to be merged; at most n regions are merged at a time.

(9) -sidelineBigOverlaps: when fixing region overlaps, lets the regions that overlap with the most other regions be sidelined instead of merged (after the repair, the sidelined data can be loaded back into the right regions with bulk load).

(10) -maxOverlapsToSideline <n> (n defaults to 2): when fixing region overlaps, the maximum number of regions per overlap group that may be sidelined.

Because there are so many options, there are two shortcut options:

(11) -repair: equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps

(12) -repairHoles: equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans
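To make the shortcut options concrete, here is a small sketch that expands them into the individual flags listed above. The mapping is copied from items (1), (11), and (12). The `SHORTCUTS` table and `expand` function are illustrative names only, and newer hbck releases may expand the shortcuts differently.

```python
# Shortcut options and the flags they stand for, as described in the list above.
SHORTCUTS = {
    "-fix": ["-fixAssignments"],
    "-repairHoles": [
        "-fixAssignments", "-fixMeta", "-fixHdfsHoles", "-fixHdfsOrphans",
    ],
    "-repair": [
        "-fixAssignments", "-fixMeta", "-fixHdfsHoles", "-fixHdfsOrphans",
        "-fixHdfsOverlaps", "-fixVersionFile", "-sidelineBigOverlaps",
    ],
}


def expand(options):
    """Replace shortcut options by their component flags, dropping duplicates.

    The order of first appearance is preserved.
    """
    expanded = []
    for opt in options:
        for flag in SHORTCUTS.get(opt, [opt]):
            if flag not in expanded:
                expanded.append(flag)
    return expanded


print(" ".join(["hbase", "hbck"] + expand(["-repairHoles"])))
```

The expansion shows at a glance that -repairHoles is a strict subset of -repair: it leaves out the overlap handling and the version file fix.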

New version of hbck: problems and fixes

(1) The hbase.version file is missing.

Fix: add the -fixVersionFile option.

(2) A region is neither in the META table nor on HDFS, but is in a RegionServer's set of online regions.

Fix: add the -fixAssignments option.

(3) A region is in the META table and in a RegionServer's set of online regions, but is not on HDFS.

Fix: add the options -fixAssignments -fixMeta (-fixAssignments tells the RegionServer to close the region; -fixMeta deletes the region's record from the META table).

(4) A region has no record in the META table and is not served by any RegionServer, but is on HDFS.

Fix: add the options -fixMeta -fixAssignments (-fixAssignments assigns the region; -fixMeta adds the region's record to the META table).

(5) A region has no record in the META table, but is on HDFS and is served by a RegionServer.

Fix: add the -fixMeta option, which adds the region's record to the META table, then undeploys the region and assigns it again.

(6) A region has a record in the META table, but is not on HDFS and is not served by any RegionServer.

Fix: add the -fixMeta option to delete the record from the META table.

(7) A region has a record in the META table and is on HDFS, the table is not disabled, but the region is not being served.

Fix: add the -fixAssignments option to assign the region.

(8) A region has a record in the META table and is on HDFS, the table is disabled, but the region is being served by a RegionServer.

Fix: add the -fixAssignments option to undeploy the region.

(9) A region has a record in the META table and is on HDFS, the table is not disabled, but the region is being served by more than one RegionServer.

Fix: add the -fixAssignments option, which tells all of those RegionServers to close the region and then assigns it again.

(10) A region is in the META table and on HDFS and should be served, but the RegionServer recorded in the META table does not match the one actually serving it.

Fix: add the -fixAssignments option.

(11) Region holes.

Add -fixHdfsHoles to create a new empty region that fills the hole. This option does not assign the region or add its record to the META table.

(12) A region has no .regioninfo file on HDFS.

Fix: add the -fixHdfsOrphans option.

(13) Region overlaps.

Add -fixHdfsOverlaps.
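Cases (2) through (10) amount to a decision table over where a region is visible: in META, on HDFS, and on which RegionServers. The sketch below encodes that table as it is described in this article, not as it is implemented in hbck's source. The function name and parameters are hypothetical.

```python
def hbck_options(in_meta, on_hdfs, deployed_on, table_disabled=False, meta_server=None):
    """Return the hbck options the list above suggests for one region.

    in_meta / on_hdfs: whether the region has a META record / an HDFS directory.
    deployed_on: names of the RegionServers currently serving the region.
    meta_server: the RegionServer recorded in META, if known.
    An empty list means the region looks consistent.
    """
    deployed = len(deployed_on) > 0
    if not in_meta and not on_hdfs:
        return ["-fixAssignments"] if deployed else []                 # case (2)
    if in_meta and not on_hdfs:
        # case (3) when deployed, case (6) when not
        return ["-fixAssignments", "-fixMeta"] if deployed else ["-fixMeta"]
    if not in_meta and on_hdfs:
        # case (5) when deployed, case (4) when not
        return ["-fixMeta"] if deployed else ["-fixMeta", "-fixAssignments"]
    # From here on the region is both in META and on HDFS.
    if table_disabled:
        return ["-fixAssignments"] if deployed else []                 # case (8)
    if not deployed:
        return ["-fixAssignments"]                                     # case (7)
    if len(deployed_on) > 1:
        return ["-fixAssignments"]                                     # case (9)
    if meta_server is not None and deployed_on[0] != meta_server:
        return ["-fixAssignments"]                                     # case (10)
    return []


# A region found on HDFS only, as in case (4):
print(hbck_options(in_meta=False, on_hdfs=True, deployed_on=[]))
```

Writing the cases this way makes one pattern visible: whenever META and HDFS disagree, -fixMeta is involved, and whenever the deployment state is the only thing wrong, -fixAssignments alone is enough.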

Notes:

(1) When repairing region holes, -fixHdfsHoles only creates a new empty region to fill the gap. -fixAssignments and -fixMeta are still needed to finish the job (-fixAssignments assigns the region; -fixMeta adds the region's record to the META table). That is why the combined option -repairHoles exists; it is equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans.

(2) -fixAssignments fixes regions that are not assigned, that should not be assigned, or that are assigned more than once.

(3) -fixMeta: if a region is not on HDFS, its record is deleted from the META table; if it is on HDFS, its record is added to the META table.

(4) -repair turns on all repair options; it is equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps.

The new version of hbck collects table and region information from (1) the HDFS directories, (2) META, and (3) the RegionServers, then detects and repairs inconsistencies based on that information.
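The terms "hole" and "overlap" both describe a broken region chain, where each region's end key should equal the next region's start key. The following is a simplified sketch of that check. It compares only adjacent regions after sorting by start key, uses plain string comparison in place of HBase's byte-wise key order, and does not verify that the chain starts and ends with an empty key. `check_chain` is a hypothetical name, not an hbck API.

```python
def check_chain(regions):
    """Find holes and overlaps in a list of (start_key, end_key) pairs.

    An empty string stands for an open-ended key. Returns (holes, overlaps):
    holes are (end_key, next_start_key) gaps, overlaps are pairs of regions.
    """
    ordered = sorted(regions, key=lambda r: r[0])
    holes, overlaps = [], []
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if e1 == "" or e1 > s2:
            # The first region runs past the start of the second one.
            overlaps.append(((s1, e1), (s2, e2)))
        elif e1 < s2:
            # No region covers the keys between e1 and s2.
            holes.append((e1, s2))
    return holes, overlaps


holes, overlaps = check_chain([("", "b"), ("b", "d"), ("f", "")])
print(holes, overlaps)
```

In the first call the keys between "d" and "f" are covered by no region, which is the situation -fixHdfsHoles addresses; an end key greater than the next start key is the situation -fixHdfsOverlaps addresses.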

Reposted: repairing meta and deleting a table manually

Because disk space on the cluster was tight, we decided to add the COMPRESSION => LZO attribute to the original table. But creating the table hung for a long time with no response. We decided to drop the table, but the drop kept failing. After a cluster restart, the HBase web UI on port 60010 showed regions in transition: the regions of the table that failed to be created kept flipping between PENDING_OPEN and CLOSED. describe, enable, and disable all failed on the table, and the table could not be viewed from the 60010 UI either. Painful.

We then decided to force-delete the table. After some searching I found an article on this; most of it is correct, but the last step is problematic. The command in the original text is:

delete 'TrojanInfo','TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.','info:server'

This looked wrong at the time: the command does not touch the .META. table at all, so how could it update the META information?

Two attempts at the delete both reported errors, which confirmed that something was wrong. To be safe, I searched for how to update META information and changed the command to:

delete '.META.','TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.','info:server'

The command executed successfully.

After restarting the cluster, the regions in transition were still there. The likely cause was that the meta table had not been updated on disk. After running major_compact on the .META. table and restarting the cluster again, the problem was gone and no more errors were reported.

Here is a copy of the original text:

Force deletion of tables:

1. Force-delete all files of the table on HDFS (the path depends on your setup):

./hadoop fs -rmr /hbase/TrojanInfo

2. Delete the table's records from the HBase system table .META.:

A. First, find the rowkeys of the table TrojanInfo in .META.; this can be done with scan '.META.' followed by manual filtering.

B. Then delete the 3 columns under that rowkey (assuming the rowkey found is TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.):

delete 'TrojanInfo','TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.','info:server'

delete 'TrojanInfo','TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.','info:serverstartcode'

delete 'TrojanInfo','TrojanInfo,1361433390076.2636b5a2b3d3d08f23d2af9582f29bd8.','info:reg

Reposted: meta table repair 3

First, the cause of the failure

A server with the IP 10.191.135.3 restarted on August 1, 2013, which stopped every service on it, including the NTP service. With NTP stopped, the clocks of most machines in the HBase cluster drifted out of sync with the time server, and their RegionServer processes aborted. After the restart, holes appeared in the region chain. The data had to be repaired before inserts could be served normally again.

Second, the recovery procedure

1. The cluster has 50 RegionServers, and 41 of them went down. The NameNode machine 10.191.135.3 restarted for an unknown reason (still being investigated), which took down the NameNode, ZooKeeper, and time synchronization services on that machine.

2. When the HBase service was restarted, the remaining 9 RegionServers did not stop successfully, so their processes were killed manually.

3. Move the HLogs away on HDFS (to avoid spending too long on log splitting during startup, which would delay the service), then restart HBase. The time on machine 10.191.135.30 turned out to be out of sync with the time server 10.191.135.3. After a manual sync the restart succeeded, and HBase could serve queries normally.

4. Run the MapReduce job that puts data. An exception is thrown and data cannot be inserted.

5. Run /opt/hbase/bin/hbase hbck -fixAssignments to try to reassign the regions. The output shows holes in HBase, that is, the key ranges of adjacent regions are not contiguous.

6. From the steps above it can be concluded that data was lost during the restart after the RegionServers went down, and that the holes need to be repaired. However, the hbase hbck command always reports only three holes.

7. Use a self-written tool, regionTest.jar, to find the names of the regions around each hole, then stop HBase and merge regions to repair the holes.

8. The merge operation reads region information from the .META. table. Because .META. was also damaged when the RegionServers went down, some regions have no entry in .META., and the merge throws a null pointer exception. Such regions can only be moved away on HDFS. After that, regionTest.jar is run again to find the regions around the new hole, and the hole is repaired by merging.

9. Region overlap: a region name exists in the .META. table, but the region was mistakenly moved away on HDFS and merged with another region. In this case, detect the overlapping region names with regionTest.jar and delete them from the .META. table manually. The .META. table must be flushed after it is modified.

10. Finally, run the hbase hbck command again; all tables in HBase report status OK.

Third, related commands and error output

1. Manual time synchronization commands:
service ntpd stop
ntpdate -d 192.168.1.20
service ntpd start

2. org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 2 actions: WrongRegionException: 2 times, servers with issues: datanode10:60020
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:826)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:801)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:123)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:84)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:88)
at o

3. 13 DEBUG util.HBaseFsck: There are 22093 region info entries
ERROR: There is a hole in the region chain between +8615923208069cmnet201303072132166264580 and +861592321. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between +8618375993383cmwap20130512235639430 and +8618375998629cmnet201305040821436779670. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: There is a hole in the region chain between +8618725888080cmnet201212271719506311400 and +8618725889786cmnet201302131646431671140. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table cqgprs
Summary:
  -ROOT- is okay.
    Number of regions: 1
    Deployed on: datanode14,60020,1375330955915
  .META. is okay.
    Number of regions: 1
    Deployed on: datanode21,60020,1375330955825
  cqgprs is okay.
    Number of regions: 22057
    Deployed on: datanode1,60020,1375330955761 datanode10,60020,1375330955748 datanode11,60020,1375330955736 datanode12,60020,1375330955993 datanode13,60020,1375330955951 datanode14,60020,1375330955915 datanode15,60020,1375330955882 datanode16,60020,1375330955892 datanode17,60020,1375330955864 datanode18,60020,1375330955703 datanode19,60020,1375330955910 datanode2,60020,1375330955751 datanode20,60020,1375330955849 datanode21,60020,1375330955825 datanode22,60020,1375334479752 datanode23,60020,1375330955835 datanode24,60020,1375330955932 datanode25,60020,1375330955856 datanode26,60020,1375330955807 datanode27,60020,1375330955882 datanode28,60020,1375330955785 datanode29,60020,1375330955799 datanode3,60020,1375330955778 datanode30,60020,1375330955748 datanode31,60020,1375330955877 datanode32,60020,1375330955763 datanode33,60020,1375330955755 datanode34,60020,1375330955713 datanode35,60020,1375330955768 datanode36,60020,1375330955896 datanode37,60020,1375330955884 datanode38,60020,1375330955918 datanode39,60020,1375330955881 datanode4,60020,1375330955826 datanode40,60020,1375330955770 datanode41,60020,1375330955824 datanode42,60020,1375449245386 datanode43,60020,1375330955880 datanode44,60020,1375330955902 datanode45,60020,1375330955881 datanode46,60020,1375330955841 datanode47,60020,1375330955790 datanode48,60020,1375330955848 datanode49,60020,1375330955849 datanode5,60020,1375330955880 datanode50,60020,1375330955802 datanode6,60020,1375330955753 datanode7,60020,1375330955890 datanode8,60020,1375330955967 datanode9,60020,1375330955948
  Test1 is okay.
    Number of regions: 1
    Deployed on: datanode43,60020,1375330955880
  Test2 is okay.
    Number of regions: 1
    Deployed on: datanode21,60020,1375330955825
35 inconsistencies detected.
Status: INCONSISTENT

4. hadoop jar regionTest.jar com.region.RegionReaderMain /hbase/cqgprs (detects the names of the regions around the holes in the cqgprs table)

5. ==
first endKey = +8615808059207cmnet201307102326567966800
second startKey = +8615808058578cmnet201212251545557984830

first regionNmae = cqgprs,+8615808058578cmnet201212251545557984830,1375241186209.0f8266ad7ac45be1fa7233e8ea7aeef9.
second regionNmae = cqgprs,+8615808058578cmnet201212251545557984830,1362778571889.3552d3db8166f421047525d6be39c22e.
==
first endKey = +8615808060140cmnet201303051801355846850
second startKey = +8615808059207cmnet201307102326567966800

first regionNmae = cqgprs,+8615808058578cmnet201212251545557984830,1362778571889.3552d3db8166f421047525d6be39c22e.
second regionNmae = cqgprs,+8615808059207cmnet201307102326567966800,1375241186209.09d489d3df513bc79bab09cec36d2bb4.
==

6. Usage: bin/hbase org.apache.hadoop.hbase.util.Merge [-Dfs.default.name=hdfs://nn:port]
./hbase org.apache.hadoop.hbase.util.Merge -Dfs.defaultFS=hdfs://bdpha cqgprs cqgprs,+8615213741567cmnet201305251243290802280,1369877465524.3c13b460fae388b1b1a70650b66c5039. cqgprs,+8615213745577cmnet201302141725552206710,1369534940433.5de80f59071555029ac42287033a4863. &

7. 13/08/01 22:24:02 WARN util.HBaseFsck: Naming new problem group: +8618225125357cmnet201212290358070667800
ERROR: (regions cqgprs,+8618225123516cmnet201304131404096748520,1375363774655.b3cf5cc752f4427a4e699270dff9839e. and cqgprs,+8618225125357cmnet201212290358070667800,1364421610707.7f7038bfbe2c0df0998a529686a3e1aa.) There is an overlap in the region chain.
13/08/01 22:24:02 WARN util.HBaseFsck: reached end of problem group: +8618225127504cmnet201302182135452100210
13/08/01 22:24:02 WARN util.HBaseFsck: Naming new problem group: +8618285642723cmnet201302031921019768070
ERROR: (regions cqgprs,+8618285277826cmnet201306170027424674330,1375363962312.9d1e93b22cec90fd75361fa65b1d20d2. and cqgprs,+8618285642723cmnet201302031921019768070,1360873307626.f631cd8c6acc5e711e651d13536abe94.) There is an overlap in the region chain.
13/08/01 22:24:02 WARN util.HBaseFsck: reached end of problem group: +8618286275556cmnet201212270713444340110
13/08/01 22:24:02 WARN util.HBaseFsck: Naming new problem group: +8618323968833cmnet201306010239025175240
ERROR: (regions cqgprs,+8618323967956cmnet201306091923411365860,1375364143678.665dba6a14ebc9971422b39e079b00ae. and cqgprs,+8618323968833cmnet201306010239025175240,1372821719159.6d2fecc1b3f9049bbca83d84231eb365.) There is an overlap in the region chain.
13/08/01 22:24:02 WARN util.HBaseFsck: reached end of problem group: +8618323992353cmnet201306012336364819810
ERROR: There is a hole in the region chain between +8618375993383cmwap20130512235639430 and +8618375998629cmnet201305040821436779670. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
13/08/01 22:24:02 WARN util.HBaseFsck: Naming new problem group: +8618723686187cmnet201301191433522129820
ERROR: (regions cqgprs,+8618723683087cmnet201301300708363045080,1375364411992.4ee5787217c1da4895d95b3b92b8e3a2. and cqgprs,+8618723686187cmnet201301191433522129820,1362003066106.70b48899cc753a0036f11bb27d2194f9.) There is an overlap in the region chain.
13/08/01 22:24:02 WARN util.HBaseFsck: reached end of problem group: +8618723689138cmnet201301051742388948390
13/08/01 22:24:02 WARN util.HBaseFsck: Naming new problem group: +8618723711808cmnet201301031139206225900
ERROR: (regions cqgprs,+8618723710003cmnet201301250809235976320,1375364586329.40eed10648c9a43e3d5ce64e9d63fe00. and cqgprs,+8618723711808cmnet201301031139206225900,1361216401798.ebc442e02f5e784bce373538e06dd232.) There is an overlap in the region chain.
13/08/01 22:24:02 WARN util.HBaseFsck: reached end of problem group: +8618723714626cmnet201302122009459491970
ERROR: There is a hole in the region chain between +8618725888080cmnet201212271719506311400 and +8618725889786cmnet201302131646431671140. You need to create a new .regioninfo and region dir in hdfs to plug the hole.

8. delete '.META.','regionname','info:serverstartcode'

delete '.META.','regionname','info:regionserver'

delete '.META.','regionname','info:regioninfo'

9. flush '.META.'

major_compact '.META.'
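With tens of thousands of regions, the hbck output in items 3 and 7 is easier to act on once it is reduced to a list of holes and overlaps. Here is a minimal sketch that extracts them with regular expressions. The patterns are written against the message wording shown in this article and may need adjusting for other hbck versions. `summarize` is a hypothetical helper.

```python
import re

# Patterns for the hbck messages quoted above.
HOLE_RE = re.compile(
    r"There is a hole in the region chain between (.*?) and (.*?)\.\s+You need to create"
)
OVERLAP_RE = re.compile(
    r"\(regions (.+?) and (.+?)\) There is an overlap in the region chain"
)
COUNT_RE = re.compile(r"^(\d+) inconsistencies detected", re.MULTILINE)


def summarize(output):
    """Collect holes, overlaps, and the inconsistency count from hbck output.

    holes: list of (key before the hole, key after the hole).
    overlaps: list of (first region name, second region name).
    inconsistencies: the reported total, or None if the line is absent.
    """
    count = COUNT_RE.search(output)
    return {
        "holes": HOLE_RE.findall(output),
        "overlaps": OVERLAP_RE.findall(output),
        "inconsistencies": int(count.group(1)) if count else None,
    }
```

Feeding it the captured output of `hbase hbck` gives the key ranges to plug and the region pairs to merge or remove from .META., which can then be cross-checked against the results of a tool like regionTest.jar.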
