1. Today I ran into a problem: Alluxio could not be accessed. Checking the master log turned up the following error.
2018-05-14 03:35:58,680 ERROR logger.type (HdfsUnderFileSystem.java:open) - 4 try to open hdfs://sandy-bridge/user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001 : Cannot obtain block length for LocatedBlock{BP-1941630157-10.16.13.73-1486732586674:blk_1322900685_252817168; getBlockSize()=254; corrupt=false; offset=0; locs=[10.16.13.189:1019, 10.16.13.84:1019, 10.16.13.128:1019]; storageIDs=[DS-30126b4d-afdf-449a-8de1-e479c1abf33d, DS-ed2e905e-fa43-4f51-801f-3305da180d2a, DS-0e1946c8-dccb-4143-8d74-c11d8d429d02]; storageTypes=[DISK, DISK, DISK]}
java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1941630157-10.16.13.73-1486732586674:blk_1322900685_252817168; getBlockSize()=254; corrupt=false; offset=0; locs=[10.16.13.189:1019, 10.16.13.84:1019, 10.16.13.128:1019]; storageIDs=[DS-30126b4d-afdf-449a-8de1-e479c1abf33d, DS-ed2e905e-fa43-4f51-801f-3305da180d2a, DS-0e1946c8-dccb-4143-8d74-c11d8d429d02]; storageTypes=[DISK, DISK, DISK]}
        at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:400)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:305)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:242)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:235)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1487)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
        at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
        at alluxio.underfs.hdfs.HdfsUnderFileSystem.open(HdfsUnderFileSystem.java:387)
        at alluxio.underfs.BaseUnderFileSystem.open(BaseUnderFileSystem.java:124)
        at alluxio.master.journal.JournalReader.getNextInputStream(JournalReader.java:114)
        at alluxio.master.journal.JournalTailer.processNextJournalLogFiles(JournalTailer.java:118)
        at alluxio.master.AbstractMaster.start(AbstractMaster.java:140)
        at alluxio.master.file.FileSystemMaster.start(FileSystemMaster.java:419)
        at alluxio.master.DefaultAlluxioMaster.startMasters(DefaultAlluxioMaster.java:263)
        at alluxio.master.FaultTolerantAlluxioMaster.start(FaultTolerantAlluxioMaster.java:91)
        at alluxio.ServerUtils.run(ServerUtils.java:38)
2. My first suspicion was that log.00000000000000000001 was corrupted. However, hdfs fsck found no corruption; what stood out was "Total size: 0 B", which did not look right.
[hdfs@hdfs-namenode hdfs]$ hdfs fsck /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001
Connecting to namenode via http://hdfs-namenode.eu-central-1.compute.internal:50070/fsck?ugi=hdfs&path=%2Fuser%2Falluxio%2Fjournal%2FFileSystemMaster%2Fcompleted%2Flog.00000000000000000001
FSCK started by hdfs (auth:KERBEROS_SSL) from /10.16.13.73 for path /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001 at Mon May 14 03:53:11 UTC 2018
Status: HEALTHY
 Total size:    0 B (Total open files size: 254 B)
 Total dirs:    0
 Total files:   0
 Total symlinks:        0 (Files currently being written: 1)
 Total blocks (validated):      0 (Total open file blocks (not validated): 1)
 Minimally replicated blocks:   0
 Over-replicated blocks:        0
 Under-replicated blocks:       0
 Mis-replicated blocks:         0
 Default replication factor:    3
 Average block replication:     0.0
 Corrupt blocks:                0
 Missing replicas:              0
 Number of data-nodes:          41
 Number of racks:               1
FSCK ended at Mon May 14 03:53:11 UTC 2018 in 1 milliseconds

The filesystem under path '/user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001' is HEALTHY
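The fsck summary above already hints at the real cause: "Total open files size: 254 B" and "Files currently being written: 1" mean the NameNode still considers this journal file open for write, i.e. its last block is under construction, which is exactly the condition that produces "Cannot obtain block length for LocatedBlock". As a minimal follow-up check (a sketch assuming the same path and an HDFS client with access), fsck can be asked to include open files in its report:

# hypothetical follow-up check; -openforwrite makes fsck report files that are still being written
hdfs fsck /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001 -files -blocks -locations -openforwrite

If the file is listed with the OPENFORWRITE status, the problem is an unreleased write lease rather than block corruption.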
3. I moved the problem file out of the way (renamed it to .bak) and then started the Alluxio HA master; it came up successfully.
[hdfs@hdfs-namenode hdfs]$ hdfs dfs -mv /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001 /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak
[hdfs@hdfs-namenode hdfs]$ hdfs dfs -ls /user/alluxio/journal/FileSystemMaster/completed/
Found 2 items
-rw-r--r--   3 alluxio alluxio        254 2018-01-29 09:32 /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak
-rw-r--r--   3 alluxio alluxio        397 2018-05-14 03:03 /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000002
4. I then tried moving the file back, but Alluxio again failed to start with the same error as in step 1.
5. Reading the file directly with cat also fails; it cannot be accessed at all.
hdfs dfs -cat /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak
cat: Cannot obtain block length for LocatedBlock{BP-1941630157-10.16.13.73-1486732586674:blk_1322900685_252817168; getBlockSize()=254; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.16.13.189:1019,...,DISK], DatanodeInfoWithStorage[10.16.13.84:1019,...,DISK], DatanodeInfoWithStorage[10.16.13.128:1019,...,DISK]]}
6. For a normal file, the output is as follows:
[hdfs@hdfs-namenode hdfs]$ hdfs dfs -cat /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000002
NOT_PERSISTED(0... datadownload ..._20180510130731077.zip ... NOT_PERSISTED(0... datadownload .../Perrier_%3F%3F_20180101_20180104_20180510130731077.zip ...
(the file is a binary Alluxio journal log; only fragments such as the persistence state and file paths are human-readable)
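Because the journal log is binary, piping it through strings is a quick, if crude, way to see which paths its entries reference, and therefore which metadata operations are skipped when a log file is set aside. A rough sketch, assuming the strings utility is available on the client host:

# hypothetical inspection of a readable journal log; only extracts printable fragments such as file paths
hdfs dfs -cat /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000002 | strings | sort -u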
7. The Alluxio master now starts successfully, but the metadata recorded in the discarded journal log is lost.
If time permits, this problem deserves further investigation to see whether the data can be recovered.
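One direction worth trying (not verified here, and it assumes the diagnosis above is right, i.e. the file is stuck open for write rather than corrupt): ask the NameNode to recover the lease so the last block gets a finalized length, which should make the file readable again before moving it back into place and restarting the master. A minimal sketch with standard HDFS tooling:

# hypothetical recovery attempt: force lease recovery so the last block is finalized
hdfs debug recoverLease -path /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak -retries 3

# then re-check the file before restoring it to its original name
hdfs fsck /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak -files -blocks
hdfs dfs -cat /user/alluxio/journal/FileSystemMaster/completed/log.00000000000000000001.bak | strings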