Starting with this article, I will document the cluster's data center migration.
Earlier, because the data center was running out of space, I had already started planning to move the cluster to a new one. Recently the move finally got under way. Here I will record the main problems and conflicts encountered during this zero-downtime migration, along with the solutions to each.
The cluster is not large: a few hundred machines, about 30 PB of total capacity. Hadoop is CDH 5.5.1 plus some custom patches, built and packaged as RPMs.
The overall plan: keep the cluster running with no downtime, set up a dedicated line between the two data centers, decommission machines from the old data center in batches and move them to the new one, and take only a limited number of machines offline each day so that compute capacity is preserved.
90 machines were racked in the new data center ahead of time to test the bandwidth. The test method was simple and crude: take dozens of the new machines, build a standalone cluster out of them, then run distcp between the old cluster and the new cluster; the dedicated line could be saturated.
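A hedged sketch of that kind of distcp saturation test; the NameNode addresses and paths here are made up, while -m (number of copy maps) and -bandwidth (MB/s per map) are standard DistCp options:

# Hypothetical bandwidth test between the two clusters over the dedicated line.
hadoop distcp \
  -m 200 \
  -bandwidth 100 \
  hdfs://old-nn.example:8020/benchmark/testset \
  hdfs://new-nn.example:8020/benchmark/testset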
The small test cluster in the new data center was then dismantled and its machines merged into the big cluster. Rack awareness was laid out in the form "/room/rack", and the balancer was run for a period of time in advance (see the sketch below).
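For reference, a minimal sketch of such a two-level rack mapping; the script path, mapping file, and its format are illustrative assumptions, while net.topology.script.file.name is the standard Hadoop property that wires in such a script (it is invoked with host/IP arguments and must print one location per input):

#!/bin/bash
# Hypothetical topology script configured via net.topology.script.file.name.
# Prints a "/room/rack" location for every host/IP argument.
MAP=/etc/hadoop/conf/topology.map   # lines like: 10.0.1.23 /newroom/rack07
for host in "$@"; do
  loc=$(awk -v h="$host" '$1 == h { print $2; exit }' "$MAP")
  echo "${loc:-/default/default-rack}"
done

The balancer pre-run itself is just the stock tool, e.g. hdfs balancer -threshold 10.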
Of course, some problems turned up during this. I recorded them in an earlier blog post; here is a recap.
10GbE NIC MTU problem: DataNodes in the new data center reported slow-block-receiver warnings; raising the NIC MTU from 1500 to 9000 solved it (commands sketched after this list).
df command hanging: solved by upgrading systemd and rebooting.
Machines running slowly: solved by applying the CentOS 7 CPU patch and rebooting.
Broadcast storm: traffic was fairly heavy, and checking the network ports showed each machine receiving hundreds of ARP announcements per second (a quick way to measure this is sketched below). There was no immediate fix; ops later resolved it with VLAN segmentation.
For details, see Operation and Maintenance Record Series XXIII: https://blog.51cto.com/slaytanic/2141665
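For the df, MTU, and ARP items above, the checks and changes were roughly of the following shape; eth0 and the exact commands are an illustrative sketch for CentOS 7, while MTU 9000 is the value from the text:

# Raise the MTU for jumbo frames on the 10GbE interface (runtime change).
ip link set dev eth0 mtu 9000
echo 'MTU=9000' >> /etc/sysconfig/network-scripts/ifcfg-eth0   # persist on CentOS 7
ip link show eth0 | grep -o 'mtu [0-9]*'                       # verify

# Upgrade systemd for the df hang, then reboot.
yum update -y systemd && reboot

# Rough ARP-rate check: count ARP packets seen in 10 seconds, divide by 10.
timeout 10 tcpdump -nn -i eth0 arp 2>/dev/null | wc -l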
During this period I also ran into slow re-replication of under-replicated blocks. I raised the DataNode parameter dfs.datanode.max.transfer.threads to 16384, then raised the NameNode parameter dfs.namenode.replication.max-streams-hard-limit to 8.
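In hdfs-site.xml that tuning looks roughly like this (the property names are standard HDFS ones; the values are the ones just mentioned):

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>16384</value>
</property>
<property>
  <name>dfs.namenode.replication.max-streams-hard-limit</name>
  <value>8</value>
</property>

It still did not feel much faster, so I went to the source code, to the file src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java: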
public synchronized List<List<Block>> chooseUnderReplicatedBlocks(
    int blocksToProcess) {
  // initialize data structure for the return value
  List<List<Block>> blocksToReplicate = new ArrayList<List<Block>>(LEVEL);
  for (int i = 0; i < LEVEL; i++) {
    blocksToReplicate.add(new ArrayList<Block>());
  }

  if (size() == 0) { // There are no blocks to collect.
    return blocksToReplicate;
  }

  int blockCount = 0;
  for (int priority = 0; priority < LEVEL; priority++) {
    // Go through all blocks that need replications with current priority.
    BlockIterator neededReplicationsIterator = iterator(priority);
    Integer replIndex = priorityToReplIdx.get(priority);

    // skip to the first unprocessed block, which is at replIndex
    for (int i = 0; i < replIndex && neededReplicationsIterator.hasNext(); i++) {
      neededReplicationsIterator.next();
    }

    blocksToProcess = Math.min(blocksToProcess, size());

    if (blockCount == blocksToProcess) {
      break; // break if already expected blocks are obtained
    }

    // Loop through all remaining blocks in the list.
    while (blockCount < blocksToProcess
        && neededReplicationsIterator.hasNext()) {
      Block block = neededReplicationsIterator.next();
      blocksToReplicate.get(priority).add(block);
      replIndex++;
      blockCount++;
    }

    if (!neededReplicationsIterator.hasNext()
        && neededReplicationsIterator.getPriority() == LEVEL - 1) {
      // reset all priorities replication index to 0 because there is no
      // recently added blocks in any list.
      for (int i = 0; i < LEVEL; i++) {
        priorityToReplIdx.put(i, 0);
      }
      break;
    }
    priorityToReplIdx.put(priority, replIndex);
  }
  return blocksToReplicate;
}
This is where each scan picks up the saved index of the first unprocessed block per priority. To make the NameNode always scan each queue from the head, we just need to change Integer replIndex = priorityToReplIdx.get(priority); to Integer replIndex = 0;.
But recompiling the whole thing just to change one line of code wasn't worth it. So a colleague found a tool called Byteman, a JBoss-made utility that modifies a running JVM; it can be understood as a memory editor for Java, a sort of Jinshan Ranger (the classic Chinese game-memory editor, akin to Cheat Engine) for the JVM.
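To give a flavor of it, here is an untested, hypothetical sketch of a Byteman rule in this direction; it assumes the HDFS classes carry local-variable debug info (compiled with -g), and it tries to zero replIndex right after the assignment that loads it from priorityToReplIdx. It is not the actual rule my colleague used; that is in the link below:

# Hypothetical, untested sketch: zero replIndex right after it is first
# loaded from priorityToReplIdx (requires classes compiled with -g).
RULE scan under-replicated blocks from the head
CLASS org.apache.hadoop.hdfs.server.blockmanagement.UnderReplicatedBlocks
METHOD chooseUnderReplicatedBlocks(int)
AFTER WRITE $replIndex 1
IF TRUE
DO $replIndex = 0
ENDRULE

Rules like this are loaded into a live JVM with Byteman's bminstall.sh (install the agent into a running process) and bmsubmit.sh (submit the rule file).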
For the specific modification, please refer to my colleague's GitHub issue below; I won't repeat it here. However, if patching memory like this crashes something, I take no responsibility.
https://github.com/whitelilis/whitelilis.github.io/issues/17