This article gives a detailed walkthrough of the problems I hit migrating to Hadoop 0.20.2. I found the process practical, so I am sharing it here for reference; I hope you get something out of it.
1. The problem
It has been about three months since Hadoop 0.20.2 was released, but I had been using the distribution provided by http://www.cloudera.com/, which is based on Hadoop 0.18.3, because it is a relatively stable version. Recently, however, while using Hypertable 0.9.2.7, my local JNI calls kept failing with a "Hyperspace COMM already connected" error. Digging into the cause, I found that the Hyperspace COMM connection was already occupied, so the new connection failed. Searching online, I saw the author acknowledge the problem and say it would not be very hard to fix. Looking at the source, the error is thrown from the socket connection, and changing it would mean modifying the Hyperspace module. Since the bottom layer of Hyperspace uses Oracle's Berkeley DB, which I am not very familiar with, I did not patch it; instead I upgraded directly to 0.9.3.1 to see whether the problem had been solved there.
To my disappointment, it still has not been solved, but 0.9.3.1 does change a lot on the Thrift side. TableSplit for Hypertable tables has been added to its Thrift server, which is exactly what I wanted, and it also sidesteps the earlier Hyperspace problem: the Thrift server creates only a single HypertableClient, so the "COMM already connected" error cannot occur. Its Cell class has also changed considerably, and it now builds against Hadoop 0.20.2. So there was nothing for it but to upgrade both together: Hadoop 0.18.3 -> Hadoop 0.20.2 and Hypertable 0.9.2.7 -> Hypertable 0.9.3.1. The original TableInputFormat and TableOutputFormat need to be modified as well, which led to the impressions below.
2. Some changes in Hadoop 0.20.2
The new version changes a great deal in both directory structure and API, whether going from 0.18 to 0.19 or from 0.19 to 0.20. The modularity feels increasingly strong and clear.
2.1 Changes in the directory structure
There are three main directories: core, hdfs, and mapred.
◆ The originally shared functionality has been pulled out into core, including conf, fs, io, ipc, net, record, and so on. Unix-style directory permissions have also been added. (See the configuration sketch after this list.)
◆ hdfs has been moved into its own directory, and its configuration has been extracted into hdfs-default.xml. The hdfs tree is divided into protocol, which provides the client-side communication protocols, plus server and tools directories; server is further divided into balancer, common, datanode, namenode, and protocol, where this protocol directory provides the communication protocols between DataNode and NameNode, as well as between DataNodes.
◆ mapred has likewise been isolated, with its configuration extracted into mapred-default.xml. It has two subdirectories. One is mapred, which holds some core MapReduce classes along with deprecated classes kept for backward compatibility; these interfaces and classes are generally not recommended. The other is mapreduce, which holds the external abstract classes and interfaces you extend for your own needs; inside it is a lib directory providing some common input, output, map, and reduce implementations supplied by the framework.
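As a rough sketch of what the core split means in practice: a client that only touches the shared classes needs nothing beyond packages like org.apache.hadoop.conf and org.apache.hadoop.fs, and the split-out default files can be pulled in explicitly. The class below and the explicit addResource call are purely for illustration; the resource names are the stock 0.20 ones.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class CoreOnlyDemo {
        public static void main(String[] args) throws Exception {
            // Configuration itself now loads only core-default.xml /
            // core-site.xml; HDFS and MapReduce settings live in their own files.
            Configuration conf = new Configuration();
            conf.addResource("hdfs-default.xml"); // the split-out HDFS defaults
            // FileSystem, io, ipc, etc. are all "core" classes.
            FileSystem fs = FileSystem.get(conf);
            System.out.println("default filesystem: " + fs.getUri());
        }
    }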
2.2 Changes in the API
The API changes in Hadoop 0.20.2 are also large: the main change is turning some interfaces into extensible base classes, a refactoring done to improve extensibility. The change is still quite big, so here is an example to illustrate it.
2.2.1 A Hadoop example
This is the WordCount example in Hadoop; from it you can see the changes to the Mapper and Reducer types, as well as the changes around JobClient for job submission.
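The original listing did not survive here, so below is a sketch along the lines of the stock 0.20 WordCount: Mapper and Reducer are now base classes under org.apache.hadoop.mapreduce that you extend rather than interfaces you implement, and the new Job class takes over submission from JobClient/JobConf.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // New-style Mapper: extend the base class, override map().
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one); // emit (word, 1)
                }
            }
        }

        // New-style Reducer: values arrive as an Iterable, not an Iterator.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values,
                    Context context) throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "word count"); // Job replaces JobClient/JobConf
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }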
3. Some changes in Hypertable 0.9.3.1
The main impression is that its Thrift Java client has changed a great deal. To support MapReduce, a lot has been folded into the Thrift server: a MapReduce connector, Hyperspace replication, DUMP TABLE, and so on. InputFormat and OutputFormat are now conveniently available through its Thrift client, along with TableSplit, which can be used to read the Key and Value pairs of a Hypertable table. However, it does not make use of the range_location carried by the TableSplit; it simply uses "localhost" to connect to the host. I do not know why.
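To see why the hard-coded "localhost" matters: in Hadoop's MapReduce API, the scheduler asks each InputSplit for its preferred hosts via getLocations(). Below is a hypothetical split (RangeTableSplit and its fields are illustrative names, not Hypertable's actual code) that reports the real range location instead, so map tasks can be scheduled near the data; real splits would also need to be serializable (e.g. Writable), which is omitted here.

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.InputSplit;

    public class RangeTableSplit extends InputSplit {
        private final String startRow;
        private final String endRow;
        private final String rangeLocation; // host serving this table range

        public RangeTableSplit(String startRow, String endRow,
                String rangeLocation) {
            this.startRow = startRow;
            this.endRow = endRow;
            this.rangeLocation = rangeLocation;
        }

        @Override
        public long getLength() throws IOException, InterruptedException {
            return 0; // size unknown; only used as a scheduling hint
        }

        @Override
        public String[] getLocations() throws IOException, InterruptedException {
            // Hypertable 0.9.3.1 effectively reports "localhost" here, which
            // throws away data locality; returning the range server's host
            // lets the scheduler place map tasks near the data.
            return new String[] { rangeLocation };
        }
    }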
Also, it seems that if you want to use KFS with Hypertable, you have to compile the source code together with the KFS dynamic library.
This concludes the notes on handling the migration to Hadoop 0.20.2. I hope the content above is helpful to you; if you found the article worthwhile, please share it with others.