

HBase and MapReduce Analysis with Examples


This article mainly introduces "HBase and MapReduce Analysis with Examples". In daily operation, many people have doubts about how HBase and MapReduce work together, so the editor has sorted the topic into a simple, easy-to-follow walkthrough. I hope it helps to answer those doubts. Please follow along and study!

A directory in HDFS contains several text files. We build an inverted index over their contents with MapReduce and write the results to a table in HBase. The code is as follows:
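At a high level the example chains two jobs. A sketch of the data flow, using the class names from the code below:

HDFS input directory
  -> Job 1: InvertedIndexMapper -> InvertedIndexCombiner -> InvertedIndexReducer -> part-r-00000 on HDFS
  -> Job 2: KeyValueTextInputFormat (identity map) -> InvertedIndexHBaseReducer -> HBase table 'invertedindex'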

1. InvertedIndexMapper

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class InvertedIndexMapper extends Mapper<Object, Text, Text, Text> {

    private Text keyInfo = new Text();   // stores the "word:URI" combination
    private Text valueInfo = new Text(); // stores the word frequency
    private FileSplit split;             // stores the split this record belongs to

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        System.out.println("key-->: " + key + "\n value-->: " + value);
        // Get the FileSplit this record belongs to, so we know the source file.
        split = (FileSplit) context.getInputSplit();
        System.out.println("split--->" + split.toString());
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            // The key consists of the word and the URI of its source file.
            keyInfo.set(itr.nextToken() + ":" + split.getPath().toString());
            // The initial word frequency is 1.
            valueInfo.set("1");
            context.write(keyInfo, valueInfo);
        }
    }
}
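To make the mapper concrete: for the line "Hello I will Learning Hadoop" in invertedindex1.txt (paths shortened to .../invertedindex1.txt for readability), map would emit:

("Hello:hdfs://.../invertedindex1.txt", "1")
("I:hdfs://.../invertedindex1.txt", "1")
("will:hdfs://.../invertedindex1.txt", "1")
("Learning:hdfs://.../invertedindex1.txt", "1")
("Hadoop:hdfs://.../invertedindex1.txt", "1")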

2. InvertedIndexCombiner

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class InvertedIndexCombiner extends Reducer<Text, Text, Text, Text> {

    private Text info = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Sum the word frequency for this "word:URI" key.
        int sum = 0;
        for (Text value : values) {
            sum += Integer.parseInt(value.toString());
        }
        int splitIndex = key.toString().indexOf(":");
        // Reset the value to "URI:frequency".
        info.set(key.toString().substring(splitIndex + 1) + ":" + sum);
        // Reset the key to the word alone.
        key.set(key.toString().substring(0, splitIndex));
        context.write(key, info);
    }
}
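Assuming each small input file occupies a single split (so one map task per file), the combiner effectively produces per-file counts. Continuing the example, the word "I" appears twice in invertedindex1.txt, so the combiner would turn

("I:hdfs://.../invertedindex1.txt", ["1", "1"])

into

("I", "hdfs://.../invertedindex1.txt:2")

One caveat worth knowing: this design depends on the combiner actually running and on it rewriting the key from "word:URI" to the bare word, which goes beyond the usual combiner contract (combiners are normally an optional optimization that Hadoop may run zero or more times), so treat this as a teaching example rather than a production pattern.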

3. InvertedIndexReducer

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class InvertedIndexReducer extends Reducer<Text, Text, Text, Text> {

    private Text result = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Concatenate the "URI:frequency" entries into one document list.
        StringBuilder fileList = new StringBuilder();
        for (Text value : values) {
            fileList.append(value.toString()).append(";");
        }
        result.set(fileList.toString());
        context.write(key, result);
    }
}
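The reducer joins all "URI:frequency" values for a word with ";". With the default TextOutputFormat, each line of part-r-00000 is the key, a tab, then the value; for example (paths shortened, <TAB> standing for the tab character):

I<TAB>hdfs://.../invertedindex1.txt:2;hdfs://.../invertedindex2.txt:1;

This tab-separated layout is what lets the second job read the file back with KeyValueTextInputFormat.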

4. HBaseAndInvertedIndex

import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HBaseAndInvertedIndex {

    private static Path outPath;

    public static void main(String[] args) throws Exception {
        run();
        System.out.println("\n\nthanks thanks *");
        runHBase();
    }

    // Job 1: build the inverted index on HDFS.
    public static void run() throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Hadoop-InvertedIndex");
        job.setJarByClass(HBaseAndInvertedIndex.class);
        // The map function generates intermediate results from the input pairs.
        job.setMapperClass(InvertedIndexMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setCombinerClass(InvertedIndexCombiner.class);
        job.setReducerClass(InvertedIndexReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path("hdfs://192.168.226.129:9000/txt/invertedindex/"));
        // Use a timestamp so every run writes to a fresh output directory.
        DateFormat df = new SimpleDateFormat("yyyyMMddHHmmssS");
        String filename = df.format(new Date());
        outPath = new Path("hdfs://192.168.226.129:9000/rootdir/invertedindexhbase/result/" + filename + "/");
        FileOutputFormat.setOutputPath(job, outPath);
        job.waitForCompletion(true);
    }

    // Job 2: load the index from HDFS into the HBase table.
    public static void runHBase() throws Exception {
        Configuration conf = HBaseConfiguration.create(new Configuration());
        conf.set("hbase.zookeeper.quorum", "192.168.226.129");
        Job job = Job.getInstance(conf, "HBase-InvertedIndex");
        job.setJarByClass(HBaseAndInvertedIndex.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // Read the output of the first job as the input of this one.
        FileInputFormat.addInputPath(job, new Path(outPath.toString() + "/part-r-00000"));
        System.out.println("path--->" + outPath.toString());
        // Write the data into the HBase table through a table reducer.
        TableMapReduceUtil.initTableReducerJob("invertedindex", InvertedIndexHBaseReducer.class, job);
        // First check whether the table exists; create it if not.
        checkTable(conf);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    private static void checkTable(Configuration conf) throws Exception {
        Connection con = ConnectionFactory.createConnection(conf);
        Admin admin = con.getAdmin();
        TableName tn = TableName.valueOf("invertedindex");
        if (!admin.tableExists(tn)) {
            HTableDescriptor htd = new HTableDescriptor(tn);
            HColumnDescriptor hcd = new HColumnDescriptor("indexKey");
            htd.addFamily(hcd);
            admin.createTable(htd);
            System.out.println("Table did not exist; new table created successfully");
        }
        admin.close();
        con.close();
    }

    /*
     * 1. The map side still reads from HDFS, so it is unchanged. The reduce side
     *    must write its results to HBase, so it extends TableReducer. There is no
     *    value-out type parameter: TableReducer fixes the output value to a Put
     *    or Delete instance.
     * 2. ImmutableBytesWritable is a byte sequence usable as a key or value type.
     */
    public static class InvertedIndexHBaseReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            System.out.println("key--->" + key.toString());
            // Note how the row key is built: the word itself is the row key.
            Put put = new Put(key.toString().getBytes());
            put.addColumn(Bytes.toBytes("indexKey"), Bytes.toBytes("indexUrlWeight"),
                    values.iterator().next().getBytes());
            context.write(new ImmutableBytesWritable(key.getBytes()), put);
        }
    }
}
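In the second job, KeyValueTextInputFormat splits each input line at the first tab, so the word becomes the map output key and the document list the value; since no mapper class is set, the default identity Mapper passes them straight through to InvertedIndexHBaseReducer. Once both jobs finish, a quick spot-check from the HBase shell is a get on a word's row key (command only; the exact cell contents depend on your input):

hbase(main):001:0> get 'invertedindex', 'Hello'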

Original data files in the input directory:

invertedindex1.txt:

Hello I will Learning Hadoop
HDFS MapReduce
Other I will Learning HBase

invertedindex2.txt:

Hello HBase
MapReduce HDFS

View the result with a scan in the HBase shell:

hbase(main):002:0> scan 'invertedindex'
ROW        COLUMN+CELL
 HBase     column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;
 HDFS      column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 Hadoop    column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 Hello     column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 I         column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:2;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 Learning  column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:2;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 MapReduce column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 Other     column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:1;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
 will      column=indexKey:indexUrlWeight, timestamp=1463578091308, value=hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex1.txt:2;hdfs://192.168.226.129:9000/txt/invertedindex/invertedindex2.txt:1;
9 row(s) in 0.2240 seconds

So far, the study of "HBase and MapReduce Analysis with Examples" is over. I hope it has resolved your doubts. Pairing theory with practice is the best way to learn, so go and try it! If you want to continue learning more related knowledge, please keep following the site; the editor will keep working to bring you more practical articles!
