
Deploy user-defined Observer Coprocessor in Hbase0.98.4

2025-03-01 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

HBase has supported coprocessors since version 0.92. The goal is to let users run their own code on the region server, that is, to move the computation to where the data lives, which is the same idea behind MapReduce. HBase coprocessors fall into two categories: observers and endpoints. Put simply, an observer is analogous to a trigger in a relational database, while an endpoint is analogous to a stored procedure. There is plenty of introductory material about HBase coprocessors online; since I have only just started learning, I have drawn heavily on documents contributed by others.
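As a plain-Java analogy (not HBase code), an observer can be sketched as a hook that a tiny key-value store invokes before serving a read; the names TinyStore and ReadObserver below are invented for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ObserverAnalogy {
    interface ReadObserver {
        // may add values to result before the normal read path runs,
        // analogous to a RegionObserver's pre-get hook
        void preGet(String rowKey, List<String> result);
    }

    static class TinyStore {
        private final Map<String, String> rows = new HashMap<>();
        private final List<ReadObserver> observers = new ArrayList<>();

        void put(String key, String value) { rows.put(key, value); }
        void register(ReadObserver o) { observers.add(o); }

        List<String> get(String key) {
            List<String> result = new ArrayList<>();
            for (ReadObserver o : observers) o.preGet(key, result); // hooks run first
            // fall back to the normal read only if no hook produced a result
            if (result.isEmpty() && rows.containsKey(key)) result.add(rows.get(key));
            return result;
        }
    }

    public static void main(String[] args) {
        TinyStore store = new TinyStore();
        store.put("row1", "stored-value");
        store.register((key, result) -> {
            if (key.equals("@@@GETTIME@@@")) result.add("server-time");
        });
        System.out.println(store.get("row1"));          // normal read
        System.out.println(store.get("@@@GETTIME@@@")); // answered by the hook
    }
}
```

The real coprocessor below follows the same shape: a hook registered with the region server that can short-circuit a read for one special rowkey.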

Here is a record of deploying a custom coprocessor on a fully distributed cluster. This article covers two deployment methods: configuring it in hbase-site.xml, and configuring it through the table descriptor (alter). The former is loaded by every region of every table, while the latter is loaded only by the regions of the specified table. Along the way I will point out the error-prone spots we ran into during our own experiments.

First, let's take a look at the environment:

hadoop1.updb.com  192.168.0.101  role: master

hadoop2.updb.com  192.168.0.102  role: regionserver

hadoop3.updb.com  192.168.0.103  role: regionserver

hadoop4.updb.com  192.168.0.104  role: regionserver

hadoop5.updb.com  192.168.0.105  role: regionserver

First, write the custom coprocessor. The code below is adapted from HBase: The Definitive Guide, with only the package name changed:

/*
 * Custom observer coprocessor: triggered when a client uses the get
 * command to fetch a specific row from the table. The trigger condition
 * is that the rowkey specified by the Get equals the FIXED_ROW constant
 * "@@@GETTIME@@@". When triggered, the region server builds a KeyValue
 * whose rowkey, column family and qualifier are all "@@@GETTIME@@@"
 * and whose value is the current server-side time, and returns that
 * KeyValue to the client.
 */
package org.apache.hbase.kora.coprocessor;

import java.io.IOException;
import java.util.List;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionObserverExample extends BaseRegionObserver {
    public static final Log LOG = LogFactory.getLog(HRegion.class);
    public static final byte[] FIXED_ROW = Bytes.toBytes("@@@GETTIME@@@");

    @Override
    public void preGet(ObserverContext<RegionCoprocessorEnvironment> c,
            Get get, List<KeyValue> result) throws IOException {
        LOG.debug("Got preGet for row: " + Bytes.toStringBinary(get.getRow()));
        if (Bytes.equals(get.getRow(), FIXED_ROW)) {
            KeyValue kv = new KeyValue(get.getRow(), FIXED_ROW, FIXED_ROW,
                    Bytes.toBytes(System.currentTimeMillis()));
            LOG.debug("Had a match, adding fake kv: " + kv);
            result.add(kv);
        }
    }
}

After coding is complete, the class needs to be compiled and packaged into a jar. In Eclipse, right-click the class name and choose Export; an export wizard opens.

Select JAR file, then Next.

Specify the path to save the jar file, then Finish, which completes compiling and packaging the RegionObserverExample class. Next, upload the exported jar to the master node of the HBase cluster via ftp, in this case hadoop1.

## uploaded to hadoop1
[grid@hadoop1 ~]$ ls /var/ftp/pub/RegionObserverExample.jar
/var/ftp/pub/RegionObserverExample.jar
## since this is a fully distributed cluster, for easier management we also
## store the jar in the /jars directory at the root of HDFS
[grid@hadoop1 ~]$ hdfs dfs -put /var/ftp/pub/RegionObserverExample.jar /jars
## OK, verify that it was uploaded successfully
[grid@hadoop1 ~]$ hdfs dfs -ls /jars
Found 1 items
-rw-r--r--   4 grid supergroup       3884 2014-11-15 04:46 /jars/RegionObserverExample.jar

Next, place the exported jar in the lib directory under the HBase installation directory and modify the hbase-site.xml configuration file.

## copy the jar into the lib directory of the HBase installation; do this on all nodes
[grid@hadoop1 ~]$ cp /var/ftp/pub/RegionObserverExample.jar /opt/hbase-0.98.4-hadoop2/lib/
## then add the following property to hbase-site.xml
<property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.hbase.kora.coprocessor.RegionObserverExample</value>
</property>

After modifying the configuration file on the master, scp it to each regionserver, then restart HBase for the configuration to take effect. Let's see whether the coprocessor triggers correctly after the restart.

## use the get command to fetch the row with rowkey @@@GETTIME@@@ from the kora table
hbase(main):014:0> get 'kora', '@@@GETTIME@@@'
COLUMN                        CELL
 @@@GETTIME@@@:@@@GETTIME@@@  timestamp=9223372036854775807, value=\x00\x00\x01I\xB0@\xA0\xE0
1 row(s) in 0.0420 seconds
## convert the column value to unix time
hbase(main):015:0> Time.at(Bytes.toLong("\x00\x00\x01I\xB0\x0BZ\x0B".to_java_bytes) / 1000)
=> Sat Nov 15 04:42:54 +0800 2014
## as the test above shows, our custom coprocessor has been successfully deployed
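The value returned for @@@GETTIME@@@ is just System.currentTimeMillis() encoded as an 8-byte big-endian long (the layout Bytes.toBytes(long) produces), so it can also be decoded with the plain JDK. The byte values below are illustrative, not copied from a live run:

```java
import java.nio.ByteBuffer;
import java.util.Date;

public class DecodeGetTime {
    public static void main(String[] args) {
        // an example value in the \x00\x00\x01I... shape seen in the shell output
        byte[] value = {0x00, 0x00, 0x01, 0x49, (byte) 0xB0, 0x40, (byte) 0xA0, (byte) 0xE0};
        long millis = ByteBuffer.wrap(value).getLong(); // ByteBuffer is big-endian by default
        System.out.println(millis);
        System.out.println(new Date(millis)); // the instant in the JVM's default time zone
    }
}
```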

It is important to note that a coprocessor configured in hbase-site.xml is loaded by every region of every table by default. If you want only one table to use this observer coprocessor, use the table-descriptor method instead. It still requires copying the jar into the lib directory of the HBase installation, but instead of setting the coprocessor in hbase-site.xml, you bind it to the table with alter, as follows.

## copy the jar into the lib directory of the HBase installation; do this on all nodes
[grid@hadoop1 ~]$ cp /var/ftp/pub/RegionObserverExample.jar /opt/hbase-0.98.4-hadoop2/lib/
## comment out the hbase.coprocessor.region.classes property in hbase-site.xml
[grid@hadoop1 ~]$ tail -7 /opt/hbase-0.98.4-hadoop2/conf/hbase-site.xml

Use the alter command in the hbase shell to set the coprocessor for the kora table:

## format: [coprocessor jar file location]|class name|[priority]|[arguments]
## example:
##   hbase> alter 't1', 'coprocessor' => 'hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2'
## the jar file location can be left empty because the class is already on the classpath, as follows:
hbase(main):101:0> alter 'kora',
hbase(main):102:0*   'coprocessor' => '|org.apache.hbase.kora.coprocessor.RegionObserverExample|'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.5670 seconds
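The coprocessor attribute value is a positional, pipe-separated string. The small plain-Java snippet below (illustrative only, not HBase's actual parser) shows how the fields of such a value break out:

```java
public class CoprocessorSpec {
    public static void main(String[] args) {
        // format: '<jar path>|<class name>|<priority>|<key=value args>'
        // fields are positional; unused fields may be left empty or omitted from the right
        String spec = "|org.apache.hbase.kora.coprocessor.RegionObserverExample|";
        String[] parts = spec.split("\\|", -1); // limit -1 keeps trailing empty fields
        System.out.println("jar path : '" + parts[0] + "'"); // empty -> class is on the classpath
        System.out.println("class    : '" + parts[1] + "'");
        System.out.println("priority : '" + parts[2] + "'"); // empty -> default priority
    }
}
```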

Once set, describe the table:

hbase(main):103:0> describe 'kora'
DESCRIPTION                                                          ENABLED
 'kora', {TABLE_ATTRIBUTES => {coprocessor$1 => '|org.apache.hbase.  true
 kora.coprocessor.RegionObserverExample|'}, {NAME => 'project', DATA
 _BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE
 => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
 TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE =>
 '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0580 seconds

OK, the coprocessor has been set successfully; let's test it:

## the kora table, which has the coprocessor set
hbase(main):104:0> get 'kora', '@@@GETTIME@@@'
COLUMN                        CELL
 @@@GETTIME@@@:@@@GETTIME@@@  timestamp=9223372036854775807, value=\x00\x00\x01I\xB0\x985W
1 row(s) in 0.0360 seconds
## the testtable table, which does not have the coprocessor set
hbase(main):105:0> get 'testtable', '@@@GETTIME@@@'
COLUMN                        CELL
0 row(s) in 0.0180 seconds

It should be noted that coprocessors have two priorities, SYSTEM and USER, and SYSTEM coprocessors are loaded before USER ones. When setting a coprocessor through the table descriptor, do not fill in the priority field, otherwise it will not be triggered, as in:

'coprocessor' => '|org.apache.hbase.kora.coprocessor.RegionObserverExample|USER|'

With the priority set this way, the coprocessor is registered successfully but is never triggered during testing; I verified this personally in the environment above, even though the hbase help text marks the priority field as optional, with only the class name required. Choose whichever configuration method fits your needs.
