Hadoop HBase

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

I. Overview:

1. Definition: HBase is an open-source implementation modeled on Google's Bigtable. It is a database system built on HDFS that provides high reliability, high performance, column-oriented storage, scalability, and real-time random reads and writes.

It sits between NoSQL stores and an RDBMS: data can be retrieved only by primary key (row key) or by a range of row keys, and only single-row transactions are supported (complex operations such as multi-table joins can be implemented with Hive). It is mainly used to store loosely structured data, both unstructured and semi-structured. Like Hadoop, HBase relies mainly on scaling out: computing and storage capacity grow by adding inexpensive commodity servers.
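The row-key-only access pattern described above can be illustrated with a toy model. This is a minimal sketch using only the Java standard library, not the HBase client: the class name RowKeySketch and its methods are hypothetical, and Java string ordering stands in for HBase's byte-wise row key ordering (the two agree for plain ASCII keys).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: a sorted in-memory map standing in for a table keyed by row key.
final class RowKeySketch {
    // Rows are kept sorted by row key, which is what makes range scans possible.
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Point lookup by row key; returns null when the row does not exist.
    String get(String rowKey) {
        return rows.get(rowKey);
    }

    // Range scan over [startRow, stopRow): start inclusive, stop exclusive.
    List<String> scan(String startRow, String stopRow) {
        SortedMap<String, String> range = rows.subMap(startRow, stopRow);
        return new ArrayList<>(range.keySet());
    }

    public static void main(String[] args) {
        RowKeySketch table = new RowKeySketch();
        table.put("rk0003", "c");
        table.put("rk0001", "a");
        table.put("rk0002", "b");
        System.out.println(table.get("rk0002"));
        System.out.println(table.scan("rk0001", "rk0003"));
    }
}
```

There is deliberately no method for looking up a row by value: in this model, as in the description above, anything other than a row key or a row key range requires walking the rows.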

2. Features:

Tables in HBase generally have the following characteristics:

(1) large: a table can have hundreds of millions of rows and millions of columns.

(2) column-oriented: storage and permission control are organized by column (family), and each column family can be retrieved independently.

(3) sparse: columns that are null take up no storage space, so tables can be designed to be very sparse.
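The sparseness property can be pictured as a map of maps in which a cell exists only once it has been written. The sketch below is an illustration under that assumption and does not reflect HBase's actual on-disk format; SparseTableSketch and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: row key -> "family:qualifier" -> value. Not HBase's storage format.
final class SparseTableSketch {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Store one cell; rows and columns come into existence only when written.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Read one cell; an absent column simply has no entry and yields null.
    String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    // Number of cells actually stored, across all rows.
    int storedCells() {
        int count = 0;
        for (Map<String, String> row : rows.values()) {
            count += row.size();
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("rk0001", "info:name", "zhangsan");
        table.put("rk0001", "data:pic", "picture");
        table.put("rk0002", "info:age", "20");
        System.out.println(table.storedCells());
        System.out.println(table.get("rk0002", "info:name"));
    }
}
```

Two rows and three distinct columns would need six slots in a fixed-schema grid; here only the three written cells occupy any space.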

II. HBase command line:

1. Enter the HBase command line:

./hbase shell

2. Display the list of tables in HBase:

list

3. Create a user table that contains two column families, info and data:

create 'user', {NAME => 'info', VERSIONS => '3'}, {NAME => 'data'}

4. Insert data into the user table:

(1) in the row with row key rk0001, add the column qualifier name to the column family info, with the value zhangsan

put 'user', 'rk0001', 'info:name', 'zhangsan'

(2) in the row with row key rk0001, add the column qualifier gender to the column family info, with the value female

put 'user', 'rk0001', 'info:gender', 'female'

(3) in the row with row key rk0001, add the column qualifier age to the column family info, with the value 20

put 'user', 'rk0001', 'info:age', 20

(4) in the row with row key rk0001, add the column qualifier pic to the column family data, with the value picture

put 'user', 'rk0001', 'data:pic', 'picture'

5. Retrieve data with get:

(1) get all the data in the user table for row key rk0001

get 'user', 'rk0001'

(2) get all the data in the info column family for row key rk0001 in the user table

get 'user', 'rk0001', 'info'

(3) get the name and age columns of the info column family for row key rk0001 in the user table

get 'user', 'rk0001', 'info:name', 'info:age'

(4) get the data in the info and data column families for row key rk0001 in the user table (the two forms are equivalent)

get 'user', 'rk0001', 'info', 'data'

get 'user', 'rk0001', {COLUMN => ['info', 'data']}

(5) get the latest 5 versions of the info:name column for row key rk0001 in the user table (the info column family was created with VERSIONS => 3, so at most 3 versions are kept)

get 'user', 'rk0001', {COLUMN => 'info:name', VERSIONS => 5}
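The interaction between the VERSIONS setting of a column family and the VERSIONS argument of get can be sketched as a cell that keeps a bounded list of timestamped values. This is a simplified model under stated assumptions: VersionedCellSketch is a hypothetical class, and it trims old versions eagerly on write, whereas real HBase handles cleanup of excess versions differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: one cell that keeps a bounded number of timestamped versions.
final class VersionedCellSketch {
    private final int maxVersions;
    // Timestamp -> value, sorted from oldest to newest.
    private final TreeMap<Long, String> versions = new TreeMap<>();

    VersionedCellSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Write a new version and discard the oldest ones beyond the limit.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry(); // first entry = smallest timestamp = oldest
        }
    }

    // Return up to 'requested' values, newest first.
    List<String> get(int requested) {
        List<String> out = new ArrayList<>();
        for (String value : versions.descendingMap().values()) {
            if (out.size() == requested) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCellSketch name = new VersionedCellSketch(3);
        for (long ts = 1; ts <= 5; ts++) {
            name.put(ts, "v" + ts);
        }
        // Asking for 5 versions still yields at most 3.
        System.out.println(name.get(5));
    }
}
```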

6. Retrieve data with scan:

(1) query all the data in the user table

scan 'user'

(2) query the rows in the user table whose row key begins with rk

scan 'user', {FILTER => "PrefixFilter('rk')"}

(3) query the info column family of the user table for row keys in the range [rk0001, rk0003)

scan 'user', {COLUMNS => 'info', STARTROW => 'rk0001', ENDROW => 'rk0003'}

(4) query the columns in the info and data column families of the user table whose column qualifier contains the character a

scan 'user', {COLUMNS => ['info', 'data'], FILTER => "(QualifierFilter(=, 'substring:a'))"}

(5) query the data in the user table within the specified timestamp range

scan 'user', {TIMERANGE => [1392368783980, 1392380169184]}
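A prefix query and a row key range are closely related: every key that starts with a given prefix falls inside one contiguous range of the sorted key space. The sketch below shows one common way to derive the exclusive stop row for a prefix, plus a half-open time range check (minimum inclusive, maximum exclusive). It uses only the Java standard library; PrefixRangeSketch and its method names are hypothetical and are not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy helpers only: range arithmetic on raw bytes, independent of the HBase client.
final class PrefixRangeSketch {
    // Returns the smallest key that sorts after every key starting with 'prefix'.
    // An empty array means "no upper bound" (the prefix was empty or all 0xFF bytes).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    // Half-open interval check: min inclusive, max exclusive.
    static boolean inTimeRange(long timestamp, long min, long max) {
        return timestamp >= min && timestamp < max;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("rk".getBytes(StandardCharsets.US_ASCII));
        // Keys starting with "rk" lie in the range ["rk", "rl").
        System.out.println(new String(stop, StandardCharsets.US_ASCII));
        System.out.println(inTimeRange(1392380169184L, 1392368783980L, 1392380169184L));
    }
}
```

Expressing a prefix as a start and stop row bounds the scan to the matching key range, which is why range-based scans are usually preferred when the prefix is known up front.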

7. Delete data:

(1) delete the data in the user table whose row key is rk0001 and whose column is info:name

delete 'user', 'rk0001', 'info:name'

(2) delete the data in the user table whose row key is rk0001, whose column is info:name, and whose timestamp is 1392383705316

delete 'user', 'rk0001', 'info:name', 1392383705316

8. Delete the table (a table must be disabled before it can be dropped):

disable 'user'

drop 'user'

III. Java API of HBase:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.BinaryPrefixComparator;
import org.apache.hadoop.hbase.filter.ByteArrayComparable;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FamilyFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.MultipleColumnPrefixFilter;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.filter.QualifierFilter;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.filter.SubstringComparator;
import org.apache.hadoop.hbase.master.TableNamespaceManager;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Before;
import org.junit.Test;

public class HbaseDemo {

    private Configuration conf = null;

    // Runs before each test: build the client configuration and point it at the ZooKeeper quorum.
    @Before
    public void init() {
        conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node1,node2,node3");
    }

    // Drop a table: it must be disabled before it can be deleted.
    @Test
    public void testDrop() throws Exception {
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.disableTable("account");
        admin.deleteTable("account");
        admin.close();
    }

    // Insert one cell: row key, column family, column qualifier, value.
    @Test
    public void testPut() throws Exception {
        HTable table = new HTable(conf, "person_info");
        Put p = new Put(Bytes.toBytes("person_rk_bj_zhang_000002"));
        p.add("base_info".getBytes(), "name".getBytes(), "zhangwuji".getBytes());
        table.put(p);
        table.close();
    }

    // Read one row, asking for up to 5 versions of each cell.
    @Test
    public void testGet() throws Exception {
        HTable table = new HTable(conf, "person_info");
        Get get = new Get(Bytes.toBytes("person_rk_bj_zhang_000001"));
        get.setMaxVersions(5);
        Result result = table.get(get);
        List<Cell> cells = result.listCells();
        // result.getValue(family, qualifier) extracts one specific value directly from the result
        // iterate over all the key-value pairs in the result
        for (KeyValue kv : result.list()) {
            String family = new String(kv.getFamily());
            System.out.println(family);
            String qualifier = new String(kv.getQualifier());
            System.out.println(qualifier);
            System.out.println(new String(kv.getValue()));
        }
        table.close();
    }

    /**
     * Use of a variety of filter conditions.
     * @throws Exception
     */
    @Test
    public void testScan() throws Exception {
        HTable table = new HTable(conf, "person_info".getBytes());
        Scan scan = new Scan(Bytes.toBytes("person_rk_bj_zhang_000001"), Bytes.toBytes("person_rk_bj_zhang_000002"));

        // prefix filter, applied to row keys
        Filter filter = new PrefixFilter(Bytes.toBytes("rk"));

        // row filter
        ByteArrayComparable rowComparator = new BinaryComparator(Bytes.toBytes("person_rk_bj_zhang_000001"));
        RowFilter rf = new RowFilter(CompareOp.LESS_OR_EQUAL, rowComparator);

        /*
         * Assume that the row key format is: creation date_release date_ID_TITLE
         * Goal: find data with a release date of 2014-12-21
         */
        rf = new RowFilter(CompareOp.EQUAL, new SubstringComparator("_2014-12-21_"));

        // single-column value filter 1: exact match on the byte array
        new SingleColumnValueFilter("base_info".getBytes(), "name".getBytes(), CompareOp.EQUAL, "zhangsan".getBytes());

        // single-column value filter 2: match against a regular expression
        ByteArrayComparable comparator = new RegexStringComparator("zhang.");
        new SingleColumnValueFilter("info".getBytes(), "NAME".getBytes(), CompareOp.EQUAL, comparator);

        // single-column value filter 3: match if the value contains the substring (case-insensitive)
        comparator = new SubstringComparator("wu");
        new SingleColumnValueFilter("info".getBytes(), "NAME".getBytes(), CompareOp.EQUAL, comparator);

        // key-value metadata filtering, by family, exact match on the byte array
        FamilyFilter ff = new FamilyFilter(
                CompareOp.EQUAL,
                // the whole family name must match; a partial name such as "inf" would match no family and give an empty result
                new BinaryComparator(Bytes.toBytes("base_info"))
        );

        // key-value metadata filtering, by family, prefix match on the byte array
        ff = new FamilyFilter(
                CompareOp.EQUAL,
                // matches any column family whose name starts with inf (such as info); the result is all rows of that family
                new BinaryPrefixComparator(Bytes.toBytes("inf"))
        );

        // key-value metadata filtering, by qualifier, exact match on the byte array
        filter = new QualifierFilter(
                CompareOp.EQUAL,
                // there is no column named exactly na in the table, so the result is empty
                new BinaryComparator(Bytes.toBytes("na"))
        );

        // key-value metadata filtering, by qualifier, prefix match on the byte array
        filter = new QualifierFilter(
                CompareOp.EQUAL,
                // the column name starts with na, so the result is that column's data for all rows
                new BinaryPrefixComparator(Bytes.toBytes("na"))
        );

        // ColumnPrefixFilter filters data by a column name (qualifier) prefix
        filter = new ColumnPrefixFilter("na".getBytes());

        // MultipleColumnPrefixFilter filters data by several column name (qualifier) prefixes
        byte[][] prefixes = new byte[][] {Bytes.toBytes("na"), Bytes.toBytes("me")};
        filter = new MultipleColumnPrefixFilter(prefixes);

        // set the filter for the query; only the last filter assigned above takes effect
        scan.setFilter(filter);
        scan.addFamily(Bytes.toBytes("base_info"));
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
            /*
            for (KeyValue kv : r.list()) {
                String family = new String(kv.getFamily());
                System.out.println(family);
                String qualifier = new String(kv.getQualifier());
                System.out.println(qualifier);
                System.out.println(new String(kv.getValue()));
            }
            */
            // get a specific value directly from the result
            byte[] value = r.getValue(Bytes.toBytes("base_info"), Bytes.toBytes("name"));
            System.out.println(new String(value));
        }
        scanner.close();
        table.close();
    }

    // Delete the latest version of the data:pic column in row rk0001.
    @Test
    public void testDel() throws Exception {
        HTable table = new HTable(conf, "user");
        Delete del = new Delete(Bytes.toBytes("rk0001"));
        del.deleteColumn(Bytes.toBytes("data"), Bytes.toBytes("pic"));
        table.delete(del);
        table.close();
    }

    // Create the person_info table with one column family, base_info, keeping up to 10 versions.
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // conf.set("hbase.zookeeper.quorum", "weekend05:2181,weekend06:2181,weekend07:2181");
        HBaseAdmin admin = new HBaseAdmin(conf);
        TableName tableName = TableName.valueOf("person_info");
        HTableDescriptor td = new HTableDescriptor(tableName);
        HColumnDescriptor cd = new HColumnDescriptor("base_info");
        cd.setMaxVersions(10);
        td.addFamily(cd);
        admin.createTable(td);
        admin.close();
    }
}
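The comparators used in testScan differ mainly in how much of the candidate value must match. The sketch below restates those four matching styles with the Java standard library only, so it can be run without a cluster. ComparatorSketch and its methods are hypothetical names, and two behaviors are assumptions about the HBase comparators rather than something stated in the code above: that the regular expression may match anywhere in the value, and that substring matching ignores case.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

// Toy equivalents of the matching styles; none of this calls the HBase API.
final class ComparatorSketch {
    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Exact match: every byte must be equal (BinaryComparator with EQUAL).
    static boolean binaryEquals(String value, String expected) {
        return Arrays.equals(bytes(value), bytes(expected));
    }

    // Prefix match: only the first prefix.length bytes are compared (BinaryPrefixComparator with EQUAL).
    static boolean binaryPrefix(String value, String prefix) {
        byte[] v = bytes(value);
        byte[] p = bytes(prefix);
        return v.length >= p.length && Arrays.equals(Arrays.copyOf(v, p.length), p);
    }

    // Substring match, ignoring case (assumed behavior of SubstringComparator).
    static boolean substring(String value, String needle) {
        return value.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    // Regular expression found anywhere in the value (assumed behavior of RegexStringComparator).
    static boolean regex(String value, String pattern) {
        return Pattern.compile(pattern).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(binaryEquals("base_info", "inf"));
        System.out.println(binaryPrefix("info", "inf"));
        System.out.println(substring("ZhangWuJi", "wu"));
        System.out.println(regex("zhangsan", "zhang."));
    }
}
```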
