Installation and use of HBase

2025-07-13 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article explains how to install and use HBase. The method introduced here is simple, fast, and practical.

1 Basic introduction to HBase

HBase is a distributed database that provides real-time random reads and writes of data.

Unlike relational databases such as MySQL, Oracle, DB2, and SQL Server, HBase is a NoSQL (non-relational) database with the following characteristics:

The table model of HBase differs from that of relational databases:

An HBase table has no fixed field definitions.

Each row in an HBase table stores a set of key-value pairs.

An HBase table is divided into column families, and you specify which column family each key-value pair is inserted into.

In physical storage, HBase tables are split by column family, and the data of different column families is always stored in different files.

Each row in an HBase table has a row key, and a row key cannot be repeated within the table.

All data in HBase, including row keys, keys, and values, is of type byte[]. HBase does not maintain data types for users.

HBase has limited support for transactions: operations are atomic only at the level of a single row.

Compared with other NoSQL databases (MongoDB, Redis, Cassandra, Hazelcast), HBase stores its table data in the HDFS file system, so storage capacity can be expanded linearly, and because HDFS replicates data across nodes, stored data is highly safe and reliable.

2 Table structure of HBase

rowkey (row key) | base_info | extra_info
001 | name:zs,age:22,sex:male | hobby:read,addr:beijing
002 | name:laowang,sex:male |

The table model of HBase is very different from that of relational databases such as MySQL.

HBase's table model has the concept of rows, but not the concept of fields.

Each row stores key-value pairs, and the keys can differ from row to row.

Main points of the HBase table model

A table has a table name.

A table can be divided into multiple column families (data from different column families is stored in different files).

Each row in the table has a row key (rowkey), and a row key cannot be repeated within the table.

Each key-value pair in the table is called a cell.

HBase can store multiple historical versions of a value (the number of versions kept is configurable). By default, a read returns the latest version.

Because a table can hold a large amount of data, the whole table is split horizontally into several regions (each identified by a rowkey range), and the data of different regions is also stored in different files.
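The last point, that a region is identified by a rowkey range, can be illustrated with a small sketch. This is a toy model in plain Java, not HBase code: the class RegionLookupSketch, the region names, and the split points "300" and "600" are all assumptions made up for illustration. It only shows the idea that finding the region for a row means finding the range that contains its row key.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of mapping a row key to a region by key range (not HBase code). */
final class RegionLookupSketch {

    // Maps each region's start key to a hypothetical region name.
    // The empty start key marks the first region of the table.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    RegionLookupSketch() {
        regionsByStartKey.put("", "region-a");     // covers ["", "300")
        regionsByStartKey.put("300", "region-b");  // covers ["300", "600")
        regionsByStartKey.put("600", "region-c");  // covers ["600", end of table)
    }

    /** Returns the region whose start key is the greatest one not above rowKey. */
    String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        for (String rowKey : new String[] {"001", "300", "4521", "999"}) {
            System.out.println(rowKey + " -> " + table.locate(rowKey));
        }
    }
}
```

The sketch compares row keys as Java strings, which matches byte-wise comparison only for plain ASCII keys such as the ones used here.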

HBase stores inserted data in sorted order:

Rows are sorted by row key first.

Key-value pairs in the same row are sorted by column family and then by key.
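The ordering rule above can be sketched in a few lines of plain Java. This is an illustrative model only: CellOrderSketch and CellKey are hypothetical names, not HBase types, and strings stand in for the byte arrays HBase actually compares. One consequence worth seeing is that the order is dictionary order, so the row key "10" sorts before "9", which is one reason to use zero-padded keys such as 001 and 002.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model of the cell ordering: row key, then column family, then key. */
final class CellOrderSketch {

    /** Coordinates of one cell; illustrative only, not an HBase type. */
    static final class CellKey {
        final String row;
        final String family;
        final String qualifier;

        CellKey(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }

        @Override
        public String toString() {
            return row + " " + family + ":" + qualifier;
        }
    }

    // Dictionary order on each part in turn. For ASCII strings,
    // String.compareTo agrees with byte-wise comparison.
    static final Comparator<CellKey> ORDER = Comparator
            .comparing((CellKey k) -> k.row)
            .thenComparing(k -> k.family)
            .thenComparing(k -> k.qualifier);

    /** Returns the cells in sorted storage order, regardless of insert order. */
    static List<String> storedOrder(List<CellKey> inserted) {
        List<CellKey> sorted = new ArrayList<>(inserted);
        sorted.sort(ORDER);
        List<String> out = new ArrayList<>();
        for (CellKey k : sorted) {
            out.add(k.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<CellKey> inserted = new ArrayList<>();
        inserted.add(new CellKey("9", "base_info", "name"));
        inserted.add(new CellKey("10", "extra_info", "addr"));
        inserted.add(new CellKey("10", "base_info", "sex"));
        inserted.add(new CellKey("10", "base_info", "age"));
        storedOrder(inserted).forEach(System.out::println);
    }
}
```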

Table data types of HBase:

HBase supports only byte[]. This applies to the rowkey, key, value, column family name, and table name.
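Because everything is a byte[], the application decides how values are encoded and must decode them the same way. The following is a minimal sketch using only the Java standard library. The class and method names are hypothetical, and it does not use the HBase client's own helper class. It shows that the number 18 stored as an int and the text "18" are different byte arrays.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Shows that the meaning of a byte[] depends on how the application encoded it. */
final class ByteEncodingSketch {

    /** Encodes an int as 4 big-endian bytes. */
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    /** Decodes 4 big-endian bytes back into an int. */
    static int decodeInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    /** Encodes text as UTF-8 bytes. */
    static byte[] encodeString(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back into text. */
    static String decodeString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ageAsInt = encodeInt(18);
        byte[] ageAsText = encodeString("18");
        System.out.println(Arrays.toString(ageAsInt));   // [0, 0, 0, 18]
        System.out.println(Arrays.toString(ageAsText));  // [49, 56]
        System.out.println(decodeInt(ageAsInt));         // 18
        System.out.println(decodeString(ageAsText));     // 18
    }
}
```

A reader that decodes with the wrong method gets a wrong or meaningless value, so the encoding of each column has to be agreed on by every program that uses the table.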

3 Working mechanism of HBase

An HBase distributed system consists of two roles:

Management role: HMaster (usually 2 instances: 1 active, 1 standby)

Data node role: HRegionServer (multiple instances, placed on the same machines as the HDFS DataNodes)

HBase itself does not need YARN. YARN is responsible for MapReduce computation, and is needed only if you run data processing jobs; HBase is only responsible for data management.

4 HBase installation

4.1 Installation preparation

First, you need an HDFS cluster that is running properly; the HBase RegionServers should be placed on the same machines as the HDFS DataNodes. Second, you need a ZooKeeper cluster that is running normally, so install ZooKeeper before you install HBase. Then install HBase.

4.2 Node arrangement

The roles of each node are assigned as follows:

Node | Services installed
Master | namenode datanode regionserver hmaster zookeeper
Slave01 | datanode regionserver zookeeper
Slave02 | datanode regionserver zookeeper

4.3 Installing HBase

Extract the hbase installation package hbase-2.0.5-bin.tar.gz

Modify hbase-env.sh

export JAVA_HOME=/usr/local/bigdata/java/jdk1.8.0_211
# Do not start the ZooKeeper that comes with HBase; we have installed our own
export HBASE_MANAGES_ZK=false

Modify hbase-site.xml

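<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://Master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>Master:2181,Slave01:2181,Slave02:2181</value>
  </property>
</configuration>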

Modify regionservers

Master
Slave01
Slave02

When the modifications are complete, copy the installation folder to the /usr/local/bigdata/ directory on all three nodes.

6 Start the HBase cluster

First check that HDFS and ZooKeeper have started properly. On Master:

hadoop@Master:~$ jps
4918 DataNode
2744 QuorumPeerMain
4748 NameNode
9949 Jps
5167 SecondaryNameNode
hadoop@Master:~$ /usr/local/bigdata/zookeeper-3.4.6/bin/zkServer.sh status
JMX enabled by default
Using config: /usr/local/bigdata/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

Slave01:

hadoop@Slave1:~$ jps
3235 QuorumPeerMain
3779 DataNode
5546 Jps
hadoop@Slave1:~$ /usr/local/bigdata/zookeeper-3.4.6/bin/zkServer.sh status
JMX enabled by default
Using config: /usr/local/bigdata/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader

Slave02:

hadoop@Slave2:~$ jps
11958 DataNode
13656 Jps
11390 QuorumPeerMain
hadoop@Slave2:~$ /usr/local/bigdata/zookeeper-3.4.6/bin/zkServer.sh status
JMX enabled by default
Using config: /usr/local/bigdata/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

Then execute start-hbase.sh

$ bin/start-hbase.sh

The above command starts a RegionServer on every machine listed in the regionservers configuration file. If you want to start one of them manually, you can use:

$ bin/hbase-daemon.sh start regionserver

After startup, two services, HRegionServer and HMaster, are running on Master, and Slave01 and Slave02 each run the HRegionServer service.

A highly available HBase cluster should be configured with two masters, one in the active state and the other in the standby state. The master is used to monitor the RegionServers.

You can start another HMaster service on one of the other two machines:

$ bin/hbase-daemon.sh start master

The newly started master will be in the backup state.

7 Start the HBase command-line client

Use the command hbase shell

bin/hbase shell
hbase> list      // list tables
hbase> status    // view cluster status
hbase> version   // view cluster version

Problem:

ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
    at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2932)
    at org.apache.hadoop.hbase.master.MasterRpcServices.isMasterRunning(MasterRpcServices.java:1084)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)

Solution: take HDFS out of safe mode:

$ hdfs dfsadmin -safemode leave

8 HBase command-line client operations

8.1 Create a table

create 't_user_info','base_info','extra_info'
(table name, column family name, column family name)

8.2 Insert data

hbase(main):011:0> put 't_user_info','001','base_info:username','zhangsan'
0 row(s) in 0.2420 seconds
hbase(main):012:0> put 't_user_info','001','base_info:age','18'
0 row(s) in 0.0140 seconds
hbase(main):013:0> put 't_user_info','001','base_info:sex','female'
0 row(s) in 0.0070 seconds
hbase(main):014:0> put 't_user_info','001','extra_info:career','it'
0 row(s) in 0.0090 seconds
hbase(main):015:0> put 't_user_info','002','extra_info:career','actoress'
0 row(s) in 0.0090 seconds
hbase(main):016:0> put 't_user_info','002','base_info:username','liuyifei'
0 row(s) in 0.0060 seconds

8.3 Query method 1: scan

hbase(main):017:0> scan 't_user_info'
ROW    COLUMN+CELL
 001   column=base_info:age, timestamp=1496567924507, value=18
 001   column=base_info:sex, timestamp=1496567934669, value=female
 001   column=base_info:username, timestamp=1496567889554, value=zhangsan
 001   column=extra_info:career, timestamp=1496567963992, value=it
 002   column=base_info:username, timestamp=1496568034187, value=liuyifei
 002   column=extra_info:career, timestamp=1496568008631, value=actoress
2 row(s) in 0.0420 seconds

8.4 Query method 2: get a single row

hbase(main):020:0> get 't_user_info','001'
COLUMN                 CELL
 base_info:age         timestamp=1496568160192, value=19
 base_info:sex         timestamp=1496567934669, value=female
 base_info:username    timestamp=1496567889554, value=zhangsan
 extra_info:career     timestamp=1496567963992, value=it
4 row(s) in 0.0770 seconds

8.5 Delete a kv

hbase(main):021:0> delete 't_user_info','<rowkey>','<family>:<key>'

Delete an entire row:

hbase(main):024:0> deleteall 't_user_info','001'
0 row(s) in 0.0090 seconds
hbase(main):025:0> get 't_user_info','001'
COLUMN                 CELL
0 row(s) in 0.0110 seconds

Delete the entire table:

hbase(main):028:0> disable 't_user_info'
0 row(s) in 2.3640 seconds
hbase(main):029:0> drop 't_user_info'
0 row(s) in 1.2950 seconds
hbase(main):030:0> list
TABLE
0 row(s) in 0.0130 seconds
=> []

8.6 An important HBase feature: sorting (by row key)

HBase automatically sorts the data inserted into it before storing it. Sort order: first by row key, then by column family name, then by column (key) name, all in dictionary order.

This feature has a strong bearing on query efficiency.

For example, consider a table that stores user information: name, household registration (province), age, and occupation. The business system often needs to query all users in a given province, and often needs to query all users with a given surname in a given province.

Idea: if users from the same province are stored contiguously in HBase's storage files, and users with the same surname within a province are also stored contiguously, both queries become more efficient.

Practice: build the query conditions into the rowkey.
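As a rough illustration of this practice, the sketch below uses a sorted map from the Java standard library as a stand-in for HBase's sorted storage. Everything in it is an assumption made for the example: the class RowKeyDesignSketch, the rowkey layout province_surname_userId, and the sample users. With such a layout, each of the two queries becomes a read of one contiguous key range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of rowkey design: put the query conditions at the front of the rowkey. */
final class RowKeyDesignSketch {

    // A sorted map stands in for HBase's sorted storage (rowkey -> user name).
    private final TreeMap<String, String> rows = new TreeMap<>();

    /** Hypothetical rowkey layout: province_surname_userId. */
    static String rowKey(String province, String surname, String userId) {
        return province + "_" + surname + "_" + userId;
    }

    void put(String province, String surname, String userId, String fullName) {
        rows.put(rowKey(province, surname, userId), fullName);
    }

    /** Returns the values of all rows whose rowkey starts with the given prefix. */
    List<String> scanPrefix(String prefix) {
        // '\uffff' sorts after every character used in these keys, so the range
        // [prefix, prefix + '\uffff') holds exactly the keys that start with prefix.
        return new ArrayList<>(rows.subMap(prefix, prefix + '\uffff').values());
    }

    public static void main(String[] args) {
        RowKeyDesignSketch table = new RowKeyDesignSketch();
        table.put("hebei", "zhang", "001", "zhang san");
        table.put("shanxi", "li", "002", "li si");
        table.put("hebei", "li", "003", "li lei");
        table.put("hebei", "zhang", "004", "zhang wei");

        // All users of one province.
        System.out.println(table.scanPrefix("hebei_"));
        // All users with one surname within that province.
        System.out.println(table.scanPrefix("hebei_zhang_"));
    }
}
```

The order of the parts matters: with the province first, a surname query across all provinces would not be a single contiguous range, so the layout should follow the queries the business actually runs.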

9 HBase client API operations

9.1 DDL operations

Code flow:

Create a connection: Connection conn = ConnectionFactory.createConnection(conf)

Get a DDL operator (the table manager, admin): Admin admin = conn.getAdmin()

Use the API of the table manager to create, delete, and modify table definitions: admin.createTable(HTableDescriptor descriptor)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.regionserver.BloomType;
import org.junit.Before;
import org.junit.Test;

public class HbaseDdlDemo {

    Connection conn = null;

    @Before
    public void getConn() throws Exception {
        // build a connection object
        Configuration conf = HBaseConfiguration.create(); // automatically loads hbase-site.xml
        conf.set("hbase.zookeeper.quorum", "192.168.233.200:2181,192.168.233.201:2181");
        conn = ConnectionFactory.createConnection(conf);
    }

    /**
     * DDL: create a table
     * @throws Exception
     */
    @Test
    public void testCreateTable() throws Exception {
        // get a DDL operator from the connection
        Admin admin = conn.getAdmin();

        // create a table definition description object
        HTableDescriptor hTableDescriptor = new HTableDescriptor(TableName.valueOf("user_info"));

        // create column family definition description objects
        HColumnDescriptor hColumnDescriptor_1 = new HColumnDescriptor("base_info");
        // set the maximum number of versions stored in the column family (the default is 1)
        hColumnDescriptor_1.setMaxVersions(3);
        HColumnDescriptor hColumnDescriptor_2 = new HColumnDescriptor("extra_info");

        // put the column family definitions into the table definition
        hTableDescriptor.addFamily(hColumnDescriptor_1);
        hTableDescriptor.addFamily(hColumnDescriptor_2);

        // create the table with the DDL operator object: admin
        admin.createTable(hTableDescriptor);

        // close the connection
        admin.close();
        conn.close();
    }

    /**
     * delete a table
     * @throws Exception
     */
    @Test
    public void testDropTable() throws Exception {
        Admin admin = conn.getAdmin();
        // disable the table
        admin.disableTable(TableName.valueOf("user_info"));
        // delete the table
        admin.deleteTable(TableName.valueOf("user_info"));
        admin.close();
        conn.close();
    }

    // modify the table definition: add a column family
    @Test
    public void testAlterTable() throws Exception {
        Admin admin = conn.getAdmin();

        // fetch the old table definition
        HTableDescriptor tableDescriptor = admin.getTableDescriptor(TableName.valueOf("user_info"));

        // construct a new column family definition
        HColumnDescriptor hColumnDescriptor = new HColumnDescriptor("other_info");
        // set the Bloom filter type of the column family
        hColumnDescriptor.setBloomFilterType(BloomType.ROWCOL);

        // add the column family definition to the table definition
        tableDescriptor.addFamily(hColumnDescriptor);

        // submit the modified table definition through admin
        admin.modifyTable(TableName.valueOf("user_info"), tableDescriptor);

        admin.close();
        conn.close();
    }
}

9.2 DML operations

Adding, deleting, modifying, and querying data in HBase

import java.util.ArrayList;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Before;
import org.junit.Test;

Connection conn = null;

@Before
public void getConn() throws Exception {
    // build a connection object
    Configuration conf = HBaseConfiguration.create(); // automatically loads hbase-site.xml
    conf.set("hbase.zookeeper.quorum", "Master:2181,Slave01:2181,Slave02:2181");
    conn = ConnectionFactory.createConnection(conf);
}

/**
 * Add data.
 * To modify data, put again to overwrite.
 * @throws Exception
 */
@Test
public void testPut() throws Exception {
    // get a table object for the specified table to perform DML operations
    Table table = conn.getTable(TableName.valueOf("user_info"));

    // construct Put objects (one Put object corresponds to exactly one rowkey)
    Put put = new Put(Bytes.toBytes("001"));
    put.addColumn(Bytes.toBytes("base_info"), Bytes.toBytes("username"), Bytes.toBytes("Zhang San"));
    put.addColumn(Bytes.toBytes("base_info"), Bytes.toBytes("age"), Bytes.toBytes("18"));
    put.addColumn(Bytes.toBytes("extra_info"), Bytes.toBytes("addr"), Bytes.toBytes("Beijing"));

    Put put2 = new Put(Bytes.toBytes("002"));
    put2.addColumn(Bytes.toBytes("base_info"), Bytes.toBytes("username"), Bytes.toBytes("Li Si"));
    put2.addColumn(Bytes.toBytes("base_info"), Bytes.toBytes("age"), Bytes.toBytes("28"));
    put2.addColumn(Bytes.toBytes("extra_info"), Bytes.toBytes("addr"), Bytes.toBytes("Shanghai"));

    ArrayList<Put> puts = new ArrayList<>();
    puts.add(put);
    puts.add(put2);

    // insert
    table.put(puts);

    table.close();
    conn.close();
}

/**
 * insert a large amount of data
 * @throws Exception
 */
@Test
public void testManyPuts() throws Exception {
    Table table = conn.getTable(TableName.valueOf("user_info"));
    ArrayList<Put> puts = new ArrayList<>();
    for (int i = 0 ...
