HBase summary from the Good Programmer big data learning route

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article summarizes HBase as part of the Good Programmer big data learning route.

1 Why HBase exists

As data volumes grow, traditional relational databases can no longer keep up with the demands of storing and querying that data. Hive is a data warehouse rather than a database: it covers simple storage needs, but it is not suited to storing and querying unstructured and semi-structured data.

2 What is HBase?

HBase is an open-source, multi-version, scalable, non-relational database from the Apache Software Foundation.

It is modeled on Google's Bigtable: a NoSQL database system built on HDFS that provides highly reliable, high-performance, column-oriented, scalable storage with real-time reads and writes.

3 Applicable scenarios

Massive data storage

Random, real-time reading, writing, and management of data

4 Characteristics

Column-oriented storage

Schema: schema-free (a table can be created directly, without first selecting a database with a statement such as "use gp1808"; tables in HBase cannot be renamed)

Data type: a single type, byte[] (every value is stored as an uninterpreted byte array)

Multiple versions: each value can have multiple versions

Sparse storage: a key-value pair that is null takes up no storage space
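
The multi-version and sparse-storage characteristics can be sketched with a small in-memory model. This is only an illustration in plain Python, not the HBase client API: the class ToyTable, its max_versions parameter, and the sample row and column names are all hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of HBase's logical layout (not the real API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> column ("family:qualifier") -> [(timestamp, value)], newest first
        self.rows = defaultdict(dict)

    def put(self, row, column, value, ts):
        versions = self.rows[row].setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # keep only the newest versions

    def get(self, row, column, versions=1):
        # Using .get() here avoids creating empty entries for absent rows.
        cells = self.rows.get(row, {}).get(column, [])
        return [value for _, value in cells[:versions]]


table = ToyTable(max_versions=2)
table.put(b"row1", b"info:name", b"alice", ts=1)
table.put(b"row1", b"info:name", b"alicia", ts=2)
table.put(b"row1", b"info:name", b"ally", ts=3)
table.put(b"row2", b"info:age", b"30", ts=1)  # row2 never stores info:name

latest = table.get(b"row1", b"info:name")               # newest version only
history = table.get(b"row1", b"info:name", versions=5)  # at most max_versions
missing = table.get(b"row2", b"info:name")              # absent cell: nothing stored
```

In this sketch an absent column simply has no entry for that row, which is the sense in which null cells cost no space, and all keys and values are byte strings, mirroring the single byte[] data type.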

5 Architecture

Client:

The HBase client, which provides the interfaces for accessing HBase (the Linux shell and the Java API)

Maintains caches, such as region location information, to speed up access to HBase

ZooKeeper:

Monitors the status of the HMaster to ensure that there is exactly one active HMaster, which provides high availability

Stores the addressing entry point for all regions

Monitors the status of each HRegionServer in real time and notifies the HMaster when a RegionServer comes online or goes offline

Stores information about all tables in HBase (the HBase metadata)

HMaster (the boss of HBase)

Assigns regions to RegionServers (for example, when a new table is created)

Handles load balancing across RegionServers

Handles region reassignment (after an HRegionServer failure or a region split)

Garbage-collects files on HDFS

Processes schema update requests

HRegionServer (HBase's little brother)

Maintains the regions assigned to it by the HMaster (manages the regions on its own machine)

Handles client I/O requests to its regions and interacts with HDFS

Splits regions that grow too large at run time

HRegion:

The smallest unit of distributed storage and load balancing in HBase; a region holds a table or part of a table

(In HBase, data is sorted by row key, and a table is divided into multiple regions by row-key range.

Regions are split by size: as data is added a region grows, and when it reaches a threshold it is divided into two new regions.)

Although a region is the smallest unit of distributed storage, it is not the smallest unit of storage. Each region contains multiple Store objects. Each Store contains one MemStore and zero or more StoreFiles, and each StoreFile wraps an HFile. The MemStore lives in memory, and StoreFiles are stored on HDFS.
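
Because rows are sorted by row key and each region covers a contiguous key range, finding the region for a row is a search over sorted start keys. The following is a minimal sketch under that assumption; the start keys, the region names, and the functions find_region and split_region are hypothetical and are not part of HBase.

```python
import bisect

# Hypothetical layout: each region covers [start_key, next_start_key).
# The first region starts at the empty key, so every row key has a home.
region_start_keys = [b"", b"g", b"n", b"t"]
region_names = ["region-0", "region-1", "region-2", "region-3"]


def find_region(row_key: bytes) -> str:
    """Return the name of the region whose key range contains row_key."""
    index = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_names[index]


def split_region(index: int, split_key: bytes) -> None:
    """Replace one region with two daughter regions divided at split_key."""
    parent = region_names[index]
    region_start_keys.insert(index + 1, split_key)
    region_names[index:index + 1] = [parent + "-a", parent + "-b"]


before = find_region(b"horse")   # falls in [b"g", b"n")
split_region(1, b"k")            # region-1 grew too large: split it at b"k"
after_low = find_region(b"horse")
after_high = find_region(b"lion")
```

After the split, the key range of the parent is shared between the two daughters, so rows that previously landed in one region are now spread over two.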

HLog

Records HBase operations using a WAL (write-ahead log): each write is appended to the log first and then applied to the MemStore, so that if in-memory data is lost, it can be recovered by replaying the log.
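
The log-first ordering can be illustrated with a toy model. This is a sketch only: the class ToyRegion and its recover method are hypothetical, and a plain Python list stands in for the durable log file.

```python
class ToyRegion:
    """Toy region that logs every write before applying it in memory."""

    def __init__(self, wal):
        self.wal = wal        # durable log (a list stands in for a file on HDFS)
        self.memstore = {}    # volatile in-memory buffer

    def put(self, row, value):
        self.wal.append((row, value))  # 1. append to the log first
        self.memstore[row] = value     # 2. then update the MemStore

    @classmethod
    def recover(cls, wal):
        """Rebuild the in-memory state by replaying the log in order."""
        region = cls(wal)
        for row, value in wal:
            region.memstore[row] = value
        return region


wal = []
region = ToyRegion(wal)
region.put("row1", "a")
region.put("row2", "b")
region.put("row1", "c")

del region                          # simulate a crash: the MemStore is gone
recovered = ToyRegion.recover(wal)  # the log survives and is replayed
```

Replaying in the original order matters: the later write to row1 must win, exactly as it did before the crash.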

Store:

Corresponds to one column family.

MemStore: 128 MB

An in-memory write buffer whose contents are flushed to HDFS in batches

HStoreFile (HFile)

Data in HBase is stored on HDFS in the form of HFiles.

Write process:

1 The client locates the responsible RegionServer via ZooKeeper and sends it a write request; the data is written to the target region

2 Data is written to the region's MemStore until the MemStore reaches the preset threshold (128 MB)

3 The data in the MemStore is flushed into a StoreFile

4 As StoreFiles accumulate and their number reaches a certain threshold, a compaction is triggered: multiple StoreFiles are merged into a single StoreFile, and versions are merged and deleted data is removed at the same time

5 Through repeated compactions, the StoreFiles gradually merge into larger and larger StoreFiles

6 When the size of a single StoreFile exceeds a certain threshold, a split is triggered: the current region is split into two new regions, the parent region goes offline, and the two daughter regions are assigned to RegionServers by the HMaster, so that the load of the original region is spread across the two new regions
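
Steps 2 to 6 above can be sketched as a toy model. Everything here is an assumption made for illustration: the class ToyStore, its thresholds (counted in entries rather than bytes), and the use of None as a stand-in for a delete marker are hypothetical and far simpler than the real flush, compaction, and split logic.

```python
class ToyStore:
    """Toy model of one Store: a MemStore plus a list of immutable StoreFiles."""

    def __init__(self, flush_size=3, compact_at=3, split_size=8):
        self.flush_size = flush_size  # MemStore entries that trigger a flush
        self.compact_at = compact_at  # StoreFile count that triggers a compaction
        self.split_size = split_size  # StoreFile size that would trigger a split
        self.memstore = {}
        self.storefiles = []          # each StoreFile: sorted list of (row, value)
        self.needs_split = False

    def put(self, row, value):
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        # Write the MemStore out as a new sorted StoreFile, then clear it.
        self.storefiles.append(sorted(self.memstore.items()))
        self.memstore = {}
        if len(self.storefiles) >= self.compact_at:
            self.compact()

    def compact(self):
        merged = {}
        for storefile in self.storefiles:  # oldest first, so newer values win
            merged.update(storefile)
        # None marks a deleted row (a stand-in for a delete marker).
        live = sorted((r, v) for r, v in merged.items() if v is not None)
        self.storefiles = [live]
        self.needs_split = len(live) >= self.split_size


store = ToyStore()
for i in range(9):
    store.put(f"row{i}", i)
```

With these toy thresholds, nine writes produce three flushes, the third flush triggers a compaction into one StoreFile, and that file is large enough to flag the region for a split.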

Read process:

1 The client contacts ZooKeeper, looks up the -ROOT- table, and obtains the location of the .META. table

2 From the .META. table it finds the region that holds the target data, and from that the corresponding RegionServer

3 It retrieves the requested data through that RegionServer

4 The RegionServer's memory is divided into two parts: the MemStore, used mainly for writes, and the BlockCache, used mainly for reads. A read request checks the MemStore first; on a miss it checks the BlockCache; on a further miss it reads from the StoreFiles and places the result in the BlockCache.
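
The lookup order in step 4 can be sketched as follows. This is a simplified illustration, not HBase code: the class ToyRegionServer is hypothetical, dictionaries stand in for the MemStore, the BlockCache, and the StoreFiles on HDFS, and the disk_reads counter exists only to make cache hits visible.

```python
class ToyRegionServer:
    """Toy read path: MemStore first, then BlockCache, then StoreFile."""

    def __init__(self, storefile):
        self.memstore = {}          # recent writes
        self.blockcache = {}        # recently read data
        self.storefile = storefile  # dict standing in for files on HDFS
        self.disk_reads = 0         # counts reads that reached the StoreFile

    def put(self, row, value):
        self.memstore[row] = value

    def get(self, row):
        if row in self.memstore:
            return self.memstore[row]
        if row in self.blockcache:
            return self.blockcache[row]
        self.disk_reads += 1
        value = self.storefile.get(row)
        if value is not None:
            self.blockcache[row] = value  # cache the result for later reads
        return value


server = ToyRegionServer({"row1": "old"})
first = server.get("row1")     # miss in both memory areas: read from the StoreFile
second = server.get("row1")    # served from the BlockCache, no disk read
server.put("row1", "new")
third = server.get("row1")     # the MemStore holds the newest value
```

Checking the MemStore first is what makes a fresh write visible immediately, even though the older value is still in the BlockCache and in the StoreFile.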

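Addressing process: client --> ZooKeeper --> -ROOT- table --> .META. table --> RegionServer --> Region --> client (this three-level lookup describes HBase versions before 0.96; later versions drop the -ROOT- table, and ZooKeeper stores the location of the meta table directly)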

Rowkey: the row key. Like a primary key in MySQL, it must be unique, and rows are sorted by row key in lexicographic order

Columnfamily: column family

Column: column

Timestamp: timestamp. The value with the latest timestamp is returned by default.

Version: version number, identifying the version of the recorded data

Cell: a cell, holding one key and one value
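
Lexicographic ordering of row keys compares bytes, not numbers, which affects how numeric identifiers sort. The short sketch below uses plain Python byte strings to show the effect; the key format "user" plus a number is a hypothetical example.

```python
# Row keys are compared byte by byte, so "user10" sorts before "user2".
row_keys = [b"user10", b"user2", b"user1"]
byte_order = sorted(row_keys)
# byte_order == [b"user1", b"user10", b"user2"]

# Zero-padding the numeric part makes byte order match numeric order.
padded_keys = [b"user%04d" % n for n in (10, 2, 1)]
padded_order = sorted(padded_keys)
# padded_order == [b"user0001", b"user0002", b"user0010"]
```

Fixed-width, zero-padded keys are a common way to keep range scans in the order a reader would expect.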
