2025-04-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains the data model of an HBase table: what a table is, and how row keys, column families, columns, cells, and timestamps fit together.
HBase is a database that runs on a Hadoop cluster. Unlike traditional databases, which enforce strict ACID (atomicity, consistency, isolation, durability) guarantees, HBase relaxes these requirements to gain better scalability. This makes it better suited to storing unstructured and semi-structured data.
Table
Data in HBase is stored in tables. The data in one table is usually related, and the main purpose of a table is to group columns that are accessed together. The table name is used as part of the HDFS storage path, so each table appears as a separate directory in HDFS.
The main concepts in the data model of an HBase table are the row key (rowkey), column family, column, cell, and timestamp.
1. Rowkey (row key)
The rowkey is the primary key of the table. Records in a table are sorted in lexicographic (dictionary) order of their rowkeys.
A rowkey can be any string (the maximum length is commonly given as 64 KB; in practice it is usually 10 to 100 bytes).
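Because rowkeys are compared as strings of bytes rather than as numbers, numeric keys written as plain text do not sort in numeric order. The Python sketch below only imitates that ordering with ordinary byte strings; it does not connect to HBase, and the helper name padded_key and the width of 4 are assumptions chosen for illustration.

```python
# Row keys are compared byte by byte, so b"10" sorts before b"2".
keys = [b"1", b"2", b"10"]
sorted_keys = sorted(keys)

def padded_key(n: int, width: int = 4) -> bytes:
    """Hypothetical helper: zero-pad a number so byte order matches numeric order."""
    return str(n).zfill(width).encode("ascii")

sorted_padded = sorted(padded_key(n) for n in (1, 2, 10))

print(sorted_keys)    # [b'1', b'10', b'2']
print(sorted_padded)  # [b'0001', b'0002', b'0010']
```

Zero-padding is one common way to keep numeric identifiers in their natural order; whether it fits depends on how the keys will be scanned.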
2. Column Family (column family)
A column family is also called a column cluster.
Every column in an HBase table belongs to a column family.
Column families are part of the table's schema (columns are not), so at least one column family must be specified when the table is created.
For example, to create a table named user with two column families, userInfo and addressInfo, the create statement is: create 'user', 'userInfo', 'addressInfo'
3. Column (column)
A column always sits under one column family of a table and is written as column family name:column name. For example, the name column under the userInfo column family is written userInfo:name.
A column belongs to a column family and is roughly comparable to a specific column created in a MySQL table.
4. Cell (cell)
A row key, a column family, and a column together identify a cell.
The data in a cell is untyped; it is always stored as a byte array.
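Since a cell holds only bytes, the client has to decide how each value is encoded and decoded. The following Python sketch models a table as a plain dictionary keyed by (row key, column family, column) to show this; it is not the HBase client API, and the put function, the sample values, and the 4-byte big-endian integer encoding are assumptions made for illustration.

```python
import struct

# A cell is addressed by (row key, column family, column).
# Only bytes are stored, so the caller chooses the encoding of each value.
table = {}

def put(row: bytes, column: str, value: bytes) -> None:
    """Store raw bytes under a 'family:column' name (illustrative helper)."""
    family, qualifier = column.split(":", 1)
    table[(row, family.encode(), qualifier.encode())] = value

put(b"1", "userInfo:name", "zhangsan".encode("utf-8"))
put(b"1", "userInfo:age", struct.pack(">i", 30))  # 4-byte big-endian int

# Reading back: the store returns bytes, and the caller must know the type.
raw_age = table[(b"1", b"userInfo", b"age")]
name = table[(b"1", b"userInfo", b"name")].decode("utf-8")
age = struct.unpack(">i", raw_age)[0]
```

Nothing in the stored bytes records that age is an integer; reading it back with the wrong decoder would silently produce a wrong value.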
5. Timestamp (timestamp)
A cell can be assigned a value many times. The timestamp of each write serves as the version number of that value.
In other words, one cell can hold multiple versions of a value.
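The idea of versions can be sketched by keeping every write to one cell under its timestamp and reading the newest first. This is a simplified model in Python, not HBase's implementation; the function names, the max_versions parameter, and the sample timestamps and values are assumptions.

```python
# One cell holding several versions, keyed by timestamp.
versions = {}

def put(timestamp: int, value: bytes) -> None:
    """Record a write; an existing timestamp would be overwritten."""
    versions[timestamp] = value

def get(max_versions: int = 1) -> list:
    """Return up to max_versions (timestamp, value) pairs, newest first."""
    newest_first = sorted(versions.items(), reverse=True)
    return newest_first[:max_versions]

put(1000, b"Beijing")
put(2000, b"Shanghai")

latest = get()                  # only the newest version
history = get(max_versions=2)   # newest two versions
```

A plain read returns only the most recent value, while asking for more versions exposes the earlier writes as well.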
Understanding the data model concepts through a diagram
The table above shows the user information table user in HBase. It has three rows and two column families (ignoring the blank column families, which only indicate that a table can have many column families). The row keys are 1, 2, and 3, and the two column families are userInfo and addressInfo. Each column family contains several columns: userInfo has 3 columns (name, age, sex), and addressInfo has 5 columns (address, from, phone, email, salary).
In HBase, columns are not part of a fixed table structure. You do not need to define column names when you create a table; they can be created on the fly when data is inserted.
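One way to picture this is a nested map in which only the column families are declared up front and columns appear as data is written. The Python sketch below is a toy model under that assumption, not the HBase API; the put function, the nickname and contactInfo names, and the sample values are hypothetical.

```python
from collections import defaultdict

# row key -> column family -> column -> value
# Only the column families are fixed; columns appear when data is written.
COLUMN_FAMILIES = {"userInfo", "addressInfo"}
table = defaultdict(lambda: {family: {} for family in COLUMN_FAMILIES})

def put(row: str, column: str, value: str) -> None:
    """Write one value; reject column families that were never declared."""
    family, qualifier = column.split(":", 1)
    if family not in COLUMN_FAMILIES:
        raise KeyError(f"unknown column family: {family}")
    table[row][family][qualifier] = value

put("1", "userInfo:name", "zhangsan")
put("2", "userInfo:name", "lisi")
put("2", "userInfo:nickname", "li")  # new column, no schema change needed

row1_columns = sorted(table["1"]["userInfo"])
row2_columns = sorted(table["2"]["userInfo"])

try:
    put("3", "contactInfo:qq", "12345")  # undeclared family is rejected
    rejected = False
except KeyError:
    rejected = True
```

Row 1 never stores anything for nickname, so the missing column costs no space in this model.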
Judging by the logical model alone, an HBase table looks much like a table in a relational database, with the added concept of column families. In fact the two differ considerably. The structure of a relational table must be defined in advance, including column names, data types, and value ranges.
Adding a new column means modifying the table structure, which can have a large impact on existing data. In addition, a relational table reserves storage for every column, so an empty cell still occupies space as a "NULL" value. For sparse data, relational tables therefore accumulate many "NULL" values and consume a lot of storage space.
Unlike row-oriented relational databases, HBase is column-oriented: in physical storage, each column family is stored separately. The user information table above is therefore stored as two parts, userInfo and addressInfo.
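The separation by column family can be illustrated by splitting a list of logical cells into one group per family. The Python sketch below is a deliberate simplification in which each group stands in for the storage of one column family; it does not reproduce HBase's actual file format, and the sample cells are made up.

```python
# Logical cells as (row key, column family, column, value); values are made up.
cells = [
    ("2", "userInfo", "name", "lisi"),
    ("1", "addressInfo", "address", "Beijing"),
    ("1", "userInfo", "name", "zhangsan"),
    ("1", "userInfo", "age", "30"),
]

# Group cells by column family; each group stands in for one family's storage.
stores = {}
for row, family, qualifier, value in cells:
    stores.setdefault(family, []).append((row, qualifier, value))

# Within a family, keep cells sorted by row key and then by column.
for family in stores:
    stores[family].sort()

user_info_store = stores["userInfo"]
address_info_store = stores["addressInfo"]
```

Row 2 has no addressInfo data, so nothing for it appears in that group; a query that touches only userInfo never needs to read the addressInfo group at all.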
HBase also keeps a timestamp with every value, so a cell can be written multiple times and retain several versions. For example, the row with rowkey 1 in the table above stores two timestamped versions of its data.