2025-04-06 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/01 Report--
The hbase.hregion.max.filesize property of HBase specifies the threshold for region splitting; it defaults to 268435456 bytes (256 MB). When a column family's store file grows beyond this value, the region is split into two regions.
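The threshold check described above can be sketched roughly as follows. This is a minimal illustration, not HBase code: the class SplitThresholdSketch and the method exceedsThreshold are hypothetical names, and the store file sizes are made-up inputs.

```java
// Hypothetical sketch, not part of HBase: all names here are illustrative.
final class SplitThresholdSketch {
    // Default quoted in the text: 268435456 bytes = 256 * 1024 * 1024.
    static final long DEFAULT_MAX_FILESIZE = 256L * 1024 * 1024;

    // Returns true if any store file is larger than the threshold,
    // which is the condition the text gives for splitting a region.
    static boolean exceedsThreshold(long[] storeFileSizes, long maxFileSize) {
        for (long size : storeFileSizes) {
            if (size > maxFileSize) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] sizes = {100L, 300L * 1024 * 1024}; // made-up sizes in bytes
        System.out.println(DEFAULT_MAX_FILESIZE);
        System.out.println(exceedsThreshold(sizes, DEFAULT_MAX_FILESIZE));
    }
}
```

The sketch only models the size comparison; as the rest of the article shows, the size alone does not decide where a split can happen.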
An HBase table can have many columns, and there are two designs to choose from: a wide table (one row has many columns) and a narrow table (many rows, each with few columns).
Suppose you have a table that stores user mail.
With a wide-table design, it looks like this (all messages for a user are saved in one row):
userid1 email1 email2 email3 ... emailn
userid2 email1 email2 email3 ... emailn
useridn
With a narrow-table design, it looks like this (the rowkey consists of the user ID and the email ID):
userid1_emailid1 email1
userid1_emailid2 email2
userid1_emailid3 email3
userid1_emailidn emailn
userid2_emailid1 email1
userid2_emailid2 email2
userid2_emailid3 email3
userid2_emailidn emailn
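The difference between the two rowkey layouts can be sketched in a few lines. This is only an illustration under assumptions: RowKeySketch, wideRowKey and narrowRowKey are hypothetical helpers, and the underscore separator is taken from the example rows above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers for illustration; they do not use the HBase client API.
final class RowKeySketch {
    // Wide table: one rowkey per user, each email becomes a column in that row.
    static String wideRowKey(String userId) {
        return userId;
    }

    // Narrow table: one rowkey per (user, email) pair, joined with "_".
    static String narrowRowKey(String userId, String emailId) {
        return userId + "_" + emailId;
    }

    // Builds the narrow-table rowkeys for all emails of one user.
    static List<String> narrowRowKeys(String userId, List<String> emailIds) {
        List<String> keys = new ArrayList<>();
        for (String emailId : emailIds) {
            keys.add(narrowRowKey(userId, emailId));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(wideRowKey("userid1"));
        System.out.println(narrowRowKeys("userid1", List.of("emailid1", "emailid2")));
    }
}
```

In the wide layout all of a user's mail shares one rowkey, while the narrow layout produces a distinct rowkey per message, which is the property that matters for splitting.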
These two designs affect how regions are split. While reading the HFileOutputFormat code today, I found that the RecordWriter it creates places a restriction on region splitting.
A split happens only when the rowkey changes; as long as the rowkey stays the same, there is no split even if the size has already exceeded the hbase.hregion.max.filesize value.
RecordWriter code:
    public void write(ImmutableBytesWritable row, KeyValue kv)
        throws IOException {
      long length = kv.getLength();
      byte[] family = kv.getFamily();
      WriterLength wl = this.writers.get(family);
      if (wl == null || ((length + wl.written) >= maxsize) &&
          Bytes.compareTo(this.previousRow, 0, this.previousRow.length,
              kv.getBuffer(), kv.getRowOffset(), kv.getRowLength()) != 0) {
        // Get a new writer.
        Path basedir = new Path(outputdir, Bytes.toString(family));
        if (wl == null) {
          wl = new WriterLength();
          this.writers.put(family, wl);
          if (this.writers.size() > 1) throw new IOException("One family only");
          // If wl == null, first file in family. Ensure family dir exists.
          if (!fs.exists(basedir)) fs.mkdirs(basedir);
        }
        wl.writer = getNewWriter(wl.writer, basedir);
        LOG.info("Writer=" + wl.writer.getPath() +
            ((wl.written == 0) ? "" : ", wrote=" + wl.written));
        wl.written = 0;
      }
      kv.updateLatestStamp(this.now);
      wl.writer.append(kv);
      wl.written += length;
      // Copy the row so we know when a row transition.
      this.previousRow = kv.getRow();
    }
The condition of the outer if statement shows that a split happens only when the written size has reached the hbase.hregion.max.filesize value and the current row differs from the previously inserted row. This has two consequences:
1. In a wide table, a single row whose size exceeds the hbase.hregion.max.filesize value will not be split.
2. If many versions of a record are inserted under the same rowkey, the data will not be split even when its size exceeds the hbase.hregion.max.filesize value.
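The rule described above can be simulated without HBase. The following is a simplified sketch, not the real RecordWriter: RolloverSketch and countFiles are hypothetical names, every cell is assumed to have the same length, and only the "size reached and row changed" condition from the code above is modeled.

```java
// Hypothetical simulation of the roll-over condition; it does not use HBase classes.
final class RolloverSketch {
    // Counts how many files would be started for a sequence of cells,
    // where rows[i] is the rowkey of the i-th cell and each cell has cellLength bytes.
    static int countFiles(String[] rows, long cellLength, long maxSize) {
        int files = 0;
        long written = 0;          // bytes written to the current file
        String previousRow = null; // rowkey of the previously written cell
        for (String row : rows) {
            boolean noWriterYet = (files == 0);
            boolean sizeReached = (cellLength + written) >= maxSize;
            boolean rowChanged = previousRow != null && !row.equals(previousRow);
            // Same shape as the original condition: wl == null || (size && row differs)
            if (noWriterYet || (sizeReached && rowChanged)) {
                files++;
                written = 0;
            }
            written += cellLength;
            previousRow = row;
        }
        return files;
    }

    public static void main(String[] args) {
        String[] sameRow = new String[10];
        String[] distinctRows = new String[10];
        for (int i = 0; i < 10; i++) {
            sameRow[i] = "row1";
            distinctRows[i] = "row" + i;
        }
        System.out.println(countFiles(sameRow, 4, 10));
        System.out.println(countFiles(distinctRows, 4, 10));
    }
}
```

With ten cells under one rowkey the sketch never starts a second file, no matter how far the size limit is exceeded, while ten distinct rowkeys roll over whenever the limit is reached.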
Let's verify this:
To see the effect as quickly as possible, modify two configuration parameters in hbase-site.xml:
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>5</value>
      <description>Memstore will be flushed to disk if size of the memstore
      exceeds this number of bytes. Value is checked by a thread that runs
      every hbase.server.thread.wakefrequency.</description>
    </property>
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>10</value>
      <description>Maximum HStoreFile size. If any one of a column families' HStoreFiles has
      grown to exceed this value, the hosting HRegion is split in two.
      Default: 256M.</description>
    </property>
Create test tables t1 and t2:
    hbase(main):076:0> create 't1','f1'
    0 row(s) in 1.6460 seconds
    hbase(main):077:0> create 't2','f1'
    0 row(s) in 1.1790 seconds
View the system table .META.:
    hbase(main):081:0> scan '.META.'
    ROW  COLUMN+CELL
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:regioninfo, timestamp=1314720667384, value=REGION => {NAME => 't1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDKEY => '', ENCODED => d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:server, timestamp=1314720667941, value=yinjie:60020
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:serverstartcode, timestamp=1314720667941, value=1314716290123
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:regioninfo, timestamp=1314720672241, value=REGION => {NAME => 't2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDKEY => '', ENCODED => 16bb3d2563eab3b4e25477c64e007e71, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:server, timestamp=1314720672346, value=yinjie:60020
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:serverstartcode, timestamp=1314720672346, value=1314716290123
    2 row(s) in 0.0230 seconds
As you can see, t1 and t2 each have one region.
First, insert 10 records into table t1, all with the same rowkey:
    hbase(main):086:0> for i in 0..9 do\
    hbase(main):087:1* put 't1','row1',"f1:c#{i}","swallow#{i}"\
    hbase(main):088:1* end
    0 row(s) in 0.0180 seconds
    0 row(s) in 0.0070 seconds
    0 row(s) in 0.0420 seconds
    0 row(s) in 0.0620 seconds
    0 row(s) in 0.0120 seconds
    0 row(s) in 0.0770 seconds
    0 row(s) in 0.0150 seconds
    0 row(s) in 0.1290 seconds
    0 row(s) in 10.0740 seconds
    0 row(s) in 0.1230 seconds
    => 0..9
    hbase(main):089:0>
View the records in t1:
    hbase(main):089:0> scan 't1'
    ROW   COLUMN+CELL
    row1  column=f1:c0, timestamp=1314720946495, value=swallow0
    row1  column=f1:c1, timestamp=1314720946507, value=swallow1
    row1  column=f1:c2, timestamp=1314720946903, value=swallow2
    row1  column=f1:c3, timestamp=1314720946939, value=swallow3
    row1  column=f1:c4, timestamp=1314720946976, value=swallow4
    row1  column=f1:c5, timestamp=1314720947055, value=swallow5
    row1  column=f1:c6, timestamp=1314720947070, value=swallow6
    row1  column=f1:c7, timestamp=1314720947198, value=swallow7
    row1  column=f1:c8, timestamp=1314720957272, value=swallow8
    row1  column=f1:c9, timestamp=1314720957392, value=swallow9
    1 row(s) in 0.0300 seconds
Check .META. again:
    hbase(main):090:0> scan '.META.'
    ROW  COLUMN+CELL
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:regioninfo, timestamp=1314720667384, value=REGION => {NAME => 't1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDKEY => '', ENCODED => d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:server, timestamp=1314720667941, value=yinjie:60020
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:serverstartcode, timestamp=1314720667941, value=1314716290123
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:regioninfo, timestamp=1314720672241, value=REGION => {NAME => 't2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDKEY => '', ENCODED => 16bb3d2563eab3b4e25477c64e007e71, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:server, timestamp=1314720672346, value=yinjie:60020
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:serverstartcode, timestamp=1314720672346, value=1314716290123
    2 row(s) in 0.0210 seconds
You can see that t1 still has only one region.
Next, insert 10 similar records into table t2, this time with a different rowkey for each record:
    hbase(main):091:0> for i in 0..9 do\
    hbase(main):092:1* put 't2',"row#{i}","f1:c#{i}","swallow#{i}"\
    hbase(main):093:1* end
    0 row(s) in 0.1140 seconds
    0 row(s) in 0.0080 seconds
    0 row(s) in 0.0410 seconds
    0 row(s) in 0.0820 seconds
    0 row(s) in 0.0210 seconds
    0 row(s) in 0.0410 seconds
    0 row(s) in 0.0200 seconds
    0 row(s) in 0.1210 seconds
    0 row(s) in 0.0140 seconds
    0 row(s) in 0.0360 seconds
    => 0..9
View the records in t2:
    hbase(main):097:0> scan 't2'
    ROW   COLUMN+CELL
    row0  column=f1:c0, timestamp=1314721110769, value=swallow0
    row1  column=f1:c1, timestamp=1314721110787, value=swallow1
    row2  column=f1:c2, timestamp=1314721110830, value=swallow2
    row3  column=f1:c3, timestamp=1314721110916, value=swallow3
    row4  column=f1:c4, timestamp=1314721110932, value=swallow4
    row5  column=f1:c5, timestamp=1314721110971, value=swallow5
    row6  column=f1:c6, timestamp=1314721110989, value=swallow6
    row7  column=f1:c7, timestamp=1314721111121, value=swallow7
    row8  column=f1:c8, timestamp=1314721111130, value=swallow8
    row9  column=f1:c9, timestamp=1314721111172, value=swallow9
    10 row(s) in 1.0450 seconds
Check .META. again:
    hbase(main):102:0> scan '.META.'
    ROW  COLUMN+CELL
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:regioninfo, timestamp=1314720667384, value=REGION => {NAME => 't1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.', STARTKEY => '', ENDKEY => '', ENCODED => d8acd6bc659ac8326b88850d645a90ad, TABLE => {{NAME => 't1', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:server, timestamp=1314720667941, value=yinjie:60020
    t1,,1314720667274.d8acd6bc659ac8326b88850d645a90ad.  column=info:serverstartcode, timestamp=1314720667941, value=1314716290123
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:regioninfo, timestamp=1314721112130, value=REGION => {NAME => 't2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.', STARTKEY => '', ENDKEY => '', ENCODED => 16bb3d2563eab3b4e25477c64e007e71, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:server, timestamp=1314720672346, value=yinjie:60020
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:serverstartcode, timestamp=1314720672346, value=1314716290123
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:splitA, timestamp=1314721112130, value=REGION => {NAME => 't2,,1314721111490.71df02214242923574b71fe5e2a19360.', STARTKEY => '', ENDKEY => 'row0', ENCODED => 71df02214242923574b71fe5e2a19360, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314720672168.16bb3d2563eab3b4e25477c64e007e71.  column=info:splitB, timestamp=1314721112130, value=REGION => {NAME => 't2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.', STARTKEY => 'row0', ENDKEY => '', ENCODED => 915ee8d4a32c59a4ec3960e335b061ca, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314721111490.71df02214242923574b71fe5e2a19360.  column=info:regioninfo, timestamp=1314721112267, value=REGION => {NAME => 't2,,1314721111490.71df02214242923574b71fe5e2a19360.', STARTKEY => '', ENDKEY => 'row0', ENCODED => 71df02214242923574b71fe5e2a19360, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,,1314721111490.71df02214242923574b71fe5e2a19360.  column=info:server, timestamp=1314721112267, value=yinjie:60020
    t2,,1314721111490.71df02214242923574b71fe5e2a19360.  column=info:serverstartcode, timestamp=1314721112267, value=1314716290123
    t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.  column=info:regioninfo, timestamp=1314721112627, value=REGION => {NAME => 't2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.', STARTKEY => 'row0', ENDKEY => '', ENCODED => 915ee8d4a32c59a4ec3960e335b061ca, TABLE => {{NAME => 't2', FAMILIES => [{NAME => 'f1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
    t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.  column=info:server, timestamp=1314721112627, value=yinjie:60020
    t2,row0,1314721111490.915ee8d4a32c59a4ec3960e335b061ca.  column=info:serverstartcode, timestamp=1314721112627, value=1314716290123
    4 row(s) in 0.0380 seconds
You can see that the region of t2 has been split: the original region is marked OFFLINE and SPLIT, and two daughter regions (splitA and splitB) have taken its place.