Example Analysis of hbase-site.xml and hbase-default.xml

2025-02-24 Update From: SLTechnology News&Howtos shulou

This article walks through hbase-site.xml and hbase-default.xml in detail, one property at a time, as a reference for anyone configuring HBase.

Just as Hadoop keeps the HDFS configuration in hdfs-site.xml, HBase keeps its configuration in conf/hbase-site.xml. You can find the list of configurable properties in Section 3.1.1, "HBase default configuration". You can also look at the hbase-default.xml file in the source code, which is in the src/main/resources directory.

Not all configuration options appear in hbase-default.xml. Options that are rarely changed may exist only in the code, so the only way to find them is to read the source code itself.

Note that you have to restart the cluster for configuration changes to take effect.
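As a rough illustration of what an override file looks like, the sketch below renders an hbase-site.xml from a Python dict using the standard `configuration` / `property` / `name` / `value` layout. The helper name `render_hbase_site` is hypothetical, and the two overrides are only examples built from values mentioned in this article.

```python
import xml.etree.ElementTree as ET


def render_hbase_site(overrides):
    """Return the text of an hbase-site.xml holding the given overrides.

    `overrides` maps property names to values. This is a hypothetical
    helper for illustration, not part of HBase.
    """
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # ET.indent only exists on Python 3.9+, so pretty-print when available.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Example overrides only; the host name is the sample one used in this article.
site_xml = render_hbase_site({
    "hbase.rootdir": "hdfs://namenode.example.org:9000/hbase",
    "hbase.cluster.distributed": "true",
})
print(site_xml)
```

Anything not listed in the file keeps the default from hbase-default.xml, so the override file usually stays short.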

HBase default configuration

This section is generated from the HBase default configuration file, hbase-default.xml (the property descriptions are translations of the comments in that file).

hbase.rootdir

The directory shared by region servers, in which HBase persists its data. The URL must be fully qualified and include the file system scheme. For example, to specify the /hbase directory in an HDFS instance whose namenode runs at namenode.example.org on port 9000, set this value to hdfs://namenode.example.org:9000/hbase. By default, HBase writes to /tmp. If you do not change this setting, all data will be lost when the machine restarts.

Default: file:///tmp/hbase-${user.name}/hbase
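The two pitfalls described for hbase.rootdir (a missing file system scheme and a location under /tmp) can be checked mechanically. The following is a minimal sketch; `check_rootdir` is a hypothetical helper, not an HBase API, and it only inspects the string.

```python
from urllib.parse import urlparse


def check_rootdir(url):
    """Return a list of problems found in an hbase.rootdir value.

    Hypothetical helper: it does not contact any file system.
    """
    parsed = urlparse(url)
    problems = []
    if not parsed.scheme:
        problems.append("no file system scheme")
    if parsed.path == "/tmp" or parsed.path.startswith("/tmp/"):
        problems.append("path is under /tmp")
    return problems


for candidate in (
    "hdfs://namenode.example.org:9000/hbase",
    "file:///tmp/hbase-${user.name}/hbase",
    "/hbase",
):
    print(candidate, "->", check_rootdir(candidate) or "ok")
```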

hbase.master.port

The port that the HBase Master binds to.

Default: 60000

hbase.cluster.distributed

The operating mode of HBase: false for standalone mode, true for distributed mode. If false, HBase and ZooKeeper run in the same JVM.

Default: false

hbase.tmp.dir

A temporary directory on the local file system. Change it to a more persistent location, because /tmp is cleared when the machine restarts.

Default: /tmp/hbase-${user.name}

hbase.master.info.port

The port for the HBase Master web interface. Set it to -1 if you do not want the web interface to run.

Default: 60010

hbase.master.info.bindAddress

The address that the HBase Master web interface binds to.

Default: 0.0.0.0

hbase.client.write.buffer

The default size, in bytes, of the write buffer for HTable clients. A larger buffer uses more memory on both the client and the server, because the buffer is instantiated on both sides, but it reduces the number of RPCs. Server-side memory use can be estimated as hbase.client.write.buffer * hbase.regionserver.handler.count.

Default: 2097152
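Plugging the defaults listed in this article into that estimate gives a feel for the numbers. This is only the arithmetic from the description above; the function name is made up for the example.

```python
def server_write_buffer_bytes(write_buffer, handler_count):
    """Estimate server-side memory as write buffer size times handler count."""
    return write_buffer * handler_count


# Defaults from this article: hbase.client.write.buffer = 2097152,
# hbase.regionserver.handler.count = 10.
estimate = server_write_buffer_bytes(2097152, 10)
print(estimate, "bytes =", estimate // (1024 * 1024), "MB")
# prints: 20971520 bytes = 20 MB
```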

hbase.regionserver.port

The port that the HBase RegionServer binds to.

Default: 60020

hbase.regionserver.info.port

The port for the HBase RegionServer web interface. Set it to -1 if you do not want the RegionServer web interface to run.

Default: 60030

hbase.regionserver.info.port.auto

Whether the Master or RegionServer web interface should search for a free port to bind to when hbase.regionserver.info.port is already in use. This is useful for testing and is off by default.

Default: false

hbase.regionserver.info.bindAddress

The address that the HBase RegionServer web interface binds to.

Default: 0.0.0.0

hbase.regionserver.class

The interface implemented by the RegionServer. Clients use it when they open a proxy to connect to a region server.

Default: org.apache.hadoop.hbase.ipc.HRegionInterface

hbase.client.pause

The general client pause time. It is mostly used as the time a client waits before retrying a failed operation, such as a get or a region lookup.

Default: 1000

hbase.client.retries.number

The maximum number of retries. It applies to any operation that can fail and be retried, such as region lookups, gets, and updates.

Default: 10

hbase.client.scanner.caching

The number of rows fetched from the server when next is called on a scanner and the requested row is not already in the client-side cache. A higher value makes scanners faster but uses more memory, and a call to next can take longer and longer when the cache is empty and has to be refilled. If a call takes too long it may time out, for example by exceeding hbase.regionserver.lease.period.

Default: 1
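To see why a larger caching value speeds up scans, it helps to count the round trips needed to fetch rows. The sketch below is a simplification that assumes one fetch per refill of the client-side cache and ignores the calls that open and close the scanner; `scan_fetches` is a hypothetical name.

```python
def scan_fetches(total_rows, caching):
    """Number of row-fetching round trips for a scan, under the
    simplifying assumption of one fetch per `caching` rows."""
    if caching < 1:
        raise ValueError("caching must be at least 1")
    # Integer ceiling division, so a partial last batch still counts.
    return (total_rows + caching - 1) // caching


for caching in (1, 100, 1000):
    print("caching =", caching, "->", scan_fetches(10000, caching), "fetches")
```

With the default of 1, every row costs a round trip, which is why the value is commonly raised for large scans at the price of more memory per scanner.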

hbase.client.keyvalue.maxsize

The maximum allowed size of a KeyValue instance. It sets an upper bound on the size of a single entry in a store file. Because a KeyValue cannot be split, this limit prevents a region from becoming unsplittable because a single entry is too large. It is wise to set it to a fraction of the maximum region size. Setting it to zero or less disables the check. The default is 10 MB.

Default: 10485760

hbase.regionserver.lease.period

The lease period, in milliseconds, that the HRegionServer grants to clients. It acts as a timeout threshold: a client must report in within this period or it is considered dead.

Default: 60000

hbase.regionserver.handler.count

The number of RPC server handler instances started on each RegionServer. The Master uses the same property for its own handler count.

Default: 10

hbase.regionserver.msginterval

The interval, in milliseconds, between messages sent from the RegionServer to the Master.

Default: 3000

hbase.regionserver.optionallogflushinterval

The interval, in milliseconds, at which the HLog is synced to HDFS. If the HLog has not accumulated enough entries to trigger a sync on its own, it is synced when this interval expires. The default is 1 second.

Default: 1000

hbase.regionserver.regionSplitLimit

The number of regions after which no further splitting takes place. This is not a hard limit on the number of regions, but a guideline for the RegionServer to stop splitting once the count is reached. The default is MAX_INT, which means splitting is never stopped.

Default: 2147483647

hbase.regionserver.logroll.period

The period at which the commit log is rolled, regardless of how many edits it contains.

Default: 3600000

hbase.regionserver.hlog.reader.impl

The HLog file reader implementation.

Default: org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader

hbase.regionserver.hlog.writer.impl

The HLog file writer implementation.

Default: org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter

hbase.regionserver.thread.splitcompactcheckfrequency

How often a region server runs the split/compaction check.

Default: 20000

hbase.regionserver.nbreservationblocks

The number of reserved memory blocks. When an out-of-memory error occurs, these blocks are released so that the RegionServer can clean up before it shuts down.

Default: 4

hbase.zookeeper.dns.interface

The name of the network interface from which a ZooKeeper server reports its IP address.

Default: default

hbase.zookeeper.dns.nameserver

The host name or IP address of the DNS name server that a ZooKeeper server uses to determine the host name it uses when communicating with the Master.

Default: default

hbase.regionserver.dns.interface

The name of the network interface from which a RegionServer reports its IP address.

Default: default

hbase.regionserver.dns.nameserver

The host name or IP address of the DNS name server that a RegionServer uses to determine the host name it uses when communicating with the Master.

Default: default

hbase.master.dns.interface

The name of the network interface from which the Master reports its IP address.

Default: default

hbase.master.dns.nameserver

The host name or IP address of the DNS name server that the Master uses to determine the host name it uses for communication.

Default: default

hbase.balancer.period

The period at which the Master runs the region balancer.

Default: 300000

hbase.regions.slop

A rebalance is triggered when any RegionServer holds more than average + (average * slop) regions.

Default: 0
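The slop formula is easy to try out on a few region counts. The sketch below only evaluates the threshold from the description above; it is not the balancer's actual code, and the strict "more than" comparison is an assumption.

```python
def needs_rebalance(region_counts, slop):
    """Return True if any server exceeds average + (average * slop) regions.

    `region_counts` is a list with one region count per RegionServer.
    Illustrative only; the strict comparison is an assumption.
    """
    average = sum(region_counts) / len(region_counts)
    threshold = average + average * slop
    return any(count > threshold for count in region_counts)


print(needs_rebalance([10, 10, 10], 0))    # evenly loaded cluster
print(needs_rebalance([12, 10, 8], 0))     # one server above the average
print(needs_rebalance([14, 10, 6], 0.5))   # within the tolerated slop
```

With the default slop of 0, any server above the plain average qualifies, so a larger slop makes rebalancing less eager.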

hbase.master.logcleaner.ttl

The maximum time an HLog can stay in the .oldlogdir directory, after which it is cleaned up by a Master thread.

Default: 600000

hbase.master.logcleaner.plugins

A comma-separated list of LogCleanerDelegate classes invoked by the LogsCleaner service. These WAL/HLog cleaners are called in order, so put the ones that should run first at the front of the list. To use your own cleaner, implement LogCleanerDelegate, add it to the classpath, and list its fully qualified class name here. Custom cleaners are usually added before the default value.

Default: org.apache.hadoop.hbase.master.TimeToLiveLogCleaner

hbase.regionserver.global.memstore.upperLimit

The maximum combined size of all memstores in a single region server, as a fraction of the heap. Once this limit is exceeded, new updates are blocked and flushes are forced.

Default: 0.4

hbase.regionserver.global.memstore.lowerLimit

When flushes are being forced, flushing continues until the combined memstore size drops below this value. The default is 35% of the heap size. Setting it equal to hbase.regionserver.global.memstore.upperLimit causes the least possible flushing when updates are blocked because of the memstore limit.

Default: 0.35
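Because both limits are fractions of the heap, it can help to translate them into absolute sizes for a given heap. The sketch below assumes a hypothetical 8 GiB region server heap and simply multiplies it by the two defaults; the function name is made up for the example.

```python
def memstore_watermarks(heap_bytes, upper_limit=0.4, lower_limit=0.35):
    """Return (block_at, flush_down_to) in bytes for the global memstore.

    Defaults mirror hbase.regionserver.global.memstore.upperLimit and
    lowerLimit as listed in this article.
    """
    if not 0 < lower_limit <= upper_limit <= 1:
        raise ValueError("expected 0 < lower_limit <= upper_limit <= 1")
    return round(heap_bytes * upper_limit), round(heap_bytes * lower_limit)


heap = 8 * 1024 ** 3  # hypothetical 8 GiB heap
block_at, flush_down_to = memstore_watermarks(heap)
print("updates blocked above about", block_at // 1024 ** 2, "MB")
print("forced flushing stops below about", flush_down_to // 1024 ** 2, "MB")
```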

hbase.server.thread.wakefrequency

The time, in milliseconds, to sleep between searches for work. Service threads such as the log roller use it as their sleep interval.

Default: 10000

hbase.hregion.memstore.flush.size

A memstore is flushed to disk when its size exceeds this number of bytes. A thread checks the value every hbase.server.thread.wakefrequency milliseconds.

Default: 67108864

hbase.hregion.preclose.flush.size

If the memstores in a region are this size or larger when the region is about to close, a "pre-flush" runs first to empty the memstores before the region is taken offline. Once a region is offline it accepts no more writes, and flushing a very large memstore at that point would take a long time. Because the pre-flush empties the memstore beforehand, the flush performed during the actual close is very fast.

Default: 5242880

hbase.hregion.memstore.block.multiplier

Updates are blocked when a memstore grows to hbase.hregion.memstore.block.multiplier times hbase.hregion.memstore.flush.size bytes. This prevents a runaway memstore during spikes in update traffic. Without an upper bound, the memstore fills up so much that the resulting flush files take a long time to compact or split, and in the worst case an out-of-memory error is thrown. (Translator's note: memory operations are much faster than disk operations, so writers have to wait for the disk to catch up. The original description appears to be inaccurate.)

Default: 2
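The relationship between the flush size and the block multiplier can be sketched with the defaults from this article. The function names and the exact comparison operators are assumptions made for illustration; the real region server logic is more involved.

```python
def blocking_threshold(flush_size, block_multiplier):
    """Memstore size, in bytes, at which updates start to be blocked."""
    return flush_size * block_multiplier


def memstore_state(size, flush_size=67108864, block_multiplier=2):
    """Classify a memstore size against the two thresholds."""
    if size >= blocking_threshold(flush_size, block_multiplier):
        return "block updates"
    if size > flush_size:
        return "flush"
    return "ok"


print(blocking_threshold(67108864, 2))  # 134217728 bytes, i.e. 128 MB
for size_mb in (32, 100, 128):
    print(size_mb, "MB ->", memstore_state(size_mb * 1024 * 1024))
```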

hbase.hregion.memstore.mslab.enabled

Experimental feature: enables the MemStore-Local Allocation Buffer. Its purpose is to prevent excessive heap fragmentation under heavy write loads, which reduces the frequency of GC operations (a GC may stop the world). In effect, memory is pre-allocated, so that not every value needs its own allocation from the heap.

Default: false

hbase.hregion.max.filesize

The maximum HStoreFile size. If an HStoreFile of any column family grows beyond this value, the hosting HRegion is split in two. The default is 256 MB.

Default: 268435456

hbase.hstore.compactionThreshold

When an HStore contains more than this number of HStoreFiles (each memstore flush produces one HStoreFile), a compaction runs to rewrite all of them as a single HStoreFile. The higher this value, the longer the compaction takes.

Default: 3

hbase.hstore.blockingStoreFiles

When an HStore contains more than this number of HStoreFiles (each memstore flush produces one HStoreFile), updates are blocked until a compaction completes or until hbase.hstore.blockingWaitTime is exceeded.

Default: 7

hbase.hstore.blockingWaitTime

The maximum time that an HRegion blocks updates after reaching the HStoreFile limit set by hbase.hstore.blockingStoreFiles. When this time has elapsed, the HRegion stops blocking updates even if the compaction has not completed. The default is 90 seconds.

Default: 90000

hbase.hstore.compaction.max

The maximum number of HStoreFiles to compact in one minor compaction.

Default: 10

hbase.hregion.majorcompaction

The interval between major compactions of all HStoreFiles in a region. The default is 1 day. Set it to 0 to disable this feature.

Default: 86400000

hbase.mapreduce.hfileoutputformat.blocksize

The MapReduce HFileOutputFormat writes store files (HFiles) directly. This value is the minimum block size for those HFiles. Normally the block size of an HFile written by HBase comes from the table schema (HColumnDescriptor), but the schema is not available when a MapReduce job writes the file. The smaller the value, the larger the index and the less data is fetched on each random access. If your cells are small and you need faster random access, lower this value.

Default: 65536
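The trade-off between block size and index size can be made concrete with a rough count. The sketch below assumes about one index entry per block and blocks of exactly the configured size, which is a simplification; `index_entries` is a hypothetical name.

```python
def index_entries(file_size, block_size):
    """Approximate number of block index entries in one HFile,
    assuming one entry per block of exactly `block_size` bytes."""
    return -(-file_size // block_size)  # integer ceiling division


# A file at the default hbase.hregion.max.filesize of 268435456 bytes.
for block_size in (65536, 16384):
    print(block_size, "byte blocks ->", index_entries(268435456, block_size), "entries")
```

Cutting the block size to a quarter quadruples the index, while each random read has to pull in only a quarter as much data.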

hfile.block.cache.size

The fraction of the maximum heap (the -Xmx setting) allocated to the block cache used by HFile/StoreFile. The default is 20%. Setting it to 0 means no block cache is allocated.

Default: 0.2

hbase.hash.type

The hashing algorithm used by hash functions. Two values are possible: murmur (MurmurHash) and jenkins (JenkinsHash). The hash is used by bloom filters.

Default: murmur

hbase.master.keytab.file

The path to the Kerberos keytab file that the HMaster server uses to log in. (Translator's note: HBase uses Kerberos for security.)

Default:

hbase.master.kerberos.principal

The Kerberos principal name that the HMaster runs as, for example "hbase/_HOST@EXAMPLE.COM". The principal name should have the form user/hostname@DOMAIN. If "_HOST" is used as the hostname portion, it is replaced with the actual host name of the running instance.

Default:
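The "_HOST" placeholder in a principal name stands for the host a daemon runs on. The sketch below shows the substitution as plain string handling; `resolve_principal` is a hypothetical helper and the host name is made up for the example.

```python
def resolve_principal(principal, hostname):
    """Replace the _HOST placeholder in a Kerberos principal name.

    Illustrative only: expects the user/hostname@DOMAIN form
    described above.
    """
    return principal.replace("_HOST", hostname)


# "rs1.example.com" is a made-up host name.
print(resolve_principal("hbase/_HOST@EXAMPLE.COM", "rs1.example.com"))
# prints: hbase/rs1.example.com@EXAMPLE.COM
```

Using the placeholder lets every node share the same configuration file while still ending up with its own principal.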

hbase.regionserver.keytab.file

The path to the Kerberos keytab file that the HRegionServer uses to log in.

Default:

hbase.regionserver.kerberos.principal

The Kerberos principal name that the HRegionServer runs as, for example "hbase/_HOST@EXAMPLE.COM". The principal name should have the form user/hostname@DOMAIN. If "_HOST" is used as the hostname portion, it is replaced with the actual host name of the running instance. An entry for this principal must exist in the file specified by hbase.regionserver.keytab.file.

Default:

zookeeper.session.timeout

The ZooKeeper session timeout, in milliseconds. HBase passes this value to the ZooKeeper cluster as its suggested maximum session timeout. See http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions for details: "The client sends a requested timeout, the server responds with the timeout that it can give the client."

Default: 180000

zookeeper.znode.parent

The root znode for HBase in ZooKeeper. All HBase ZooKeeper files that are configured with a relative path go under this node. By default, all HBase ZooKeeper file paths are relative, so they all go under this node.

Default: /hbase

zookeeper.znode.rootserver

The path to the znode that holds the location of the root region. It is written by the Master and read by clients and region servers. If a relative path is given, the parent is ${zookeeper.znode.parent}. With the defaults, the root region location is stored at /hbase/root-region-server.

Default: root-region-server
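How a relative znode name combines with zookeeper.znode.parent can be shown with ordinary path joining. This sketch assumes that an absolute value is used as is, which the description above implies but does not state outright; `resolve_znode` is a hypothetical helper.

```python
import posixpath


def resolve_znode(parent, node):
    """Resolve a znode path against the parent znode.

    posixpath.join returns `node` unchanged when it is absolute,
    and joins it under `parent` when it is relative.
    """
    return posixpath.join(parent, node)


print(resolve_znode("/hbase", "root-region-server"))
# prints: /hbase/root-region-server
print(resolve_znode("/hbase", "/custom/root-region-server"))
# prints: /custom/root-region-server
```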

hbase.zookeeper.quorum

A comma-separated list of the servers in the ZooKeeper cluster, for example host1.mydomain.com,host2.mydomain.com,host3.mydomain.com. The default is localhost, which suits pseudo-distributed mode. For a fully distributed setup, change it to the full list of ZooKeeper servers. If HBASE_MANAGES_ZK is set in hbase-env.sh, ZooKeeper is started on these nodes together with HBase.

Default: localhost
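A quorum value is just a comma-separated string, so a small parser is enough to sanity-check it before a rollout. Both helpers below are hypothetical, and the "not only localhost" rule is taken from the description above rather than from HBase code.

```python
def parse_quorum(value):
    """Split an hbase.zookeeper.quorum value into a list of host names."""
    return [host.strip() for host in value.split(",") if host.strip()]


def usable_when_distributed(hosts):
    """A fully distributed cluster needs more than the localhost default."""
    return bool(hosts) and hosts != ["localhost"]


hosts = parse_quorum("host1.mydomain.com,host2.mydomain.com,host3.mydomain.com")
print(hosts, usable_when_distributed(hosts))
print(parse_quorum("localhost"), usable_when_distributed(parse_quorum("localhost")))
```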

hbase.zookeeper.peerport

The port used by ZooKeeper peers to talk to each other.

This concludes the example analysis of hbase-site.xml and hbase-default.xml.
