The environment is as follows:
CentOS 6.5
Apache Hadoop 2.7.1
Apache HBase 0.98.12
Apache ZooKeeper 3.4.6
JDK 1.7
Ant 1.9.5
Maven 3.0.5
Recently, while testing HBase compression, I installed lzo and snappy on Hadoop and inserted 50 rows of text data, about 4 MB each, to compare the compression ratios.
During the test I found that when the Java client scanned these 50 rows, the regionserver went down repeatedly. The HBase logs showed no obvious exception, but the datanode logs contained the following:
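The article does not include the test code itself, so here is a minimal sketch, using the HBase 0.98 client API, of how such tables might be created and loaded. The table and column-family names (t_snappy, f, q) and the filler payload are my own assumptions, not the author's code; the real test inserted ~4 MB text values, and the snappy and lzo codecs only work if their native libraries are installed on every node.
Java code
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.compress.Compression;
import org.apache.hadoop.hbase.util.Bytes;

import java.util.Arrays;

public class CompressionTestSetup {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Create one table per codec (table names are made up for this sketch).
        HBaseAdmin admin = new HBaseAdmin(conf);
        createTable(admin, "t_none",   Compression.Algorithm.NONE);
        createTable(admin, "t_gz",     Compression.Algorithm.GZ);
        createTable(admin, "t_snappy", Compression.Algorithm.SNAPPY);
        createTable(admin, "t_lzo",    Compression.Algorithm.LZO);
        admin.close();

        // Insert 50 rows of roughly 4 MB each; the real test used text data,
        // here a filler byte array stands in for it.
        byte[] bigValue = new byte[4 * 1024 * 1024];
        Arrays.fill(bigValue, (byte) 'a');
        HTable table = new HTable(conf, "t_snappy");
        for (int i = 0; i < 50; i++) {
            Put put = new Put(Bytes.toBytes("row-" + i));
            put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), bigValue);
            table.put(put);
        }
        table.close();
    }

    // Compression in HBase is configured per column family.
    private static void createTable(HBaseAdmin admin, String name,
                                    Compression.Algorithm codec) throws Exception {
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf(name));
        HColumnDescriptor family = new HColumnDescriptor("f");
        family.setCompressionType(codec);
        desc.addFamily(family);
        admin.createTable(desc);
    }
}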
Java code
java.io.IOException: Premature EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251)
    at java.lang.Thread.run(Thread.java:745)
Searching Google for this exception turned up no clear answer. Most results blamed a connection timeout, or an exhausted file-handle limit breaking the connection, and so on. I changed a number of configurations accordingly, but none of it took effect, which led me to a rather strange suspicion: that HBase could not handle tables with several different compression types at once. So I removed the other compressed tables used in the test and ran it again, only to find the problem unchanged, which surprised me once more. Could it be the environment? Since I really couldn't see what else might be wrong, I tested on a local virtual machine.
The virtual machine, since it was built for my own use, differed in two respects, JDK 1.8 and CentOS 7; everything else was kept consistent.
After rebuilding the environment I continued testing and found the problem unchanged, which was even more confusing: clearly this was not an environment issue. This time, however, there was a new clue. With JDK 8 in use, the HBase logs showed a large number of full GC entries, meaning a serious shortage of memory, with garbage collection taking as long as 415 seconds. That gave me a lead: HBase is a memory-hungry system, and giving it too little memory can indeed bring a regionserver down. I checked the HBase heap allocation and found it was still the default 1 GB, which had a lot to do with the problem. The 50 rows take about 200 MB of storage; one scan pulls them into the cache, and a second scan against a table with a different compression type inflates memory further until the regionserver goes down, and the exception it reports is not very informative, which is why the problem was hard to locate.
Knowing the cause, I raised the HBase heap to 4 GB, pushed the change to all nodes, restarted, and ran a full-table scan from the Java client. This time it was very stable, and the regionserver no longer went down.
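For reference, the client-side full-table scan can look roughly like the sketch below. This is again an assumption, not the article's actual code, and the table name comes from the earlier sketch; the heap itself is typically raised via the HBASE_HEAPSIZE setting in conf/hbase-env.sh. Disabling block caching on the scan also keeps a one-off full scan from churning the RegionServer block cache, which is exactly the kind of pressure described above.
Java code
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class FullScanTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t_snappy");

        Scan scan = new Scan();
        scan.setCaching(1);          // rows are ~4 MB each, so fetch one per RPC
        scan.setCacheBlocks(false);  // keep the full scan out of the RegionServer block cache

        ResultScanner scanner = table.getScanner(scan);
        long rows = 0;
        long bytes = 0;
        for (Result r : scanner) {
            rows++;
            for (Cell cell : r.rawCells()) {
                bytes += CellUtil.cloneValue(cell).length;
            }
        }
        scanner.close();
        table.close();

        System.out.println("Scanned " + rows + " rows, " + bytes + " value bytes");
    }
}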
Finally, the conclusion of the compression test: four configurations were compared in total, against about 200 MB of original data.
(1) No compression: 128.1 MB
(2) gz compression: 920.3 KB
(3) snappy compression: 13.2 MB
(4) lzo compression: 8 MB
As you can see, gz has the highest compression ratio, lzo is second, and snappy third. Of course, different codecs suit different business scenarios.
To summarize which to use: snappy and lzo are what most Internet companies rely on at present, so choose the compression scheme that fits your specific business.
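Working those numbers through against the 200 MB of original data (using only the figures reported above) gives rough ratios of about 200 MB / 920.3 KB ≈ 222:1 for gz, 200 MB / 8 MB = 25:1 for lzo, and 200 MB / 13.2 MB ≈ 15:1 for snappy, which is where the ordering comes from.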