Broken pipe errors when using happybase to access HBase: two "amazing" big bugs

2025-01-31 Update From: SLTechnology News&Howtos

Shulou(Shulou.com) 06/03

When using happybase to read and write HBase data through the Thrift interface, a Broken pipe error occurs. Troubleshooting steps:

1. Check the HBase log:

    Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
    17/05/12 18:08:41 INFO util.VersionInfo: HBase 1.2.0-cdh6.10.1
    17/05/12 18:08:41 INFO util.VersionInfo: Source code repository file:///data/jenkins/workspace/generic-package-centos64-7-0/topdir/BUILD/hbase-1.2.0-cdh6.10.1 revision=Unknown
    17/05/12 18:08:41 INFO util.VersionInfo: Compiled by jenkins on Mon Mar 20 02:46:09 PDT 2017
    17/05/12 18:08:41 INFO util.VersionInfo: From source with checksum c6d9864e1358df7e7f39d39a40338b4e
    17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using default thrift server type
    17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using thrift server type threadpool
    17/05/12 18:08:42 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
    17/05/12 18:08:42 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
    17/05/12 18:08:42 INFO impl.MetricsSystemImpl: HBase metrics system started
    17/05/12 18:08:42 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    17/05/12 18:08:42 INFO http.HttpRequestLog: Http request log for http.requests.thrift is not defined
    17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
    17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
    17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context thrift
    17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
    17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
    17/05/12 18:08:42 INFO http.HttpServer: Jetty bound to port 9095
    17/05/12 18:08:42 INFO mortbay.log: jetty-6.1.26.cloudera.4
    17/05/12 18:08:42 WARN mortbay.log: Can't reuse /tmp/Jetty_0_0_0_0_9095_thrift____.vqpz9l, using /tmp/Jetty_0_0_0_0_9095_thrift____.vqpz9l_5120175032480185058
    17/05/12 18:08:43 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0:9095
    17/05/12 18:08:43 INFO thrift.ThriftServerRunner: starting TBoundedThreadPoolServer on /0.0.0.0:9090 with readTimeout 300000ms; min worker threads=128, max worker threads=1000, max queued requests=1000
    ...
    17/05/08 15:05:51 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x645132bf connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
    17/05/08 15:05:51 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0x645132bf0x0, quorum=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
    17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-slave3/192.168.10.219:2181. Will not attempt to authenticate using SASL (unknown error)
    17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:43170, server: cdh-slave3/192.168.10.219:2181
    17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-slave3/192.168.10.219:2181, sessionid = 0x35bd74a77802148, negotiated timeout = 60000
    [caitinggui@cdh-master-slave1 example]$
    17/05/08 15:32:50 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x35bd74a77802148
    17/05/08 15:32:51 INFO zookeeper.ZooKeeper: Session: 0x35bd74a77802148 closed
    17/05/08 15:32:51 INFO zookeeper.ClientCnxn: EventThread shut down
    17/05/08 15:38:53 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0xb876351 connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
    17/05/08 15:38:53 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0xb8763510x0, quorum=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
    17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-master-slave1/192.168.10.23:2181. Will not attempt to authenticate using SASL (unknown error)
    17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:35526, server: cdh-master-slave1/192.168.10.23:2181
    17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-master-slave1/192.168.10.23:2181, sessionid = 0x15ba3ddc6cc90d4, negotiated timeout = 60000

A preliminary inference: HBase applies a timeout on its side, and that timeout is what drops the connection.
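The inference above can be reproduced with plain sockets, independent of HBase. The following is a minimal sketch, assuming a toy server that gives up on a client after a short read timeout; the function names and the timings are made up for illustration and have nothing to do with the HBase Thrift server's actual implementation.

```python
import socket
import threading
import time


def serve_once(server_sock, read_timeout):
    """Accept one client, wait up to read_timeout seconds for data, then close."""
    conn, _ = server_sock.accept()
    conn.settimeout(read_timeout)
    try:
        conn.recv(1024)              # blocks until data arrives or the timeout fires
    except socket.timeout:
        pass                         # idle client: stop waiting for it
    finally:
        conn.close()


def idle_client_gets_disconnected(read_timeout=0.2, idle_time=0.6):
    """Return the error an idle client hits when it writes again, or None."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server_sock.listen(1)
    worker = threading.Thread(target=serve_once, args=(server_sock, read_timeout))
    worker.start()

    client = socket.create_connection(server_sock.getsockname())
    try:
        time.sleep(idle_time)            # stay idle longer than the server timeout
        for _ in range(20):              # the first write may still appear to succeed
            try:
                client.sendall(b"ping")
            except OSError as exc:       # e.g. BrokenPipeError or ConnectionResetError
                return exc
            time.sleep(0.05)
        return None
    finally:
        client.close()
        worker.join()
        server_sock.close()
```

Calling `idle_client_gets_disconnected()` should return an `OSError` subclass; which one (broken pipe or connection reset) depends on the platform and on timing, so the sketch does not assume a specific type.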

2. Check the official documentation: no relevant timeout parameter turned up.

3. Google for similar problems.

A similar report:

HBASE-14926: Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading

Type: Bug
Status: RESOLVED
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0, 1.2.0, 1.1.2, 1.3.0, 1.0.3, 0.98.16
Fix Version/s: 2.0.0, 1.2.0, 1.3.0, 0.98.17
Component/s: Thrift
Labels: None
Hadoop Flags: Reviewed
Release Note: Adds a timeout to server read from clients. Adds new configs hbase.thrift.server.socket.read.timeout for setting read timeout on server socket in milliseconds. Default is 60000.

Description:

Thrift server is hung. All worker threads are doing this:

    "thrift-worker-0" daemon prio=10 tid=0x00007f0bb95c2800 nid=0xf6a7 runnable [0x00007f0b956e0000]
       java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked (a java.io.BufferedInputStream)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
        at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
        at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
        at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

They never recover. I don't have client side logs.

We've been here before: HBASE-4967 "connected client thrift sockets should have a server side read timeout" but this patch only got applied to fb branch (and thrift has changed since then).

ps: source https://issues.apache.org/jira/browse/HBASE-14926

4. Google "hbase.thrift.server.socket.read.timeout"

One of the results is a web page that says:

The test environment is a distributed Hadoop cluster of three servers, running hadoop-2.7.3, hbase-1.2.4 and zookeeper-3.4.9. Data is written to HBase through the Thrift C++ interface. Each time, writes succeed at first and start failing after a while. The earlier hbase-0.94.27 never showed this problem with the same configuration and had been working well.

Solution to the Thrift interface error: a packet capture shows that the HBase server responds with an RST packet, which breaks the connection. Starting the server with the bin/hbase thrift start -threadpool command shows readTimeout set to 60s, and testing confirmed that the error is indeed related to this setting. The item had never been set in the configuration; reading the code shows that 60s is the default, which applies when nothing is configured. So add the following to conf/hbase-site.xml: the property hbase.thrift.server.socket.read.timeout with the value 6000000 (in milliseconds).

ps: source http://blog.csdn.net/wwlhz/article/details/56012053
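The setting quoted above has to end up as a property entry in conf/hbase-site.xml. As a minimal sketch, the entry can be generated with Python's standard library, assuming the usual Hadoop-style layout of a property element with name, value and an optional description; the helper name `hbase_property` and the description text are invented for this example.

```python
import xml.etree.ElementTree as ET


def hbase_property(name, value, description=None):
    """Build one <property> element in the Hadoop/HBase *-site.xml layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    if description:
        ET.SubElement(prop, "description").text = description
    return prop


prop = hbase_property(
    "hbase.thrift.server.socket.read.timeout",
    6000000,  # value quoted in the post, in milliseconds
    "Thrift server socket read timeout in milliseconds",  # illustrative wording
)
snippet = ET.tostring(prop, encoding="unicode")
print(snippet)
```

The printed element goes inside the top-level configuration element of hbase-site.xml, and the Thrift server has to be restarted before it takes effect.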

So after adding this parameter and restarting the HBase Thrift server, the problem went away.
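One way to confirm that a changed value was picked up after the restart is to read it back from the Thrift server's startup line, like the one in the log from step 1. The following is a rough sketch; the regular expression is derived only from that one log line and may need adjusting for other HBase versions, and `read_timeout_ms` is a made-up helper name.

```python
import re

# Pattern for the ThriftServerRunner startup line; the layout is taken from the
# log excerpt in step 1 and is an assumption for any other HBase version.
READ_TIMEOUT_RE = re.compile(r"with readTimeout (\d+)ms")


def read_timeout_ms(log_text):
    """Return the readTimeout (in ms) reported in the log, or None if absent."""
    match = READ_TIMEOUT_RE.search(log_text)
    return int(match.group(1)) if match else None


sample = (
    "17/05/12 18:08:43 INFO thrift.ThriftServerRunner: starting "
    "TBoundedThreadPoolServer on /0.0.0.0:9090 with readTimeout 300000ms; "
    "min worker threads=128, max worker threads=1000, max queued requests=1000"
)
print(read_timeout_ms(sample))  # 300000
```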

5. Looking at the source code, you can see:

    # https://github.com/apache/hbase/blob/master/hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
    ...
    public static final String THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY = "hbase.thrift.server.socket.read.timeout";
    public static final int THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT = 60000;

    int readTimeout = conf.getInt(THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY, THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT);
    TServerTransport serverTransport = new TServerSocket(
        new TServerSocket.ServerSocketTransportArgs()
            .bindAddr(new InetSocketAddress(listenAddress, listenPort))
            .backlog(backlog)
            .clientTimeout(readTimeout));
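The fallback in the excerpt above (use the configured value if present, otherwise 60000 ms) can be mimicked in a few lines of Python to check what a given hbase-site.xml would yield. This is only a sketch of the lookup-with-default idea: `get_int` is a hypothetical helper, and Hadoop's real Configuration class does more (default resource files, variable substitution) than is modelled here.

```python
import xml.etree.ElementTree as ET

KEY = "hbase.thrift.server.socket.read.timeout"
DEFAULT_MS = 60000  # default from the ThriftServerRunner excerpt


def get_int(site_xml, key, default):
    """Look up an integer property in hbase-site.xml text, else return default."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == key:
            value = prop.findtext("value")
            if value is not None and value.strip():
                return int(value)
    return default


empty = "<configuration></configuration>"
configured = (
    "<configuration><property>"
    "<name>hbase.thrift.server.socket.read.timeout</name>"
    "<value>6000000</value>"
    "</property></configuration>"
)
print(get_int(empty, KEY, DEFAULT_MS))       # 60000
print(get_int(configured, KEY, DEFAULT_MS))  # 6000000
```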

Problem solved ~~

6. But was the problem really solved?

In fact, there was still a problem. After a while I found that a continuous scan was disconnected again after about 20 minutes, which meant another round of difficult searching. It turned out to be a problem inherent in this version of HBase: it treats every connection as idle, whether it is in use or not, and closes it once the limit in hbase.thrift.connection.max-idletime is reached. So I set that option to 31104000 (intended as one year: 360 days in seconds). On CDH, it should be configured on the management page.

[Figure: the hbase.thrift.connection.max-idletime setting on the CDH management page]
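Besides raising the server-side limits, a client can also be written to survive a dropped connection. The following is a minimal sketch of a reconnect-and-retry wrapper; `call_with_reconnect` and its parameters are hypothetical, and the set of exceptions worth retrying depends on the Thrift client library in use, so `retry_on` is left adjustable.

```python
import time


def call_with_reconnect(connect, operation, retries=3, delay=1.0,
                        retry_on=(BrokenPipeError, ConnectionResetError)):
    """Run operation(conn); on a dropped connection, reconnect and try again."""
    conn = connect()
    for attempt in range(retries + 1):
        try:
            return operation(conn)
        except retry_on:
            if attempt == retries:
                raise                 # out of retries: surface the error
            try:
                conn.close()          # best effort; the socket is already dead
            except Exception:
                pass
            time.sleep(delay)
            conn = connect()          # open a fresh connection before retrying


# Hypothetical usage with happybase (host and table name are placeholders):
# import happybase
# rows = call_with_reconnect(
#     lambda: happybase.Connection("thrift-host", port=9090),
#     lambda conn: list(conn.table("my_table").scan(limit=10)),
# )
```

A retried scan starts over from the beginning, and a retried write may be applied twice, so this pattern suits idempotent operations best.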

General steps for troubleshooting:

To build technical depth:

1. Check the logs to see where the error is reported and roughly locate the problem.

2. Check the official documentation.

3. Google for similar problems, or read the source code to locate the problem.

To solve the problem quickly:

1. Check the logs to see where the error is reported and roughly locate the problem.

2. Google for similar problems.

3. Check the official documentation or the source code.

