2025-03-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to fix the "unzipBestEffort returned null" error that Nutch can report while fetching a page. It traces the error from the log output to the code that throws it, identifies the cause, and then walks through a four-step fix.
Error message: fetch of http://szs.mof.gov.cn/zhengwuxinxi/zhengcefabu/201402/t20140224_1046354.html failed with: java.io.IOException: unzipBestEffort returned null
The complete error message is:
    2014-03-12 16:… ERROR http.Http - Failed to get protocol output
    java.io.IOException: unzipBestEffort returned null
        at org.apache.nutch.protocol.http.api.HttpBase.processGzipEncoded(HttpBase.java:317)
        at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:164)
        at org.apache.nutch.protocol.http.Http.getResponse(Http.java:64)
        at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:140)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:703)
    2014-03-12 16:… INFO fetcher.Fetcher - fetch of http://szs.mof.gov.cn/zhengwuxinxi/zhengcefabu/201402/t20140224_1046354.html failed with: java.io.IOException: unzipBestEffort returned null
    2014-03-12 16:… INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
The stack trace shows that the exception is thrown at line 317, in the processGzipEncoded method of src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java (the lib-http plug-in):
    byte[] content;
    if (getMaxContent() >= 0) {
        content = GZIPUtils.unzipBestEffort(compressed, getMaxContent());
    } else {
        content = GZIPUtils.unzipBestEffort(compressed);
    }
    if (content == null)
        throw new IOException("unzipBestEffort returned null");
processGzipEncoded is called at line 164 of nutch2.7\src\plugin\protocol-http\src\java\org\apache\nutch\protocol\http\HttpResponse.java (the protocol-http plug-in):
    readPlainContent(in);

    String contentEncoding = getHeader(Response.CONTENT_ENCODING);
    if ("gzip".equals(contentEncoding) || "x-gzip".equals(contentEncoding)) {
        content = http.processGzipEncoded(content, url);
    } else if ("deflate".equals(contentEncoding)) {
        content = http.processDeflateEncoded(content, url);
    } else {
        if (Http.LOG.isTraceEnabled()) {
            Http.LOG.trace("fetched " + content.length + " bytes from " + url);
        }
    }
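Inspecting the URL with the Firebug tool in Firefox shows that its response headers include Content-Encoding: gzip and Transfer-Encoding: chunked. The body is therefore sent in chunks, but the code above always reads it with readPlainContent, so the bytes passed to processGzipEncoded presumably still contain the chunk framing and cannot be decompressed as gzip.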
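The failure mode can be reproduced outside Nutch. What follows is a minimal sketch, not Nutch code: GzipChunkedDemo, gzip, chunk and unzipOrNull are hypothetical names, and unzipOrNull only imitates the "return null when the data cannot be decompressed" behaviour that the exception message suggests. It gzips a small body, wraps it in chunk framing as a Transfer-Encoding: chunked response would, and shows that the framed bytes no longer decompress.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; none of these names come from Nutch.
class GzipChunkedDemo {

    // Compress bytes with gzip (stands in for the server's Content-Encoding: gzip).
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Wrap a body in a single chunk: "<hex size>\r\n<data>\r\n0\r\n\r\n".
    static byte[] chunk(byte[] body) {
        byte[] head = (Integer.toHexString(body.length) + "\r\n").getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(body, 0, body.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Simplified stand-in for a lenient unzip: returns null if the input is not valid gzip.
    static byte[] unzipOrNull(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            return null; // e.g. the stream does not start with the gzip magic bytes
        }
    }

    public static void main(String[] args) {
        byte[] gz = gzip("hello nutch".getBytes(StandardCharsets.UTF_8));
        // Clean gzip bytes decompress normally.
        System.out.println(new String(unzipOrNull(gz), StandardCharsets.UTF_8)); // prints "hello nutch"
        // The same bytes with chunk framing start with an ASCII size line, not a gzip header.
        System.out.println(unzipOrNull(chunk(gz)) == null); // prints "true"
    }
}
```

The second call fails because the first bytes of a chunked body are the ASCII chunk size rather than the gzip magic number. That matches the symptom in the log, assuming the plain read really does leave the framing in place.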
The solution is as follows:
1. Edit nutch2.7\src\java\org\apache\nutch\metadata\HttpHeaders.java and add a field:
    public final static String TRANSFER_ENCODING = "Transfer-Encoding";
2. Edit nutch2.7\src\plugin\protocol-http\src\java\org\apache\nutch\protocol\http\HttpResponse.java and replace the readPlainContent(in) call on line 160 with the following code, so that chunked responses are read with readChunkedContent:
    String transferEncoding = getHeader(Response.TRANSFER_ENCODING);
    if (transferEncoding != null && "chunked".equalsIgnoreCase(transferEncoding.trim())) {
        readChunkedContent(in, line);
    } else {
        readPlainContent(in);
    }
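For readers unfamiliar with chunked transfer encoding, the sketch below shows roughly what reading a chunked body involves. It is not Nutch's readChunkedContent; ChunkedBodyDecoder and its methods are hypothetical names. It ignores trailer headers and does not enforce any content limit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates the chunked format only.
class ChunkedBodyDecoder {

    // Read one line of ASCII text, dropping the CR LF terminator.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') break;
            if (c != '\r') sb.append((char) c);
        }
        return sb.toString();
    }

    // Decode "<hex size>[;extension]\r\n<data>\r\n" repeated until a zero-size chunk.
    static byte[] decode(byte[] raw) {
        try {
            InputStream in = new ByteArrayInputStream(raw);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (true) {
                String sizeLine = readLine(in);
                int semi = sizeLine.indexOf(';');
                if (semi >= 0) sizeLine = sizeLine.substring(0, semi); // drop chunk extensions
                int size = Integer.parseInt(sizeLine.trim(), 16);
                if (size == 0) break; // last chunk; trailers (if any) are ignored here
                byte[] data = new byte[size];
                int off = 0;
                while (off < size) {
                    int n = in.read(data, off, size - off);
                    if (n < 0) throw new IOException("truncated chunk");
                    off += n;
                }
                out.write(data, 0, size);
                readLine(in); // consume the CR LF that follows the chunk data
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = "5\r\nhello\r\n6\r\n nutch\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(raw), StandardCharsets.US_ASCII)); // prints "hello nutch"
    }
}
```

Only after the framing has been removed like this does the remaining byte sequence form a valid gzip stream.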
3. The HTTP content length limit (http.content.limit) must not be set to a negative value; use a large integer instead:
    <property>
      <name>http.content.limit</name>
      <value>655360000</value>
    </property>
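To illustrate what a byte limit means for a gzip-encoded body, here is a small sketch that stops decompressing once a cap is reached. It is an assumption-laden illustration rather than Nutch's GZIPUtils: LimitedGunzip and gunzipUpTo are hypothetical names, and rejecting a negative limit is a choice made for this sketch to mirror the advice in step 3.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; not part of Nutch.
class LimitedGunzip {

    // Decompress at most `limit` bytes of output and ignore the rest of the stream.
    static byte[] gunzipUpTo(byte[] compressed, int limit) {
        if (limit < 0) {
            // Sketch-only rule: a negative cap is treated as a configuration error.
            throw new IllegalArgumentException("limit must be non-negative");
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (out.size() < limit) {
                int n = in.read(buf, 0, Math.min(buf.length, limit - out.size()));
                if (n <= 0) break; // end of compressed data
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: gzip a byte array.
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] gz = gzip(new byte[10000]);
        System.out.println(gunzipUpTo(gz, 100).length);       // prints 100
        System.out.println(gunzipUpTo(gz, 655360000).length); // prints 10000
    }
}
```

With a cap as large as 655360000, ordinary pages are never truncated, which is presumably the intent of the value chosen above.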
4. Because both core code and plug-in code have been modified, Nutch must be rebuilt. Run the default target (runtime) of nutch2.7\build.xml:
    cd nutch2.7
    ant
That covers how to deal with the "unzipBestEffort returned null" error when running Nutch.
© 2024 shulou.com SLNews company. All rights reserved.