
How to fix garbled gzip-compressed content (such as CSS) in PHP

Updated 2025-03-01. From: SLTechnology News&Howtos, Development.


This article explains how to fix garbled content when PHP fetches a gzip-compressed resource, such as a CSS file or a web page.

There are three ways to fix garbled gzip content in PHP: 1. use PHP's built-in zlib support; 2. use cURL instead of file_get_contents(); 3. decompress the data with a gzip decoding function.

Test environment for this article: Windows 7, PHP 7.1, Dell G3 computer.

Three ways to handle gzip-compressed pages fetched with PHP's file_get_contents()

Fetching a web page with file_get_contents() can return garbled text. There are two common causes: a character-encoding mismatch, or a target page that is served with gzip compression. This article focuses on the second cause and shows how to decompress gzip content so that it is readable again.

An encoding mismatch is fixed by converting the fetched content, for example $content = iconv("GBK", "UTF-8//IGNORE", $content);. The rest of this article covers pages served with gzip. How can you tell whether a page is compressed? A Content-Encoding: gzip line in the response headers means the body is gzip-compressed. A browser tool such as Firebug shows whether gzip is enabled for a page. Below is the header information Firebug recorded for my blog, which has gzip enabled.
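The header check described above can also be done in PHP. The following is a minimal sketch and not part of the original article: response_is_gzipped() is a hypothetical helper that scans a list of raw header lines, such as the $http_response_header array that PHP fills in after an HTTP file_get_contents() call.

```php
<?php
/**
 * Hypothetical helper: returns true when a list of raw response header
 * lines contains a Content-Encoding header mentioning gzip.
 */
function response_is_gzipped(array $headerLines): bool
{
    foreach ($headerLines as $line) {
        // Split "Name: value" at the first colon only.
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue; // e.g. the status line "HTTP/1.1 200 OK"
        }
        $name = strtolower(trim($parts[0]));
        $value = strtolower(trim($parts[1]));
        if ($name === 'content-encoding' && strpos($value, 'gzip') !== false) {
            return true;
        }
    }
    return false;
}

// Hand-written sample headers, so the example runs without network access.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=GBK',
    'Content-Encoding: gzip',
];
var_dump(response_is_gzipped($sample)); // bool(true)
```

In a real script the argument would typically be $http_response_header, read immediately after the file_get_contents() call.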

The header information is as follows:

Request headers (raw):

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3
Connection: keep-alive
Cookie: __utma=225240837.787252530.1317310581.1335406161.1335411401.1537; __utmz=225240837.1326850415.887.3.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=%E4%BB%BB%E4%BD%95%E9%A1%B9%E7%9B%AE%E9%83%BD%E4%B8%8D%E4%BC%9A%E9%82%A3%E4%B9%88%E7%AE%80%E5%8D%95%20site%3Awww.nowamagic.net; PHPSESSID=888mj4425p8s0m7s0frre3ovc7; __utmc=225240837; __utmb=225240837.1.10.1335411401
Host: www.nowamagic.net
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20100101 Firefox/12.0
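The request headers above include Accept-Encoding: gzip, deflate, which is how the browser tells the server that it accepts compressed responses. A PHP script can advertise the same thing through a stream context. The sketch below is an illustration only: the User-Agent value is a made-up placeholder, and $url is assumed to be defined by the caller.

```php
<?php
// Build a stream context whose request advertises gzip support, similar to
// the Accept-Encoding header in the browser request shown above.
// The User-Agent string is an arbitrary placeholder.
$context = stream_context_create([
    'http' => [
        'method'  => 'GET',
        'header'  => "Accept-Encoding: gzip\r\n" .
                     "User-Agent: example-fetcher/1.0\r\n",
        'timeout' => 10,
    ],
]);

// Usage (needs network access, so it is left commented out):
// $raw = file_get_contents($url, false, $context);
// $raw may now be gzip-compressed and must be decoded before use.
```

Once this header is sent, the script has to be ready to decompress the body, which is what the solutions in this article do.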

Here are some solutions:

1. Use the built-in zlib library

If PHP's zlib extension is enabled on the server, the compress.zlib:// stream wrapper solves the problem in a single line.

The code is as follows:

$data = file_get_contents("compress.zlib://" . $url);
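To see what the wrapper does without touching the network, the same call can be tried on a local file. This is a small sketch under the assumption that the zlib extension is loaded. The temporary file stands in for a gzip-compressed response and the CSS string is invented for the example.

```php
<?php
// Write a gzip-compressed file to a temporary location (a stand-in for a
// gzip-compressed HTTP response).
$original = "body { color: #333; }\n";
$path = tempnam(sys_get_temp_dir(), 'gz');
file_put_contents($path, gzencode($original));

// Reading the raw bytes returns the compressed, unreadable data ...
$raw = file_get_contents($path);

// ... while the compress.zlib:// wrapper decompresses on the fly.
$decoded = file_get_contents('compress.zlib://' . $path);

var_dump($raw === $original);     // bool(false)
var_dump($decoded === $original); // bool(true)

unlink($path);
```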

2. Use cURL instead of file_get_contents

The code is as follows:

function curl_get($url, $gzip = false) {
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
    if ($gzip) {
        // This is the key line: cURL requests gzip and decodes the response itself.
        curl_setopt($curl, CURLOPT_ENCODING, "gzip");
    }
    $content = curl_exec($curl);
    curl_close($curl);
    return $content;
}
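A possible variant is sketched below. It is not from the original article: curl_get_decoded() is a hypothetical name, and it assumes the cURL extension is installed. Passing an empty string to CURLOPT_ENCODING asks cURL to advertise every encoding it supports, so the caller does not need a separate $gzip switch.

```php
<?php
/**
 * Hypothetical variant of curl_get(): always accepts compressed responses
 * and returns null on failure instead of false.
 */
function curl_get_decoded(string $url, int $timeout = 10): ?string
{
    $curl = curl_init($url);
    if ($curl === false) {
        return null;
    }
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, $timeout);
    // An empty string tells cURL to send an Accept-Encoding header listing
    // all encodings it supports and to decode the response body itself.
    curl_setopt($curl, CURLOPT_ENCODING, '');
    $content = curl_exec($curl);
    // The handle is released when $curl goes out of scope.
    return $content === false ? null : $content;
}

// Usage (needs network access, so it is left commented out):
// $html = curl_get_decoded('https://www.jb51.net/');
```

Returning null on failure makes it easier to tell a failed request apart from an empty page.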

3. Use the gzip decompression function

PHP 5.4 and later already include gzdecode() as a built-in function of the zlib extension, and declaring a function with the same name again causes a fatal error. The hand-written version is therefore wrapped in a function_exists() check and only serves as a fallback for older PHP versions. The code is as follows:

if (!function_exists('gzdecode')) {
    function gzdecode($data) {
        $len = strlen($data);
        if ($len < 18 || strcmp(substr($data, 0, 2), "\x1f\x8b")) {
            return null; // Not GZIP format (see RFC 1952)
        }
        $method = ord(substr($data, 2, 1)); // Compression method
        $flags = ord(substr($data, 3, 1)); // Flags
        if (($flags & 31) != $flags) {
            // Reserved bits are set -- NOT ALLOWED by RFC 1952
            return null;
        }
        // NOTE: $mtime may be negative (PHP integer limitations)
        $mtime = unpack("V", substr($data, 4, 4));
        $mtime = $mtime[1];
        $xfl = substr($data, 8, 1); // Extra flags
        $os = substr($data, 9, 1); // Operating system
        $headerlen = 10;
        $extralen = 0;
        $extra = "";
        if ($flags & 4) {
            // 2-byte length prefixed EXTRA data in header
            if ($len - $headerlen - 2 < 8) {
                return false; // Invalid format
            }
            $extralen = unpack("v", substr($data, $headerlen, 2));
            $extralen = $extralen[1];
            if ($len - $headerlen - 2 - $extralen < 8) {
                return false; // Invalid format
            }
            $extra = substr($data, $headerlen + 2, $extralen);
            $headerlen += 2 + $extralen;
        }
        $filenamelen = 0;
        $filename = "";
        if ($flags & 8) {
            // C-style string file NAME data in header
            if ($len - $headerlen - 1 < 8) {
                return false; // Invalid format
            }
            $filenamelen = strpos(substr($data, $headerlen), chr(0));
            if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
                return false; // Invalid format
            }
            $filename = substr($data, $headerlen, $filenamelen);
            $headerlen += $filenamelen + 1;
        }
        $commentlen = 0;
        $comment = "";
        if ($flags & 16) {
            // C-style string COMMENT data in header
            if ($len - $headerlen - 1 < 8) {
                return false; // Invalid format
            }
            $commentlen = strpos(substr($data, $headerlen), chr(0));
            if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
                return false; // Invalid header format
            }
            $comment = substr($data, $headerlen, $commentlen);
            $headerlen += $commentlen + 1;
        }
        $headercrc = "";
        if ($flags & 2) {
            // FHCRC flag (bit 1): 2 bytes (lowest order) of CRC32 on header present
            if ($len - $headerlen - 2 < 8) {
                return false; // Invalid format
            }
            $calccrc = crc32(substr($data, 0, $headerlen)) & 0xffff;
            $headercrc = unpack("v", substr($data, $headerlen, 2));
            $headercrc = $headercrc[1];
            if ($headercrc != $calccrc) {
                return false; // Bad header CRC
            }
            $headerlen += 2;
        }
        // GZIP FOOTER - these may be negative due to PHP's limitations
        $datacrc = unpack("V", substr($data, -8, 4));
        $datacrc = $datacrc[1];
        $isize = unpack("V", substr($data, -4));
        $isize = $isize[1];
        // Perform the decompression:
        $bodylen = $len - $headerlen - 8;
        if ($bodylen < 1) {
            // This should never happen - IMPLEMENTATION BUG!
            return null;
        }
        $body = substr($data, $headerlen, $bodylen);
        $data = "";
        if ($bodylen > 0) {
            switch ($method) {
                case 8:
                    // Currently the only supported compression method:
                    $data = gzinflate($body);
                    break;
                default:
                    // Unknown compression method
                    return false;
            }
        } else {
            // I'm not sure if zero-byte body content is allowed.
            // Allow it for now... Do nothing...
        }
        // Verify decompressed size and CRC32:
        // NOTE: This may fail with large data sizes depending on how
        // PHP's integer limitations affect strlen() since $isize
        // may be negative for large sizes.
        if ($isize != strlen($data) || crc32($data) != $datacrc) {
            // Bad format! Length or CRC doesn't match!
            return false;
        }
        return $data;
    }
}

Usage:

The code is as follows:

$html = file_get_contents('https://www.jb51.net/');
$html = gzdecode($html);
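Calling gzdecode() on a body that is not actually compressed fails, so it can help to check for the gzip magic bytes 0x1f 0x8b first. The following sketch is an illustration rather than the article's own code: normalize_body() is a hypothetical helper, it relies on the built-in gzdecode() of PHP 5.4 and later, and the optional charset step reuses the iconv() conversion mentioned earlier.

```php
<?php
/**
 * Hypothetical helper: gunzip $body only when it starts with the gzip
 * magic bytes 0x1f 0x8b, then optionally convert it to UTF-8.
 */
function normalize_body(string $body, ?string $fromCharset = null): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body); // built in since PHP 5.4 (zlib extension)
        if ($decoded !== false) {
            $body = $decoded;
        }
    }
    if ($fromCharset !== null) {
        // Same conversion as the iconv() call shown earlier in the article.
        $converted = iconv($fromCharset, 'UTF-8//IGNORE', $body);
        if ($converted !== false) {
            $body = $converted;
        }
    }
    return $body;
}

// Simulated responses, so the example runs without network access.
$plain = '<html>hello</html>';
$gzipped = gzencode($plain);

var_dump(normalize_body($gzipped) === $plain); // bool(true)
var_dump(normalize_body($plain) === $plain);   // bool(true)
```

For a page served as GBK, the call would look like normalize_body($html, 'GBK').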

These three methods should resolve most garbled-content problems caused by gzip when fetching pages.

