How to automatically convert character sets and support array conversion in PHP

2025-01-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

This article shows how to convert character sets automatically in PHP, including support for converting arrays. It is fairly detailed and should be useful as a reference.

The code is as follows:

// Automatically convert the character set; arrays are converted recursively
function auto_charset($fContents, $from = 'gbk', $to = 'utf-8') {
    // Normalize the alias "UTF8" to "utf-8"
    $from = strtoupper($from) == 'UTF8' ? 'utf-8' : $from;
    $to   = strtoupper($to) == 'UTF8' ? 'utf-8' : $to;
    if (strtoupper($from) === strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
        // Do not convert if the encodings are the same, the value is empty, or it is a non-string scalar
        return $fContents;
    }
    if (is_string($fContents)) {
        if (function_exists('mb_convert_encoding')) {
            return mb_convert_encoding($fContents, $to, $from);
        } elseif (function_exists('iconv')) {
            return iconv($from, $to, $fContents);
        } else {
            // Neither extension is available: return the input unchanged
            return $fContents;
        }
    } elseif (is_array($fContents)) {
        foreach ($fContents as $key => $val) {
            // Convert the key as well as the value
            $_key = auto_charset($key, $from, $to);
            $fContents[$_key] = auto_charset($val, $from, $to);
            // Drop the old entry if the key changed during conversion
            if ($key != $_key)
                unset($fContents[$key]);
        }
        return $fContents;
    } else {
        return $fContents;
    }
}
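The recursion over keys and values can also be written so that it builds a new array instead of editing the one being iterated. The following is only a sketch under assumptions: the helper name convert_charset_recursive is hypothetical (it is not part of the article's code), it assumes the mbstring extension is loaded, and it assumes mbstring accepts "GBK" as an encoding name.

```php
<?php
// Hypothetical helper, not from the article. Assumes mbstring is loaded.
// Converts strings, array keys and array values from $from to $to,
// building a new array rather than modifying the input during iteration.
function convert_charset_recursive($data, string $from = 'GBK', string $to = 'UTF-8')
{
    if (is_string($data)) {
        return mb_convert_encoding($data, $to, $from);
    }
    if (is_array($data)) {
        $result = [];
        foreach ($data as $key => $value) {
            // Integer keys are kept as they are; string keys are converted.
            $newKey = is_string($key) ? mb_convert_encoding($key, $to, $from) : $key;
            $result[$newKey] = convert_charset_recursive($value, $from, $to);
        }
        return $result;
    }
    // Integers, floats, booleans, null and objects are returned unchanged.
    return $data;
}

// "\xD6\xD0" and "\xCE\xC4" are the GBK bytes of the characters 中 and 文.
$gbk  = ["\xD6\xD0" => ["\xCE\xC4", 42]];
$utf8 = convert_charset_recursive($gbk);

// Print the converted key as hex so the result does not depend on the terminal encoding.
echo bin2hex(array_key_first($utf8)), "\n";
```

Building a fresh array avoids the need to unset old keys, at the cost of holding two copies of the data for a moment.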

When a server accepts data submitted by unknown clients, the clients do not all use the same encoding, yet the server ultimately has to handle the data in a single encoding. The received characters therefore have to be converted into one specific encoding.

You might think of transcoding directly with iconv, but iconv needs two encoding parameters, the input encoding and the output encoding, and here we do not know the encoding of the received string. What we need is a way to find out the encoding of the received characters.
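For reference, this is what the call looks like when the input encoding is known. It is a minimal sketch that assumes the iconv extension is loaded and that the underlying iconv library knows the name "GBK"; the byte values are the GBK and UTF-8 encodings of the two characters 中文.

```php
<?php
// Assumes the iconv extension is available.
// "\xD6\xD0\xCE\xC4" is the text 中文 encoded as GBK (two bytes per character).
$gbkBytes = "\xD6\xD0\xCE\xC4";

// iconv(input encoding, output encoding, string): both encodings must be stated.
$utf8 = iconv('GBK', 'UTF-8', $gbkBytes);

// In UTF-8 the same two characters take three bytes each.
echo bin2hex($utf8), "\n"; // e4b8ade69687
```

The call only gives the right result because the first argument matches the real encoding of the bytes, which is exactly the piece of information that is missing for data from unknown clients.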

There are generally two solutions to this problem.

Option 1

Require the client to state the encoding when it submits data, by sending one extra variable that names the encoding.

$string = $_GET['charset'] == 'gbk' ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
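A slightly more defensive version of this idea might look like the sketch below. The helper name read_declared_string and the whitelist of encodings are assumptions made for illustration; "charset" and "str" are the parameter names from the one-liner above, and $params stands in for $_GET so the function can be tried outside a web request. It assumes the iconv extension is loaded.

```php
<?php
// Hypothetical helper, not from the article. Assumes the iconv extension.
// $params plays the role of $_GET.
function read_declared_string(array $params, string $target = 'UTF-8'): string
{
    // Assumed whitelist: only convert from encodings we expect clients to use.
    $allowed = ['UTF-8', 'GBK', 'BIG5'];

    $value    = (string)($params['str'] ?? '');
    $declared = strtoupper((string)($params['charset'] ?? $target));
    if ($declared === 'UTF8') {
        $declared = 'UTF-8'; // normalize the common alias
    }

    // Nothing to do if the client already uses the target encoding,
    // or if it declared an encoding that is not on the whitelist.
    if ($declared === strtoupper($target) || !in_array($declared, $allowed, true)) {
        return $value;
    }

    // iconv() returns false on failure; fall back to the raw value in that case.
    $converted = @iconv($declared, $target, $value);
    return $converted === false ? $value : $converted;
}

// Example: the client says it is sending GBK.
echo bin2hex(read_declared_string(['charset' => 'gbk', 'str' => "\xD6\xD0"])), "\n";
```

Checking the declared name against a fixed list keeps arbitrary client input from being passed to iconv as an encoding name.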

However, if there is no such agreement, or if we cannot control the client, this approach does not work well.

Option 2

Let the server detect the encoding of the received data directly.

This is the ideal approach, but the question is how to detect the encoding of a string. In PHP, mb_check_encoding from the mbstring extension provides what we need: it reports whether a string is valid in a given encoding.

$str = mb_check_encoding($_GET['str'], 'gbk') ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
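One caveat is that some UTF-8 byte sequences are also valid GBK (for example, the two UTF-8 bytes of "é", C3 A9, form a valid GBK character), so testing for GBK first can misclassify UTF-8 input. A variant that tests for UTF-8 first could be sketched as follows. The helper name to_utf8_guess is hypothetical, the choice of GBK as the only fallback is an assumption, and the code assumes mbstring is loaded and accepts "GBK" as an encoding name.

```php
<?php
// Hypothetical helper, not from the article. Assumes mbstring is loaded.
function to_utf8_guess(string $input): string
{
    // Valid UTF-8 is returned untouched. UTF-8 validity is a strict test,
    // so it is checked before the looser GBK test.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    // Assumption: anything that is not UTF-8 but is valid GBK is treated as GBK.
    if (mb_check_encoding($input, 'GBK')) {
        return mb_convert_encoding($input, 'UTF-8', 'GBK');
    }
    // Unknown encoding: leave the bytes alone rather than guess further.
    return $input;
}

// GBK input is converted, UTF-8 input passes through.
echo bin2hex(to_utf8_guess("\xD6\xD0")), "\n";
echo bin2hex(to_utf8_guess("\xE4\xB8\xAD")), "\n";
```

This is still a heuristic: byte-level validity can tell "not UTF-8" reliably, but it cannot prove that non-UTF-8 input really is GBK.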

This requires the mbstring extension to be enabled, and sometimes it is not enabled on a production server. In that case, the encoding can be judged with the help of the following functions.

I did not write the following functions.

The code is as follows:

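function isGb2312($string) {
    $len = strlen($string);
    for ($i = 0; $i < $len; $i++) {
        $v = ord($string[$i]);
        if ($v > 127) {
            // 228-233 (0xE4-0xE9) are the UTF-8 lead bytes of the CJK ideographs U+4E00-U+9FFF
            if (($v >= 228) && ($v <= 233)) {
                // Fewer than three bytes remain, so this cannot be a 3-byte UTF-8 sequence
                if (($i + 2) >= $len) return true;
                $v1 = ord($string[$i + 1]);
                $v2 = ord($string[$i + 2]);
                // Two continuation bytes (0x80-0xBF) after the lead byte indicate UTF-8
                if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
                    return false; // UTF-8
                else
                    return true;  // treated as GB2312
            }
        }
    }
    return true;
}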

function isUtf8($string) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
        |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
    )*$%xs', $string);
}

Either of the functions above can be used to detect the encoding, after which the string is converted to the target encoding.

$str = isGb2312($_GET['str']) ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
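When mbstring is missing but the PCRE and iconv extensions are present, a compact alternative to a hand-written pattern is the /u modifier: preg_match() returns false when the subject is not valid UTF-8. The sketch below uses that technique instead of the functions above. The helper name to_utf8_without_mbstring is hypothetical, and treating every non-UTF-8 input as GBK is an assumption carried over from the article's examples.

```php
<?php
// Hypothetical helper, not from the article. Assumes PCRE and iconv are available.
function to_utf8_without_mbstring(string $input): string
{
    // With the /u modifier the subject must be valid UTF-8; an empty pattern
    // then matches (returns 1), while invalid UTF-8 makes preg_match() return false.
    if (preg_match('//u', $input) === 1) {
        return $input;
    }
    // Assumption: input that is not valid UTF-8 was sent as GBK.
    // "@" suppresses the notice iconv() raises for bytes it cannot convert.
    $converted = @iconv('GBK', 'UTF-8', $input);
    return $converted === false ? $input : $converted;
}

// GBK bytes for 中文 are converted; valid UTF-8 passes through unchanged.
echo bin2hex(to_utf8_without_mbstring("\xD6\xD0\xCE\xC4")), "\n";
```

As with the other approaches, this only distinguishes UTF-8 from "something else"; the choice of GBK for the fallback has to come from knowledge of the expected clients.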
