How to batch-convert UTF-8 files to GB2312 and handle the UTF-8 BOM on Linux

Updated 2025-01-18. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01

This article shows how to batch-convert UTF-8 files to GB2312 on Linux and how to handle the UTF-8 BOM (byte order mark) along the way. The method is a short shell script built on dos2unix, sed, grep, and iconv, followed by notes on how it works.

Content:

The code is as follows:

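#!/bin/bash

# Scratch file that holds the escaped dump of each file's first line
file_check_utf8='file_check_utf8.log'

# Convert every *.sql file under the current directory from UTF-8 to GB2312.
# Reading the find output line by line keeps paths with spaces intact.
find . -type f -name "*.sql" -print | while IFS= read -r loop
do
    echo "$loop"
    mv -f "$loop" "$loop.tmp"
    dos2unix "$loop.tmp"

    # Print line 1 with non-printable bytes shown as octal escapes,
    # so a BOM appears as the literal text \357\273\277
    LC_ALL=C sed -n '1l' "$loop.tmp" > "$file_check_utf8"
    if grep '^\\357\\273\\277' "$file_check_utf8" >/dev/null 2>&1
    then
        echo 'UTF-8 BOM'
        # LC_ALL=C makes sed work on bytes, so '...' removes exactly
        # the three BOM bytes and not three multibyte characters
        LC_ALL=C sed -n -e '1s/^...//' -e 'w intermediate.txt' "$loop.tmp"
        iconv -f UTF-8 -t GB2312 -o "$loop" intermediate.txt
        rm -rf intermediate.txt
        rm -rf "$loop.tmp"
    elif iconv -f UTF-8 -t GB2312 "$loop.tmp" >/dev/null 2>&1
    then
        echo 'UTF-8'
        iconv -f UTF-8 -t GB2312 -o "$loop" "$loop.tmp"
        rm -rf "$loop.tmp"
    else
        echo 'ANSI'
        mv -f "$loop.tmp" "$loop"
    fi
    rm -rf "$file_check_utf8"

    # Simulate unix2dos; the last line of the text file must end with a newline
    sed 's/$/\r/' "$loop" > "$loop.tmp"
    mv -f "$loop.tmp" "$loop"
done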

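Because the script rewrites files in place, it can be useful to preview which branch each file would take before running it. The following is a minimal sketch and not part of the original script: classify_encoding is a hypothetical helper name, and it checks for the BOM with head and od rather than sed and grep, then reuses the same iconv trial conversion. It assumes head -c and an iconv build that knows GB2312 are available.

```shell
#!/bin/bash

# Hypothetical helper (not in the original script): print which branch
# the conversion script would take for one file, without modifying it.
classify_encoding() {
    local file=$1 first3
    # Read the first three bytes as hex, e.g. "efbbbf" for a UTF-8 BOM
    first3=$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$first3" = "efbbbf" ]; then
        echo 'UTF-8 BOM'
    elif iconv -f UTF-8 -t GB2312 "$file" >/dev/null 2>&1; then
        # Trial conversion succeeded, so the content is valid UTF-8
        # (plain ASCII files also land here)
        echo 'UTF-8'
    else
        # Trial conversion failed; treated as already GB2312
        echo 'ANSI'
    fi
}
```

One possible use is to loop over the same find output as the script and print each path next to the result of classify_encoding, which gives a dry-run report without touching any file.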

Explanation

1. UTF-8 BOM handling: I did not find a cleaner way, so the script detects the BOM with sed and grep. The sed command prints the first line with non-printable bytes written as octal escapes; if that line starts with \357\273\277 (the bytes EF BB BF), the file must be UTF-8 with a BOM. sed then removes those three bytes before the conversion.

2. To avoid converting a file twice or missing one, the script uses iconv to attempt a trial conversion of every file that has no BOM. If the conversion succeeds, the file is treated as UTF-8; otherwise it is treated as ANSI, that is, already GB2312, and is left unchanged.

3. The final sed command is there because my system does not have the unix2dos command, so I simulated it. The CRLF line endings make the files easier to view and edit under Windows.
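The BOM removal in note 1 and the unix2dos simulation in note 3 can also be done at the byte level, which avoids depending on how sed interprets characters in the current locale. The sketch below is an alternative under stated assumptions, not the author's method: strip_bom and to_crlf are hypothetical helper names, and the code assumes head -c, tail -c, and a POSIX awk are available.

```shell
#!/bin/bash

# Hypothetical helpers, not part of the original script.

# Copy src to dst, dropping a leading UTF-8 BOM (EF BB BF) if present.
strip_bom() {
    local src=$1 dst=$2
    if [ "$(head -c 3 "$src" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
        tail -c +4 "$src" > "$dst"   # start output at byte 4
    else
        cat "$src" > "$dst"
    fi
}

# Copy src to dst with every line ending written as CRLF.
to_crlf() {
    local src=$1 dst=$2
    # Remove an existing CR first so the result never contains CR CR LF;
    # awk also terminates a final line that lacks a newline.
    LC_ALL=C awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$src" > "$dst"
}
```

Both helpers write to a separate destination file, so the source stays untouched until the caller decides to replace it with mv.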

That concludes this walkthrough of batch-converting UTF-8 files to GB2312 on Linux and handling the UTF-8 BOM. Theory is best paired with practice, so go and try it.
