Shulou (Shulou.com), 06/01 report. Updated 2025-01-18. From: SLTechnology News&Howtos
This article shows how to batch-convert files from UTF-8 to GB2312 on a Linux system and how to handle the UTF-8 BOM (byte order mark). Many people run into this problem in day-to-day work, so the method below was put together from various materials and kept simple and easy to apply.
Content:
The code is as follows:
#!/bin/bash
# Convert every *.sql file under the current directory from UTF-8 to GB2312.
# Note: file names containing whitespace are not handled by this loop.
for loop in $(find . -type f -name "*.sql" -print)
do
    echo "$loop"
    mv -f "$loop" "$loop.tmp"
    dos2unix "$loop.tmp"
    file_check_utf8='file_check_utf8.log'
    # 'l' prints line 1 with non-printable bytes as octal escapes;
    # LC_ALL=C makes sed work on bytes rather than multibyte characters
    LC_ALL=C sed -n '1l' "$loop.tmp" > "$file_check_utf8"
    if grep '^\\357\\273\\277' "$file_check_utf8" >/dev/null 2>&1
    then
        echo 'UTF-8 BOM'
        # Drop the three BOM bytes from line 1, then convert
        LC_ALL=C sed -n -e '1s/^...//' -e 'w intermediate.txt' "$loop.tmp"
        iconv -f UTF-8 -t GB2312 -o "$loop" intermediate.txt
        rm -rf intermediate.txt
        rm -rf "$loop.tmp"
    elif iconv -f UTF-8 -t GB2312 "$loop.tmp" >/dev/null 2>&1
    then
        echo 'UTF-8'
        iconv -f UTF-8 -t GB2312 -o "$loop" "$loop.tmp"
        rm -rf "$loop.tmp"
    else
        echo 'ANSI'
        mv -f "$loop.tmp" "$loop"
    fi
    rm -rf "$file_check_utf8"
    # Simulate unix2dos; requires that the last line of the text file end with a newline
    LC_ALL=C sed -n -e 's/$/\r/g' -e "w $loop.tmp" "$loop"
    mv -f "$loop.tmp" "$loop"
done
Explanation
1. UTF-8 BOM handling: I did not find a better way, so the script uses sed plus grep to check the start of the file. If the first three bytes are \357\273\277 (octal; EF BB BF in hex), the file must be UTF-8 with a BOM, so sed removes those three bytes before the conversion.
2. To avoid converting a file twice or missing one, the script uses iconv to attempt a trial conversion of any file without a BOM. If the conversion succeeds, the file is treated as UTF-8; otherwise it is treated as ANSI, which here means GB2312, and is left unchanged.
3. The final sed command is there because my system does not have the unix2dos command, so I simulated it to make the files easier to view and edit under Windows.
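The BOM check and the three-way classification described in the notes above can also be done directly on bytes, without going through sed's octal listing. The following is only a sketch under stated assumptions: the function names has_utf8_bom, strip_utf8_bom, and classify_file are hypothetical helpers that do not appear in the original script, and it assumes head, od, tail, and iconv are available.

```shell
#!/bin/bash
# Sketch only: has_utf8_bom, strip_utf8_bom and classify_file are
# hypothetical helper names, not part of the original script.

# Succeed if the first three bytes of the file are EF BB BF (the UTF-8 BOM).
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Print the file's contents starting at byte 4, i.e. without a 3-byte BOM.
strip_utf8_bom() {
    tail -c +4 "$1"
}

# Print 'UTF-8 BOM', 'UTF-8' or 'ANSI', mirroring the script's three branches.
classify_file() {
    if has_utf8_bom "$1"; then
        echo 'UTF-8 BOM'
    elif iconv -f UTF-8 -t GB2312 "$1" >/dev/null 2>&1; then
        echo 'UTF-8'
    else
        echo 'ANSI'
    fi
}
```

For example, after `printf '\357\273\277select 1;\n' > bom_demo.sql`, calling `classify_file bom_demo.sql` prints `UTF-8 BOM`, and `strip_utf8_bom bom_demo.sql | iconv -f UTF-8 -t GB2312` would produce the converted text. Working on bytes avoids depending on how sed treats multibyte characters in the current locale.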
That concludes this look at batch-converting UTF-8 to GB2312 on a Linux system and handling the UTF-8 BOM. Theory is best paired with practice, so try the script on your own files.