2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
iconv, enconv, enca, convmv, unix2dos, dos2unix file format conversion; od/cut/wc/dd/diff/uniq/nice/du and other commands
1. Viewing file encodings in Vim
In Vim, the following command displays the encoding of the current file:
:set fileencoding
If you only want to view files in other encodings, or files show up garbled in Vim, add the following to ~/.vimrc:
set encoding=utf-8 fileencodings=ucs-bom,utf-8,cp936
With this setting Vim detects the file encoding automatically (both UTF-8 and GBK files are recognized). It does so by trying each entry of fileencodings in order and using the first one that decodes the file without errors. If no entry fits, Vim falls back to the value of encoding; many configurations append latin1 to the end of the list as a catch-all.
To convert line endings to Unix format, run :set ff=unix and then save the file.
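The "try each encoding in order" idea can be sketched outside Vim as well. The following is a minimal illustration, not Vim's actual implementation: it assumes an iconv command is available and uses the iconv name GBK in place of Vim's cp936. The function name detect_encoding and the sample files are hypothetical.

```shell
#!/bin/sh
# Sketch: try candidate encodings in order, similar in spirit to Vim's
# fileencodings list. iconv exits non-zero when the input is not valid
# in the assumed source encoding, so the first success wins.
detect_encoding() {
    # $1 = file to inspect; remaining arguments = candidate encodings
    file=$1
    shift
    for enc in "$@"; do
        if iconv -f "$enc" -t UTF-8 "$file" >/dev/null 2>&1; then
            printf '%s\n' "$enc"
            return 0
        fi
    done
    printf 'unknown\n'
    return 1
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# "中文" encoded as GBK (bytes D6 D0 CE C4), written with octal escapes.
printf '\326\320\316\304\n' > "$tmp/gbk.txt"
# The same text encoded as UTF-8 (bytes E4 B8 AD E6 96 87).
printf '\344\270\255\346\226\207\n' > "$tmp/utf8.txt"

gbk_guess=$(detect_encoding "$tmp/gbk.txt" UTF-8 GBK)
utf8_guess=$(detect_encoding "$tmp/utf8.txt" UTF-8 GBK)
echo "gbk.txt:  $gbk_guess"
echo "utf8.txt: $utf8_guess"
```

The order of the candidates matters: stricter encodings such as UTF-8 should come first, because a permissive encoding placed early would accept almost any input.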
2. Converting a file's encoding in Vim
Vim can convert a file's encoding directly. For example, to convert the current file to UTF-8, run the following command and then save the file with :w:
:set fileencoding=utf-8
3. Converting file content with iconv
iconv converts a file from one encoding to another. For example, to convert a GBK-encoded file to UTF-8:
iconv -f GBK -t UTF-8 file1 -o file2
iconv -f GB2312 -t UTF-8 test.txt -o test2.txt
Download address:
ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
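To see what the conversion does at the byte level, here is a small sketch that builds a GBK file by hand and converts it. The temporary file names are hypothetical, and shell redirection is used instead of -o only so that the sketch also runs with iconv builds that lack that option.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# "中文" encoded as GBK: bytes D6 D0 CE C4 (octal escapes for portability).
printf '\326\320\316\304' > "$tmp/file1"

# Convert GBK -> UTF-8.
iconv -f GBK -t UTF-8 "$tmp/file1" > "$tmp/file2"

# Dump both files as hex to confirm how the bytes changed.
gbk_hex=$(od -An -tx1 "$tmp/file1" | tr -d ' \n')
utf8_hex=$(od -An -tx1 "$tmp/file2" | tr -d ' \n')
echo "GBK bytes:   $gbk_hex"    # d6d0cec4
echo "UTF-8 bytes: $utf8_hex"   # e4b8ade69687
```

Each Chinese character takes two bytes in GBK and three bytes in UTF-8 here, which is why the converted file is larger.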
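4. Converting file encodings with enconv
For example, to convert a GBK-encoded file to UTF-8:
enconv -L zh_CN -x UTF-8 filename
The source also lists the form below. Note that -L expects a language or locale name rather than an encoding name, so the zh_CN form above is the safer choice:
enconv -L GB2312 -x UTF-8 test.txt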
5. Viewing a file's encoding with enca
If enca is not installed on your system, you can install it with sudo yum install -y enca.
$ enca filename
filename: Universal transformation format 8 bits; UTF-8
  CRLF line terminators
Note that enca does not recognize some GBK-encoded files well. In that case it reports:
Unrecognized encoding
6. Converting file name encodings with convmv
When files are copied from Linux to Windows or from Windows to Linux, Chinese file names sometimes come out garbled. The cause is that Windows encodes Chinese file names in GBK by default, while Linux uses UTF-8 by default. Because the two encodings differ, the names are garbled, and the fix is to transcode the file names.
Linux provides a dedicated tool, convmv, for converting file name encodings. It can convert names from GBK to UTF-8 or from UTF-8 to GBK.
yum -y install convmv
The general usage of convmv is:
convmv -f <source encoding> -t <new encoding> [options] <file name>
Common options:
-r  process subdirectories recursively
--notest  actually perform the rename; by default convmv only does a dry run and does not touch any files
--list  list all supported encodings
--unescape  undo escaping, for example turning %20 into a space
For example, to convert a UTF-8-encoded file name to GBK:
convmv -f UTF-8 -t GBK --notest <utf8-encoded file name>
After the conversion the file name is GBK-encoded. Only the encoding of the name changes; the file content is untouched.
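The dry-run idea described above can be imitated with iconv when convmv is not at hand. This is only a sketch of the concept using a different tool, not convmv itself: the helper to_gbk_name and the sample name are hypothetical, and nothing is renamed.

```shell
#!/bin/sh
# Sketch of a dry run: compute the GBK-encoded form of a UTF-8 file name
# with iconv and print the rename that would happen, without doing it.
to_gbk_name() {
    printf '%s' "$1" | iconv -f UTF-8 -t GBK
}

name='中文.txt'

# Show both names as hex, since GBK bytes are not printable on a UTF-8 terminal.
old_hex=$(printf '%s' "$name" | od -An -tx1 | tr -d ' \n')
new_hex=$(to_gbk_name "$name" | od -An -tx1 | tr -d ' \n')
echo "would rename: $old_hex -> $new_hex"
```

Only the bytes of the two Chinese characters change; the ASCII suffix .txt is identical in both encodings.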
7. Converting line endings with unix2dos and dos2unix
Use od -c -t x1 abc.txt to view special characters in a text file. DOS/Windows uses \r\n as the end-of-line sequence, while Unix uses \n.
unix2dos < unix.txt > dos.txt
converts a plain text file in Unix format into DOS/Windows format.
dos2unix < dos.txt > unix.txt
converts a plain text file in DOS/Windows format into Unix format.
OpenOffice handles both formats transparently. If ^M characters show up in vi, you can strip them with tr or sed.
A line break on Windows consists of two characters, carriage return (\r) and line feed (\n), while on Linux it is a single line feed (\n). As a result, text that breaks normally on Linux may show up as one unbroken line on Windows.
The unix2dos and dos2unix commands convert between the two formats.
Options:
-k  keep the date and time stamp of the output file the same as the input file
-o file  default mode (old-file mode): convert file and write the result back to file
-n infile outfile  new-file mode: convert infile and write the result to outfile
Example 1: unix2dos
Suppose a new text file a.txt is created with vi and contains the text 123456.
[root@centos test]# ls -l a.txt
-rw-r--r-- 1 root root 7 Jan 7 21:31 a.txt
[root@centos test]# hexdump -c a.txt
0000000   1   2   3   4   5   6  \n
0000007
[root@centos test]# unix2dos -n a.txt b.txt
unix2dos: converting file a.txt to file b.txt in DOS format.
[root@centos test]# ls -l
total 8
-rw-r--r-- 1 root root 7 Jan 7 21:31 a.txt
-rw------- 1 root root 8 Jan 7 21:34 b.txt
[root@centos test]# hexdump -c a.txt
0000000   1   2   3   4   5   6  \n
0000007
[root@centos test]# hexdump -c b.txt
0000000   1   2   3   4   5   6  \r  \n
0000008
b.txt is the converted DOS-format file.
Example 2: dos2unix
[root@centos test]# dos2unix -n b.txt c.txt
dos2unix: converting file b.txt to file c.txt in UNIX format.
[root@centos test]# ls -l
total 12
-rw-r--r-- 1 root root 7 Jan 7 21:31 a.txt
-rw------- 1 root root 8 Jan 7 21:34 b.txt
-rw------- 1 root root 7 Jan 7 21:38 c.txt
[root@centos test]# hexdump -c b.txt
0000000   1   2   3   4   5   6  \r  \n
0000008
[root@centos test]# hexdump -c c.txt
0000000   1   2   3   4   5   6  \n
0000007
c.txt is the converted Unix-format text file.
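When unix2dos and dos2unix are not installed, the same round trip can be approximated with standard tools, as the earlier remark about tr and sed suggests. The sketch below uses awk and tr as stand-ins; the file names mirror the session above but live in a temporary directory.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# A Unix-format file: "123456" plus LF, 7 bytes.
printf '123456\n' > "$tmp/a.txt"

# Unix -> DOS: re-emit every line terminated by CR LF.
awk '{ printf "%s\r\n", $0 }' "$tmp/a.txt" > "$tmp/b.txt"

# DOS -> Unix: delete every CR byte (this also removes stray CRs
# that are not part of a line ending).
tr -d '\r' < "$tmp/b.txt" > "$tmp/c.txt"

size_a=$(wc -c < "$tmp/a.txt" | tr -d ' ')
size_b=$(wc -c < "$tmp/b.txt" | tr -d ' ')
size_c=$(wc -c < "$tmp/c.txt" | tr -d ' ')
echo "a.txt=$size_a b.txt=$size_b c.txt=$size_c"   # a.txt=7 b.txt=8 c.txt=7
```

The sizes match the 7, 8, and 7 bytes shown by ls -l in the session: the DOS file is one byte longer per line because of the extra carriage return.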
The od command
od is typically used to view the contents of a file in a special format. By choosing different options you can display a file in decimal, octal, hexadecimal, or ASCII form.
Syntax: od [options] file...
Options:
-A  address radix for the offset column:
    d  decimal
    o  octal (default)
    x  hexadecimal
    n  do not print offsets
-t  display format for the data:
    c  ASCII character or backslash escape
    d  signed decimal
    f  floating point
    o  octal (default is o2)
    u  unsigned decimal
    x  hexadecimal
    Except for c, each type can be followed by a decimal number n giving the number of bytes in each displayed value.
Note: od displays octal by default, which is where the command gets its name (Octal Dump). Octal is rarely the most useful view, though; combining ASCII and hexadecimal output usually gives more information.
od and hexdump display the bytes of a file or stream in octal, hexadecimal, or other encodings. They are useful for inspecting characters in a file that cannot be displayed directly on the terminal. The examples below use -t ax1, which prints each byte as a named character (a) with its hexadecimal value (x1) underneath.
-w8 displays only 8 bytes per line:
[tim@L gx]$ od -Ad -tax1 -w8 a.txt
0000000   1   2   3   4   5   6  cr  nl
         31  32  33  34  35  36  0d  0a
0000008   a   b   c   d   e   f  cr  nl
         61  62  63  64  65  66  0d  0a
0000016   h   e   l   l   o   ,   w   o
         68  65  6c  6c  6f  2c  77  6f
0000024   r   l   d  cr  nl
         72  6c  64  0d  0a
0000029
-j2 skips the first two bytes:
[tim@L gx]$ od -Ad -tax1 -j2 a.txt
0000002   3   4   5   6  cr  nl   a   b   c   d   e   f  cr  nl   h   e
         33  34  35  36  0d  0a  61  62  63  64  65  66  0d  0a  68  65
0000018   l   l   o   ,   w   o   r   l   d  cr  nl
         6c  6c  6f  2c  77  6f  72  6c  64  0d  0a
0000029
-N2 displays only the first two bytes:
[tim@L gx]$ od -Ad -tax1 -N2 a.txt
0000000   1   2
         31  32
0000002
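The -j and -N options can be combined to pick out an arbitrary slice of a file. The sketch below recreates a file with the same content as a.txt in the examples above (in a temporary directory, which is an assumption of this sketch) and extracts four bytes from it.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Three CRLF-terminated lines, 29 bytes in total.
printf '123456\r\nabcdef\r\nhello,world\r\n' > "$tmp/a.txt"
total=$(wc -c < "$tmp/a.txt" | tr -d ' ')

# Skip 2 bytes (-j2), dump the next 4 (-N4) as hex, and suppress the
# offset column (-An). The result is the characters "3456".
slice=$(od -An -tx1 -j2 -N4 "$tmp/a.txt" | tr -d ' \n')

echo "total bytes: $total"   # 29
echo "slice:       $slice"   # 33343536
```

Piping through tr -d ' \n' strips od's column padding, which makes the output easy to compare in scripts.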
Use the wc command to count text
Command: wc
Usage: wc [-clw] file
Description: depending on the options, wc counts the bytes (-c), lines (-l), and words (-w) in a file. See man wc for more examples.
Example: count the entries in the current directory:
ls -l | wc -l
Note that ls -l also prints a leading "total" line, so the result is one more than the number of entries; ls | wc -l avoids this.
PS: wc has only a few options. Some people have reimplemented wc in C as an exercise; you can try that too.
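A short sketch of the three counters on known input makes the difference between them concrete. The helper function sample and the file names are hypothetical; tr -d ' ' is there because some wc implementations pad their output with spaces.

```shell
#!/bin/sh
# Two lines, three words, 17 bytes.
sample() { printf 'alpha beta\ngamma\n'; }

lines=$(sample | wc -l | tr -d ' ')
words=$(sample | wc -w | tr -d ' ')
bytes=$(sample | wc -c | tr -d ' ')
echo "lines=$lines words=$words bytes=$bytes"   # lines=2 words=3 bytes=17

# Counting directory entries: plain ls prints one name per line when piped.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
touch "$tmp/a" "$tmp/b" "$tmp/c"
entries=$(ls "$tmp" | wc -l | tr -d ' ')
echo "entries=$entries"   # entries=3
```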
Use the sort command to sort text
Command: sort
Usage: sort [-bcdfimMnr] [-o <output file>] [-t <separator>] [+<start field> -<end field>] [--help] [--version] [file]
The +/- field syntax is obsolete; current versions of sort select fields with -k instead.
Options (see man sort for the full list):
-n  sort numerically
-r  sort in descending order
-u  remove duplicate lines
Use the uniq command to report or remove repeated lines
Command: uniq
Usage: uniq [options] file
Description: reports or filters repeated lines in a text. uniq only compares adjacent lines, so the input is usually sorted first.
Options (see man uniq for the full list):
-c  prefix each line with the number of times it occurs
-d  show only duplicated lines
-u  show only lines that are not repeated
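sort and uniq are most often used together. The following sketch, with a made-up word list, shows the usual frequency-count pipeline and the effect of the -d, -u, and sort -u options.

```shell
#!/bin/sh
# Hypothetical sample data: pear x3, apple x2, fig x1, in no particular order.
fruits() { printf 'pear\napple\npear\nfig\napple\npear\n'; }

# Sort first so equal lines are adjacent, count them, then order by count.
counts=$(fruits | sort | uniq -c | sort -rn \
    | awk '{ print $1 ":" $2 }' | paste -s -d ' ' -)

dups=$(fruits | sort | uniq -d | paste -s -d ' ' -)      # lines seen more than once
singles=$(fruits | sort | uniq -u | paste -s -d ' ' -)   # lines seen exactly once
distinct=$(fruits | sort -u | paste -s -d ' ' -)         # every distinct line

echo "counts:   $counts"     # 3:pear 2:apple 1:fig
echo "dups:     $dups"       # apple pear
echo "singles:  $singles"    # fig
echo "distinct: $distinct"   # apple fig pear
```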
Use the diff command to compare text
Command: diff
Usage: diff [options] file1 file2
Description: diff compares two files line by line and reports the differences.
Options (see man diff for the full list):
-i  ignore differences between uppercase and lowercase
-b  ignore differences in the amount of whitespace
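diff reports its result through the exit status as well: 0 means no differences and 1 means the files differ. The sketch below uses that to show the effect of -i and -b on small hypothetical files.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'hello world\n'   > "$tmp/base.txt"
printf 'Hello World\n'   > "$tmp/case.txt"    # differs only in letter case
printf 'hello   world\n' > "$tmp/space.txt"   # differs only in spacing

diff "$tmp/base.txt" "$tmp/case.txt" >/dev/null
plain=$?        # 1: the files differ

diff -i "$tmp/base.txt" "$tmp/case.txt" >/dev/null
ignore_case=$?  # 0: equal once case is ignored

diff -b "$tmp/base.txt" "$tmp/space.txt" >/dev/null
ignore_space=$? # 0: equal once runs of spaces are treated alike

echo "plain=$plain -i=$ignore_case -b=$ignore_space"
```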
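Use the du command to measure the disk space used by directories or files
Command: du
Usage: du [options] directory or file
Options (see man du for the full list):
-k / -m  report sizes in KB or MB (some BSD versions of du also accept -g for GB; GNU du does not)
-s  print only a total for each specified directory, without listing its subdirectories
-h  print sizes in human-readable units; often combined with -s as -sh
du -S | sort -n  list per-directory usage in ascending order, so the directories that occupy the most space come last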
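A small sketch of du -sk combined with sort -n follows. The directory layout is made up for illustration, and no exact sizes are shown because the reported disk usage depends on the file system.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

mkdir "$tmp/big" "$tmp/small"
# 512 KB of zero bytes in one directory, a two-byte file in the other.
dd if=/dev/zero of="$tmp/big/data" bs=1024 count=512 2>/dev/null
printf 'x\n' > "$tmp/small/data"

# -s prints one total per argument, -k reports it in KB.
# The size is the first tab-separated column of du's output.
big_kb=$(du -sk "$tmp/big" | cut -f1)
small_kb=$(du -sk "$tmp/small" | cut -f1)
echo "big=${big_kb}K small=${small_kb}K"

# Ascending numeric sort: the largest directory is the last line.
du -sk "$tmp/big" "$tmp/small" | sort -n | tail -n 1
```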
Use the cut command to extract the data you need
Command: cut
Usage: cut [options] file
Options:
-b  select bytes
-c  select characters
    cut -c1-15 selects columns 1 through 15
    cut -c1-4,8- selects columns 1 through 4 and column 8 through the end of the line
-f  select fields
    cut -f1 selects the first field; fields are separated by Tab by default
    cut -f1 -s does the same but skips lines that contain no delimiter
-d  set the field delimiter to use instead of Tab
PS: be careful when cutting Chinese text by bytes. A Chinese character occupies two bytes in GBK and usually three bytes in UTF-8, so a byte range can split a character in half.
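The following sketch runs each selection mode on small inline samples. The sample strings are made up; the last example shows why byte ranges need care with Chinese text.

```shell
#!/bin/sh
# Character columns 1-4 and 8 to end of line.
cols=$(printf 'abcdefghij\n' | cut -c1-4,8-)
echo "$cols"        # abcdhij

# Second Tab-separated field (Tab is the default delimiter).
tab_field=$(printf 'a\tb\tc\n' | cut -f2)
echo "$tab_field"   # b

# First field with ":" as the delimiter.
user=$(printf 'root:x:0:0\n' | cut -d: -f1)
echo "$user"        # root

# -s drops lines that contain no delimiter at all.
only_delimited=$(printf 'no-delimiter\na\tb\n' | cut -s -f1)
echo "$only_delimited"   # a

# "中文" in UTF-8 is 6 bytes; the first character is bytes 1-3.
first_char_hex=$(printf '\344\270\255\346\226\207\n' | cut -b1-3 \
    | od -An -tx1 | tr -d ' \n')
echo "$first_char_hex"   # e4b8ad0a (three bytes plus the newline)
```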
Use the dd command to test disk speed and create files
Command: dd
Description: copies data from a specified input to a specified output.
Usage notes: bs sets the size of each block and count sets the number of blocks.
Create a 2 MB file:
# dd if=/dev/zero of=/home/test/2M.txt bs=1024 count=2048
Test disk write speed in the same way (this writes 1 GB):
# dd if=/dev/zero of=/home/rwspeed.ret bs=1024 count=1048576
Copy data to another device (the destination path is truncated in the source):
# dd if=/home/test/my_fiter of=/ bs=512 count=256
PS: Windows also has a command for creating a file of a specified size: fsutil.
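The relationship between bs, count, and the resulting file size can be checked with a quick sketch. It writes to a temporary directory rather than /home/test, which is an assumption made so the example is safe to run anywhere.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# 2048 blocks of 1024 bytes each = 2097152 bytes (2 MB) of zeros.
# dd prints its transfer statistics on stderr; they are discarded here.
dd if=/dev/zero of="$tmp/2M.txt" bs=1024 count=2048 2>/dev/null

size=$(wc -c < "$tmp/2M.txt" | tr -d ' ')
echo "size=$size"   # size=2097152
```

When dd is used as a rough speed test, leave stderr visible instead, since that is where dd reports the elapsed time and throughput.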
Use the nice command to adjust the priority of a program
Command: nice
Description: sets the scheduling priority of a process.
Usage notes: the priority (nice value) of a Linux process ranges from -20 to +19. The smaller the number, the higher the priority, that is, the more CPU time the process gets. Ordinary users can only lower the priority of a program, while root can raise or lower it.
# nice
prints the current default priority.
# nice ./a.out
runs a.out with the default adjustment, which adds 10 to its nice value, so it is allocated less CPU time.
# nice -n -20 ./a.out
runs a.out at the highest priority.
Unix/Linux offers many commands like the ones above, the collected work of programmers all over the world. Being fluent with the commands the system provides often gets twice the result with half the effort. Only a few of them are listed here; for other commands, refer to the introductions on this website.