Source: Shulou (Shulou.com), SLTechnology News&Howtos, 06/02 report. Updated 2025-01-18.
This article looks at what good practice in Shell scripting looks like. It covers code style conventions, finer points of coding, and an overview of the static checking tool shellcheck.
Code style specification
Start with a shebang
The so-called shebang is the comment beginning with #! that appears on the first line of many scripts. It names the interpreter to use when we do not specify one explicitly, and typically looks like this:
    #!/bin/bash
Of course, there are many interpreters besides bash. We can list the login shells available on the machine with the following command:
    $ cat /etc/shells
    # /etc/shells: valid login shells
    /bin/sh
    /bin/dash
    /bin/bash
    /bin/rbash
    /usr/bin/screen
When we run the script directly as ./a.sh, the interpreter named in the shebang is used. If there is no shebang, the script is run by a default interpreter instead (nominally the one specified by $SHELL, though the exact choice depends on the calling shell), which may not be the one the script was written for.
Always including a shebang is the approach we recommend.
Comment your code
That code needs comments is common sense, but it bears repeating because it matters even more in shell scripts. Many one-line shell commands are hard to understand, and without comments they become especially painful to maintain.
Comments should not only explain what the code is for; like a README, they should also tell the reader what to watch out for.
For shell scripts specifically, comments generally cover the following:
Shebang
Script parameters
Script usage
Script precautions
Script creation date, author, copyright, and so on
A note before each function
Comments on the more complex one-line commands
Validate parameters
This is very important. When a script accepts parameters, it must first check that they are valid and print a helpful message so that users understand how the parameters should be used.
At the very least, check the number of parameters:
    if [[ $# != 2 ]]; then
        echo "Parameter incorrect."
        exit 1
    fi
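The parameter check can be taken one step further by pairing it with a usage message. The following is a minimal sketch, assuming bash; the script name backup.sh, its two arguments, and the function names usage and check_args are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# usage: print a short help text (script name and arguments are illustrative).
usage() {
    echo "Usage: backup.sh <source_dir> <target_dir>"
}

# check_args: succeed only when exactly two non-empty arguments are given.
check_args() {
    if [[ $# -ne 2 ]]; then
        echo "Parameter incorrect: expected 2 arguments, got $#." >&2
        usage >&2
        return 1
    fi
    if [[ -z "$1" || -z "$2" ]]; then
        echo "Parameter incorrect: arguments must not be empty." >&2
        return 1
    fi
}

# In a real script this would be called as: check_args "$@" || exit 1
```

Sending the error text to stderr keeps it out of any output the script produces on stdout.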
Variables and magic numbers
In general, we define the important environment variables at the top of the script to make sure they exist:
    source /etc/profile
    export PATH="/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/apps/bin/"
This kind of definition is very common. The most typical case is a machine with several versions of Java installed, where we need to pick one: we redefine JAVA_HOME and PATH at the top of the script to control which is used.
Likewise, good code rarely has "magic numbers" hard-coded throughout. If such values are unavoidable, define them as variables at the top and refer to the variables wherever they are needed, which makes later changes easy.
Indent consistently
For shell scripts, indentation is a real problem. Because many of the blocks that need indenting (such as if and for statements) are short, many people do not bother, and because many people are not used to writing functions, indentation gets even less attention.
In fact, correct indentation is very important, especially when writing functions; otherwise it is easy to confuse a function body with directly executed commands when reading.
The common indentation methods are "soft tabs" and "hard tabs".
A soft tab indents with n spaces (n is usually 2 or 4)
A hard tab is a real \t character
There is no need to fight over which is best; each has its own advantages and disadvantages. Personally, I am used to hard tabs.
For statements such as if and for, it is better not to put the keywords then and do on a separate line, which looks ugly.
Naming conventions
Naming conventions basically come down to the following points:
File names should follow a convention and end with .sh so they are easy to identify
Variable names should be meaningful and spelled correctly
Use one naming style throughout; shell scripts generally use lowercase letters and underscores
Use a uniform encoding
Try to use UTF-8 when writing scripts, since it supports non-ASCII characters such as Chinese. Even so, I still try to write comments and log messages in English, because many machines do not support Chinese directly and the output may come out garbled.
When writing a UTF-8 shell script on Windows, pay attention to whether the file has a BOM. By default, Windows marks UTF-8 files by adding the three bytes EF BB BF at the start of the file, whereas Linux uses no BOM by default. So if we write scripts on Windows, we must save them as UTF-8 without BOM, which editors such as notepad++ can do. Otherwise the first three bytes are read as part of the first line when the script runs on Linux, producing errors about unrecognized commands.
Another common problem with cross-platform scripts is the difference in line endings: Windows defaults to \r\n, while Unix uses \n. Two small tools solve this very conveniently: dos2unix and unix2dos.
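Both problems can be detected before a script is deployed. The sketch below shows one possible check; the helper names has_bom and has_crlf are made up for this example, and it assumes head -c and od are available, as they are on common Linux and BSD systems.

```shell
# has_bom: succeed if the file starts with the UTF-8 BOM bytes EF BB BF.
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# has_crlf: succeed if any line of the file contains a carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}
```

A script that fails either check can then be converted with dos2unix or re-saved without a BOM.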
Remember to add execute permission
This is a small point, but I often forget it, and it is annoying: without execute permission the script cannot be run directly.
Logging and echo
The importance of logs needs little explanation: they let us go back and track down errors, which is very important in large projects.
If the script is meant to be run directly on the command line, it is best to echo the progress of execution in real time so the user stays informed.
Sometimes, to improve the user experience, we add effects such as color or blinking to the output. For details, see an introduction to ANSI/VT100 control sequences.
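As an illustration of timestamped, colored output, here is a small sketch of a logging helper. The function name log, the level names, and the color choices are assumptions for this example; the escape codes are the standard ANSI sequences for green, yellow, red, and reset.

```shell
# ESC character used to build ANSI color sequences.
esc=$(printf '\033')

# log LEVEL MESSAGE...: print a timestamped line to stderr, colored by level.
log() {
    level=$1
    shift
    case "$level" in
        INFO)  color="${esc}[32m" ;;  # green
        WARN)  color="${esc}[33m" ;;  # yellow
        ERROR) color="${esc}[31m" ;;  # red
        *)     color="" ;;
    esac
    # The trailing ESC[0m resets the terminal color.
    printf '%s[%s] %s: %s%s\n' "$color" "$(date '+%Y-%m-%d %H:%M:%S')" \
        "$level" "$*" "${esc}[0m" >&2
}

log INFO "backup started"
log ERROR "target directory is missing"
```

Writing log lines to stderr leaves stdout free for the data the script actually produces.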
Remove passwords
Do not hard-code passwords in scripts. Do not hard-code passwords in scripts. Do not hard-code passwords in scripts.
Important things are worth saying three times, especially when the script is hosted on a platform like GitHub.
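One common alternative to a hard-coded password is to take it from the environment at run time. The following is only a sketch of that idea; the variable name DB_PASSWORD and the function name get_password are hypothetical.

```shell
# get_password: print the password taken from the environment variable
# DB_PASSWORD (a hypothetical name), or fail with a message if it is unset.
get_password() {
    if [ -z "${DB_PASSWORD:-}" ]; then
        echo "DB_PASSWORD is not set; export it or load it from a protected file." >&2
        return 1
    fi
    printf '%s\n' "$DB_PASSWORD"
}
```

The script itself then contains no secret and can be committed to a public repository safely.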
Break lines that are too long
When calling some programs, the argument list can be very long. To keep it readable, we can split it across lines with backslashes:
    ./configure \
        --prefix=/usr \
        --sbin-path=/usr/sbin/nginx \
        --conf-path=/etc/nginx/nginx.conf
Notice that there is a space before the backslash.
Coding detail specification
Code efficiency
When using a command, we should understand what it actually does. Especially when processing a large amount of data, we should always consider whether the command will hurt efficiency.
For example, take the following two sed commands:
    sed -n '1p' file
    sed -n '1p;1q' file
Both serve the same purpose: printing the first line of the file. But the first command reads the entire file, while the second reads only the first line. When the file is large, this one small difference can make a huge difference in efficiency.
Of course, this is only an illustration; the really right way to do this is head -n1 file.
Use double quotes liberally
Almost every expert recommends wrapping variable references ($var) in double quotes.
Leaving the double quotes off causes trouble in many situations. Why? Here is an example:
    #!/bin/sh
    # Assume the current folder contains a file named a.sh
    var="*.sh"
    echo $var
    echo "$var"
The output is as follows:
    a.sh
    *.sh
Why does this happen? It is as if the following commands had been run:
    echo *.sh
    echo "*.sh"
Whenever variables are used as arguments, keep this in mind and make sure you understand the difference. This is only a very small example; in practice, far too many problems are caused by this detail.
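Globbing is one consequence of a missing pair of quotes; word splitting is the other. The short sketch below makes the splitting visible by counting arguments. The helper count_args and the file name are invented for this example.

```shell
# count_args: print how many arguments it received.
count_args() {
    echo "$#"
}

name="my report.txt"
count_args $name     # unquoted: split on the space into 2 arguments
count_args "$name"   # quoted: passed through as 1 argument
```

A command such as rm or cp given the unquoted form would act on two nonexistent files, "my" and "report.txt", rather than on the intended one.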
Make good use of a main function
Compiled languages such as Java and C have a function entry point, and this structure makes the code very readable: we can tell which code runs directly and which code is a function. Scripts are different. A script is interpreted and executed straight from the first line to the last, with commands and functions mixed together, which makes it very hard to read.
Anyone who uses Python knows that a standard Python script generally looks like this:
    #!/usr/bin/env python
    def func1():
        pass
    def func2():
        pass
    if __name__ == '__main__':
        func1()
        func2()
It uses a neat trick to implement the main function we are used to, making the code more readable.
In shell, there is a similar trick:
    #!/usr/bin/env bash
    func1() {
        # do sth (the : no-op keeps the empty body valid)
        :
    }
    func2() {
        # do sth
        :
    }
    main() {
        func1
        func2
    }
    main "$@"
Writing scripts this way gives us a similar main function and makes the script more structured.
Consider scope
Variables in shell are global by default. Take the following script:
    #!/usr/bin/env bash
    var=1
    func() {
        var=2
    }
    func
    echo $var
Its output is 2 rather than 1, which clearly does not match our coding habits and can easily cause problems.
Therefore, rather than using global variables directly, it is better to use commands such as local and readonly, or to declare variables with declare. All of these are better than a plain global definition.
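For comparison, here is the same script with the assignment inside the function marked local, as a minimal sketch assuming bash:

```shell
#!/usr/bin/env bash
var=1

func() {
    local var=2          # visible only inside func
    echo "inside: $var"
}

func
echo "outside: $var"     # the global var is untouched
```

This time the function prints 2 while the global variable keeps its value of 1.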
Function return value
When using functions, note that the return value of a shell function can only be an integer (an exit status from 0 to 255). Presumably this is because a return value normally represents the exit status of the function, for which 0 or 1 is usually enough. If you do need to pass back a string, you can use the following workaround:
    func() {
        echo "2333"
    }
    res=$(func)
    echo "This is from $res."
In this way, extra data can be passed back through commands such as echo or printf.
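The two channels can be combined: stdout carries the data and the return value carries success or failure. The sketch below uses a made-up helper, to_upper, purely to show the pattern.

```shell
# to_upper TEXT: print TEXT in upper case; return 1 if TEXT is empty.
to_upper() {
    [ -n "$1" ] || return 1
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# The assignment's exit status is that of the command substitution,
# so the if tests whether to_upper succeeded.
if res=$(to_upper "shell"); then
    echo "This is from $res."
else
    echo "to_upper failed" >&2
fi
```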
Indirect reference value
What is an indirect reference? Consider the following scenario:
    VAR1="2323232"
    VAR2="VAR1"
We have a variable VAR1 and a variable VAR2 whose value is the name of VAR1. How do we get the value of VAR1 through VAR2?
The clumsy way is as follows:
    eval echo \$$VAR2
What does this mean? It builds the string echo $XXX, where XXX is the value of VAR2 (that is, VAR1), and then uses eval to force it to be parsed again, so the value is obtained in a roundabout way.
This does work, but it looks awkward and is hard to understand at a glance, so we do not recommend it. In fact, we do not recommend using the eval command at all.
A more comfortable way to write it is:
    echo ${!VAR2}
Putting a ! in front of the variable name gives a simple indirect reference.
Note, however, that this method can only read values, not assign them. To assign, we still have to fall back on eval:
    VAR1=VAR2
    eval $VAR1=233
    echo $VAR2
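In bash specifically, there is a way to assign through a stored name without eval: the -v option of the printf builtin. This is not part of the approach described above, only a sketch of an alternative, and it will not work in plain sh.

```shell
#!/usr/bin/env bash
VAR1="2323232"
VAR2="VAR1"

# Read through the name stored in VAR2.
echo "${!VAR2}"

# Assign through the name stored in VAR2 without eval.
printf -v "$VAR2" '%s' "233"
echo "$VAR1"
```

Because the value is passed as a printf argument rather than re-parsed as shell code, characters in it cannot be executed by accident.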
Make good use of heredocs
A heredoc can be regarded as a way of supplying multi-line input: after "<<" we name a delimiter, then write as many lines as we like until the delimiter appears again on a line of its own.
Using heredocs, we can easily generate template files:
    cat >> /etc/rsyncd.conf << EOF
    log file = …local/logs/rsyncd.log
    transfer logging = yes
    log format = %t %a %m %f %b
    syslog facility = local3
    EOF
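One detail worth knowing when generating templates is that quoting the delimiter turns off variable expansion inside the heredoc. The sketch below contrasts the two forms; the function names and the variable name are chosen only for illustration.

```shell
name="rsync"

# render_expanded: unquoted delimiter, so $name is expanded.
render_expanded() {
    cat << EOF
service = $name
EOF
}

# render_literal: quoted delimiter, so the text is kept verbatim.
render_literal() {
    cat << 'EOF'
service = $name
EOF
}

render_expanded   # prints: service = rsync
render_literal    # prints: service = $name
```

The quoted form is the safer choice when the template itself contains dollar signs or backquotes that must reach the file unchanged.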
Learn to find the script's path
In many cases, we first get the path of the current script and then use it as a base for locating other paths. Often we simply use pwd, expecting to get the script's directory.
However, this is not rigorous. pwd returns the working directory of the current shell, not the directory containing the script.
The right way is one of the following:
    script_dir=$(cd "$(dirname "$0")" && pwd)
    script_dir=$(dirname "$(readlink -f "$0")")
That is, either cd into the script's directory and then run pwd, or resolve the path of the current script directly.
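The resolved directory is typically used as the base for other paths. The following sketch assumes bash and uses BASH_SOURCE, which also works when the file is sourced, falling back to $0 otherwise; the conf/app.conf layout is hypothetical.

```shell
#!/usr/bin/env bash
# Resolve the directory containing this script, whatever the caller's cwd is.
script_dir=$(cd "$(dirname -- "${BASH_SOURCE[0]:-$0}")" && pwd)

# Build other paths relative to it (conf/app.conf is a hypothetical layout).
config_file="$script_dir/conf/app.conf"

echo "script directory: $script_dir"
echo "config file:      $config_file"
```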
Keep the code short
Brevity here refers not only to the length of the code but also to the number of commands used. As a rule, a problem that can be solved with one command should never be solved with two. This affects not only readability but also execution efficiency.
The classic example is:
    cat /etc/passwd | grep root
    grep root /etc/passwd
This is the most frowned-upon use of cat. It serves no purpose: one command is clearly enough, yet a pipe is added anyway.
In fact, concise code can also improve efficiency to some extent, as in the following example:
    # method1
    find . -name '*.txt' | xargs sed -i s/233/666/g
    find . -name '*.txt' | xargs sed -i s/235/626/g
    find . -name '*.txt' | xargs sed -i s/333/616/g
    find . -name '*.txt' | xargs sed -i s/233/664/g
    # method2
    find . -name '*.txt' | xargs sed -i "s/233/666/g;s/235/626/g;s/333/616/g;s/233/664/g"
Both methods do the same thing: find every file with the .txt suffix and apply a series of replacements. The former runs find several times, while the latter runs find once and extends the sed script instead. The first is more intuitive, but as the number of replacements grows, the second becomes much faster, because it runs the commands only once while the first has to run them many times.
Moreover, with a clever use of xargs, we can easily parallelize:
    find . -name '*.txt' | xargs -P $(nproc) sed -i "s/233/666/g;s/235/626/g;s/333/616/g;s/233/664/g"
The -P parameter sets the degree of parallelism and can speed up execution further. Note that xargs only starts several processes when the input is split into several batches, for example with -n.
Command parallelization
When execution efficiency really matters, we may need to run commands in parallel. The simplest form of parallelism in shell uses "&" and the wait command:
    func() {
        # do sth
        :
    }
    for …
Of course, the degree of parallelism should not be too high, otherwise the machine will grind to a halt. Doing this properly is more complicated, and we will come back to it later. For a simple solution, use the parallel command or the xargs approach mentioned above.
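To make the "&" and wait pattern concrete, here is a minimal sketch. The function names work and run_jobs, the job count of four, and the sleep standing in for real work are all assumptions made for illustration.

```shell
# work N: stand-in for a slow task (the sleep is only for illustration).
work() {
    sleep 1
    echo "job $1 done"
}

# run_jobs: start four jobs in the background, then wait for all of them.
run_jobs() {
    for i in 1 2 3 4; do
        work "$i" &
    done
    wait
}

run_jobs
```

The four jobs finish in about one second in total rather than four, and their output may appear in any order.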
Full-text retrieval
When we want to search for a string (such as 2333) in all the txt files under a folder, we might use a command like this:
    find . -name '*.txt' -type f | xargs grep 2333
In many cases this finds the matching lines as intended, but there are two small problems to watch for.
First, find matches the file names we want, but if a file name contains spaces, passing it to grep causes trouble: the name is treated as two arguments. In that case, add one more step that quotes each name so that a name containing spaces is not split in two:
    find . -type f | xargs -I {} echo '"{}"' | xargs grep 2333
Second, the character set of a file may differ from that of the terminal, in which case grep treats the file as binary and reports something like "Binary file matches". Either convert the character set with a tool such as iconv, or, when it does not affect the search, add the -a option to grep so that all files are treated as text:
    find . -type f | xargs grep -a 2333
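Another way to handle file names with spaces is to separate names with NUL bytes instead of quoting them. The sketch below assumes a find and xargs that support -print0 and -0, as the GNU and BSD versions do; the function name search_tree is made up for this example.

```shell
# search_tree DIR PATTERN: list files under DIR that contain PATTERN,
# treating every file as text and tolerating spaces in file names.
search_tree() {
    find "$1" -type f -print0 | xargs -0 grep -l -a -- "$2"
}
```

A call such as search_tree . 2333 prints the path of each matching file, and the NUL separator also copes with names that contain quotes or newlines.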
Prefer newer syntax
"Newer" here does not mean more impressive; it means we may prefer some of the more recently introduced syntax, mostly as a matter of code style. For example:
Prefer func() {} over func {} for defining functions
Prefer [[ ]] over [ ]
Prefer $() over backquotes for assigning command output to variables
Prefer printf over echo for output in complex scenarios
In fact, many of these newer forms are also more powerful than the old ones, as you will find when you use them.
Other small tips
There are many other scattered points that we will not expand on one by one; here is a brief list.
Use absolute paths wherever possible, since they are less error-prone. If you must use a relative path, prefix it with ./
Prefer bash's built-in variable substitution to awk and sed; it is shorter
For simple conditions, use && and || and write them on a single line, for example [[ $x -gt 2 ]] && echo "$x"
When exporting variables, add a namespace for the sub-script to the name so that variables do not conflict
Use trap to catch signals and do cleanup when a termination signal is received
Use mktemp to create temporary files or folders
Redirect unhelpful output to /dev/null
Use the return value of a command to judge whether it succeeded
Check that a file exists before using it; otherwise handle the error properly
Do not process the output of ls (such as ls -l | awk '{ print $8 }'); the output of ls is unpredictable and platform-dependent
Do not use a for loop to read files; use while read instead
When copying a folder with cp -r, note that if the destination folder does not exist it is created, but if it does exist the source is copied into a subfolder of the destination
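Several of these tips fit together naturally. The sketch below combines mktemp, trap, while read, /dev/null, and a check on a command's return value in one short script; the file contents are invented for the example.

```shell
#!/usr/bin/env bash
# Create a temporary file and make sure it is removed when the script exits.
tmp_file=$(mktemp)
trap 'rm -f "$tmp_file"' EXIT

printf 'first line\nsecond line\n' > "$tmp_file"

# Read the file line by line; IFS= and -r keep whitespace and backslashes intact.
count=0
while IFS= read -r line; do
    count=$((count + 1))
    printf '%d: %s\n' "$count" "$line"
done < "$tmp_file"

# Judge success by the command's exit status and discard its error output.
if grep -q 'second' "$tmp_file" 2>/dev/null; then
    echo "found a match"
fi
```

Unlike a for loop over $(cat file), the while read loop does not split lines at spaces.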
Overview of the static checking tool shellcheck
To safeguard script quality systematically, the simplest idea is to adopt a static checking tool that makes up for the blind spots developers may have.
There are not many static checking tools for shell. Looking around, we find one called shellcheck. It is open source on GitHub with more than 8K stars, which looks very reliable. Installation and usage information can be found on its home page.
Installation
The tool has strong platform support: it is available at least through the mainstream package managers of Debian, Arch, Gentoo, EPEL, Fedora, OS X, and openSUSE, so it is easy to install. See the installation documentation for details.
Integration
A static checking tool naturally belongs in the CI pipeline, and shellcheck can easily be integrated into Travis CI to statically check projects based on shell scripts.
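Whatever CI system is used, the integration usually boils down to running shellcheck over every script and failing the build if any check fails. The following is a rough sketch of such a step, assuming bash and a shellcheck binary on the PATH; the function name lint_scripts is hypothetical and the step is skipped when the tool is not installed.

```shell
#!/usr/bin/env bash
# lint_scripts DIR: run shellcheck on every .sh file directly under DIR.
# Returns 1 if any script has findings, 0 otherwise.
lint_scripts() {
    local dir=$1 status=0 script
    if ! command -v shellcheck >/dev/null 2>&1; then
        echo "shellcheck not installed, skipping" >&2
        return 0
    fi
    for script in "$dir"/*.sh; do
        [ -e "$script" ] || continue      # no .sh files: the glob stays literal
        shellcheck "$script" || status=1
    done
    return "$status"
}
```

A CI job would call lint_scripts . and let a non-zero return value fail the build.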
Sample
In the "Gallery of bad code" section of its documentation, shellcheck also provides a very detailed catalog of "bad code". It is a valuable reference and makes pleasant spare-time reading, much like a book such as "Java Puzzlers".