2025-02-23 Update From: SLTechnology News&Howtos (Shulou.com)
How do you use GNU Parallel correctly? This article answers that question with a detailed walkthrough, in the hope that it gives readers a simple and workable approach.
First, make the following preparations:
GNU Parallel version 20130814 or later
Install the latest version:
(wget -O - pi.dk/3 || curl pi.dk/3/) | bash
This command also installs the latest version of the tutorial:
man parallel_tutorial
Most of this tutorial is also compatible with older versions.
abc-file
Generate the file:
parallel -k echo ::: A B C > abc-file
def-file
Generate the file:
parallel -k echo ::: D E F > def-file
abc0-file
Generate the file:
perl -e 'printf "A\0B\0C\0"' > abc0-file
abc_-file
Generate the file:
perl -e 'printf "A_B_C_"' > abc_-file
tsv-file.tsv
Generate the file:
perl -e 'printf "f1\tf2\nA\tB\nC\tD\n"' > tsv-file.tsv
num30000
Generate the file:
perl -e 'for(1..30000){print "$_\n"}' > num30000
num1000000
Generate the file:
perl -e 'for(1..1000000){print "$_\n"}' > num1000000
num_%header
Generate the file:
(echo %head1; echo %head2; perl -e 'for(1..10){print "$_\n"}') > num_%header
Remote execution: passwordless ssh login to $SERVER1 and $SERVER2
Set the variables:
SERVER1=server.example.com
SERVER2=server2.example.net
Finally, the following commands should run successfully:
ssh $SERVER1 echo works
ssh $SERVER2 echo works
Use ssh-keygen -t dsa; ssh-copy-id $SERVER1 to set this up (using an empty passphrase).
Input sources
GNU Parallel can read input from files, from the command line, and from standard input (stdin or a pipe).
Single input source
Read input from the command line:
parallel echo ::: A B C
Output (the jobs run in parallel, so the order may differ):
A
B
C
A file as the input source:
parallel -a abc-file echo
Output is the same as above.
STDIN (standard input) as the input source:
cat abc-file | parallel echo
Output is the same as above.
Multiple input sources
GNU Parallel accepts several input sources on the command line and generates all combinations of their values:
parallel echo ::: A B C ::: D E F
Output:
A D
A E
A F
B D
B E
B F
C D
C E
C F
Multiple files as input sources:
parallel -a abc-file -a def-file echo
Output is the same as above.
STDIN (standard input) can be one of the input sources, written as "-":
cat abc-file | parallel -a - -a def-file echo
Output is the same as above.
You can use "::::" instead of -a:
cat abc-file | parallel echo :::: - def-file
Output is the same as above.
::: and :::: can be mixed:
parallel echo ::: A B C :::: def-file
Output is the same as above.
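To make the "all combinations" behaviour concrete, here is a minimal plain-shell sketch that prints the same nine pairs as parallel echo ::: A B C ::: D E F, but serially and in a fixed order. The function name combos is made up for this illustration and is not part of GNU Parallel.

```shell
# Hypothetical helper (not a GNU Parallel feature): print every pairing
# of two space-separated lists, one pair per line.
combos() {
    for first in $1; do
        for second in $2; do
            echo "$first $second"
        done
    done
}

combos "A B C" "D E F"
```

The nested loop is only a mental model. GNU Parallel builds the same pairs but runs the resulting commands concurrently, which is why its output order may vary.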
Linking arguments from input sources
--xapply takes one argument from each input source:
parallel --xapply echo ::: A B C ::: D E F
Output:
A D
B E
C F
If one of the input sources is shorter, its values are repeated:
parallel --xapply echo ::: A B C D E ::: F G
Output:
A F
B G
C F
D G
E F
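The wrap-around of the shorter input source can be pictured with a small plain-shell sketch. It assumes both lists are separated by single spaces, and the helper name pair_cycle is hypothetical. It pairs the items one to one and starts the second list over whenever it runs out.

```shell
# Hypothetical helper: pair the Nth item of list 1 with the Nth item of
# list 2, cycling through list 2 again when it has been used up.
pair_cycle() {
    list2=$2
    set -- $1                      # positional parameters now hold list 1
    remaining=$list2
    for item in "$@"; do
        # Refill the second list whenever it is exhausted.
        [ -n "$remaining" ] || remaining=$list2
        partner=${remaining%% *}   # first word of what is left
        case $remaining in
            *" "*) remaining=${remaining#* } ;;   # drop the first word
            *)     remaining= ;;                  # that was the last word
        esac
        echo "$item $partner"
    done
}

pair_cycle "A B C D E" "F G"
```

This is a serial stand-in for the pairing rule only, not a description of how GNU Parallel implements it.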
Changing the argument separator
GNU Parallel lets you specify separators to use instead of ::: and ::::, which is especially useful when those two symbols are already used by another command:
parallel --arg-sep ,, echo ,, A B C :::: def-file
Output:
A D
A E
A F
B D
B E
B F
C D
C E
C F
Change the argument file separator:
parallel --arg-file-sep // echo ::: A B C // def-file
Output is the same as above.
Changing the argument delimiter
GNU Parallel normally treats each line as one argument, that is, it uses \n as the argument delimiter. Use -d to change it:
parallel -d _ echo :::: abc_-file
Output:
A
B
C
\0 stands for NUL:
parallel -d '\0' echo :::: abc0-file
Output is the same as above.
-0 is shorthand for -d '\0' (commonly used to read input from find … -print0):
parallel -0 echo :::: abc0-file
Output is the same as above.
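To see what a custom delimiter means for the input, the sketch below splits the same byte streams that abc_-file and abc0-file contain into one record per line using tr. The helper name split_records is an assumption made for this example; it only visualizes the records and does not run anything in parallel.

```shell
# Hypothetical helper: turn every occurrence of the given delimiter
# into a newline so that each record ends up on its own line.
split_records() {
    tr "$1" '\n'
}

printf 'A_B_C_' | split_records '_'        # same bytes as abc_-file
printf 'A\0B\0C\0' | split_records '\0'    # same bytes as abc0-file
```

NUL is a convenient delimiter for file names because it is the one byte that cannot appear inside a path.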
End-of-input value in the input source
GNU Parallel can treat a given value as the end-of-input marker:
parallel -E stop echo ::: A B stop C D
Output:
A
B
Skipping blank lines
Use --no-run-if-empty to skip blank lines:
(echo 1; echo; echo 2) | parallel --no-run-if-empty echo
Output:
1
2
Building the command line
No command given: the arguments are the commands
If no command is given after parallel, the arguments themselves are treated as commands:
parallel ::: ls 'echo foo' pwd
Output:
[list of files in the current directory]
foo
[path to the current working directory]
The command can be a script, a binary executable, or a Bash function (the function must be exported with export -f):
# Only works in Bash and only if $SHELL=.../bash
my_func() {
    echo in my_func $1
}
export -f my_func
parallel my_func ::: 1 2 3
Output:
in my_func 1
in my_func 2
in my_func 3
Replacement strings
The 5 replacement strings
GNU Parallel supports several replacement strings. {} is used by default:
parallel echo ::: A/B.C
Output:
A/B.C
Specify {} explicitly:
parallel echo {} ::: A/B.C
Output is the same as above.
Remove the extension with {.}:
parallel echo {.} ::: A/B.C
Output:
A/B
Remove the path with {/}:
parallel echo {/} ::: A/B.C
Output:
B.C
Keep only the path with {//}:
parallel echo {//} ::: A/B.C
Output:
A
Remove the path and the extension with {/.}:
parallel echo {/.} ::: A/B.C
Output:
B
Output the job number with {#}:
parallel echo {#} ::: A B C
Output (the order may differ):
1
2
3
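For readers who think in shell terms, the path-related replacement strings map roughly onto parameter expansion and dirname. The sketch below shows that correspondence for the sample path A/B.C. The variable names are chosen for this illustration only.

```shell
# Plain-shell counterparts of the path-related replacement strings,
# applied to the sample path used in this section.
path=A/B.C

whole=$path                 # {}   -> A/B.C
no_ext=${path%.*}           # {.}  -> A/B
base=${path##*/}            # {/}  -> B.C
dir=$(dirname "$path")      # {//} -> A
base_no_ext=${base%.*}      # {/.} -> B

printf '%s\n' "$whole" "$no_ext" "$base" "$dir" "$base_no_ext"
```

The mapping holds for simple paths like this one. For unusual paths, for example a dot in a directory name but none in the file name, ${path%.*} may differ from what {.} produces, so treat it as an approximation.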
Changing the replacement strings
Use -I to change the replacement string {}:
parallel -I ,, echo ,, ::: A/B.C
Output:
A/B.C
--extensionreplace replaces {.}:
parallel --extensionreplace ,, echo ,, ::: A/B.C
Output:
A/B
--basenamereplace replaces {/}:
parallel --basenamereplace ,, echo ,, ::: A/B.C
Output:
B.C
--dirnamereplace replaces {//}:
parallel --dirnamereplace ,, echo ,, ::: A/B.C
Output:
A
--basenameextensionreplace replaces {/.}:
parallel --basenameextensionreplace ,, echo ,, ::: A/B.C
Output:
B
--seqreplace replaces {#}:
parallel --seqreplace ,, echo ,, ::: A B C
Output:
1
2
3
Positional replacement strings
With more than one input source, {number} selects the argument from the corresponding input source:
parallel echo {1} and {2} ::: A B ::: C D
Output:
A and C
A and D
B and C
B and D
A position can be combined with /, //, /., and .:
parallel echo /={1/} //={1//} /.={1/.} .={1.} ::: A/B.C D/E.F
Output:
/=B.C //=A /.=B .=A/B
/=E.F //=D /.=E .=D/E
A position can be negative, counting from the end:
parallel echo 1={1} 2={2} 3={3} -1={-1} -2={-2} -3={-3} ::: A B ::: C D ::: E F
Output:
1=A 2=C 3=E -1=E -2=C -3=A
1=A 2=C 3=F -1=F -2=C -3=A
1=A 2=D 3=E -1=E -2=D -3=A
1=A 2=D 3=F -1=F -2=D -3=A
1=B 2=C 3=E -1=E -2=C -3=B
1=B 2=C 3=F -1=F -2=C -3=B
1=B 2=D 3=E -1=E -2=D -3=B
1=B 2=D 3=F -1=F -2=D -3=B
Input by columns
Use --colsep to split each line of a file into columns that are used as input arguments. With TAB (\t) as the separator:
parallel --colsep '\t' echo 1={1} 2={2} :::: tsv-file.tsv
Output:
1=f1 2=f2
1=A 2=B
1=C 2=D
Naming the arguments
Use --header to treat the first value of each input source as the argument name:
parallel --header : echo f1={f1} f2={f2} ::: f1 A B ::: f2 C D
Output:
f1=A f2=C
f1=A f2=D
f1=B f2=C
f1=B f2=D
Combine it with --colsep to process files that use TAB as the separator:
parallel --header : --colsep '\t' echo f1={f1} f2={f2} :::: tsv-file.tsv
Output:
f1=A f2=B
f1=C f2=D
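As a rough plain-shell picture of column-wise input with a header line, the sketch below reads the same tab-separated data as tsv-file.tsv, skips the header, and prints the two columns by name. The function name print_columns is hypothetical, and the column names f1 and f2 are hard-coded here, whereas GNU Parallel takes them from the header itself.

```shell
# A literal TAB character, obtained portably.
tab=$(printf '\t')

# Hypothetical helper: discard the header line, then split every
# remaining line on TAB into two named columns.
print_columns() {
    read -r _header
    while IFS=$tab read -r f1 f2; do
        echo "f1=$f1 f2=$f2"
    done
}

printf 'f1\tf2\nA\tB\nC\tD\n' | print_columns
```

Unlike this loop, GNU Parallel would start one job per data line and run those jobs concurrently.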
Multiple arguments
--xargs makes GNU Parallel put as many arguments as possible on each command line (an upper limit can be specified):
cat num30000 | parallel --xargs echo | wc -l
Output:
2
The 30000 arguments are split across two lines.
The maximum length of a line is set with -s. With a maximum length of 10000 characters, the arguments are split across 17 lines:
cat num30000 | parallel --xargs -s 10000 echo | wc -l
For better parallelism, GNU Parallel can distribute the arguments among the jobs once it reaches the end of the input.
In the example below, GNU Parallel reads the last argument while building the second job. At that point it spreads the arguments of that second job evenly over four jobs, because four parallel jobs were requested.
The first job is the same as in the --xargs example above, but the second job is split into four evenly sized jobs, giving five jobs in total.
cat num30000 | parallel --jobs 4 -m echo | wc -l
Output:
5
Distributing 10 arguments over 4 jobs shows this more clearly:
parallel --jobs 4 -m echo ::: {1..10}
Output:
1 2 3
4 5 6
7 8 9
10
The replacement string can be part of a word. The following two commands show the difference between -m and -X. With -m the surrounding context is not repeated:
parallel --jobs 4 -m echo pre-{}-post ::: A B C D E F G
Output:
pre-A B-post
pre-C D-post
pre-E F-post
pre-G-post
-X works like -m but repeats the context for every argument:
parallel --jobs 4 -X echo pre-{}-post ::: A B C D E F G
Output:
pre-A-post pre-B-post
pre-C-post pre-D-post
pre-E-post pre-F-post
pre-G-post
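The difference between the two options comes down to how a group of arguments is substituted into a word. The sketch below mimics both substitutions for a single group of arguments in plain shell. The helper names m_style and x_style are invented for this illustration and say nothing about how GNU Parallel is implemented.

```shell
# -m style: the whole argument group replaces {} once, so the
# surrounding "pre-" and "-post" appear only at the two ends.
m_style() {
    echo "pre-$*-post"
}

# -X style: the surrounding context is repeated for every argument.
x_style() {
    out=
    for arg in "$@"; do
        out="$out${out:+ }pre-$arg-post"
    done
    echo "$out"
}

m_style A B    # pre-A B-post
x_style A B    # pre-A-post pre-B-post
```

In practice -X is usually the better fit when every argument needs its own prefix or suffix, for example a path prefix or a file extension.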
Use -N to limit the number of arguments per command line:
parallel -N3 echo ::: A B C D E F G H
Output:
A B C
D E F
G H
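Readers who know xargs may recognize this grouping: xargs -n 3 batches arguments in the same way, only serially. The sketch below reproduces the three lines of output with xargs alone. The wrapper name group_by_three is an assumption made for the example.

```shell
# Hypothetical wrapper: hand the arguments read from standard input
# to echo in batches of at most three.
group_by_three() {
    xargs -n 3 echo
}

printf '%s\n' A B C D E F G H | group_by_three
```

The grouping is the same. What GNU Parallel adds is that the batches can run at the same time.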
-N also sets the number of positional replacement strings:
parallel -N3 echo 1={1} 2={2} 3={3} ::: A B C D E F G H
Output:
1=A 2=B 3=C
1=D 2=E 3=F
1=G 2=H 3=
-N0 reads one argument but does not insert it:
parallel -N0 echo foo ::: 1 2 3
Output:
foo
foo
foo
Quoting
If the command line contains special characters, they need to be protected by quoting.
The perl script 'print "@ARGV\n"' does the same as Linux's echo.
perl -e 'print "@ARGV\n"' A
Output:
A
To run this command through GNU Parallel, the perl command has to be quoted. Without quoting it does not work:
parallel perl -e 'print "@ARGV\n"' ::: This wont work
Output:
[Nothing]
Use -q to protect the perl command:
parallel -q perl -e 'print "@ARGV\n"' ::: This works
Output:
This
works
You can also quote the critical part with \':
parallel perl -e \''print "@ARGV\n"'\' ::: This works, too
Output:
This
works,
too
Use --shellquote to have GNU Parallel quote a line for you:
parallel --shellquote
parallel: Warning: Input is read from the terminal. Only experts do this on purpose. Press CTRL-D to exit.
perl -e 'print "@ARGV\n"'
[CTRL-D]
Output:
perl\ -e\ \'print\ \"@ARGV\\n\"\'
You can then use the quoted command:
parallel perl\ -e\ \'print\ \"@ARGV\\n\"\' ::: This also works
Output:
This
also
works
Trimming spaces
Use --trim to remove spaces around an argument. Remove the space on the right:
parallel --trim r echo pre-{}-post ::: ' A '
Output:
pre- A-post
Remove the space on the left:
parallel --trim l echo pre-{}-post ::: ' A '
Output:
pre-A -post
Remove the spaces on both sides:
parallel --trim lr echo pre-{}-post ::: ' A '
Output:
pre-A-post
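The three trimming modes can be imitated in plain shell with sed, which makes it easy to see exactly which space disappears in each case. The helper names below are hypothetical, and the sketch strips any leading or trailing whitespace from a single argument.

```shell
# Hypothetical helpers mirroring --trim l, --trim r, and --trim lr.
trim_left()  { printf '%s\n' "$1" | sed 's/^[[:space:]]*//'; }
trim_right() { printf '%s\n' "$1" | sed 's/[[:space:]]*$//'; }
trim_both()  { trim_right "$(trim_left "$1")"; }

echo "pre-$(trim_right ' A ')-post"    # space on the left survives
echo "pre-$(trim_left ' A ')-post"     # space on the right survives
echo "pre-$(trim_both ' A ')-post"     # no spaces left
```

Trimming matters most when arguments come from files or columns that carry stray padding around their values.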
Controlling the output
Use the argument as an output prefix with --tag:
parallel --tag echo foo-{} ::: A B C
Output:
A foo-A
B foo-B
C foo-C
Change the output prefix with --tagstring:
parallel --tagstring {}-bar echo foo-{} ::: A B C
Output:
A-bar foo-A
B-bar foo-B
C-bar foo-C
That covers how to use GNU Parallel correctly. I hope the material above is of some help.