How to install and configure SOAPdenovo2

Updated 2025-01-19. Source: SLTechnology News&Howtos (shulou)

This article explains how to install and configure SOAPdenovo2: building it from source, choosing an executable and a subcommand, writing the configuration file, and running an assembly.

SOAPdenovo is a genome assembler developed by BGI (Huada). It is mainly used to assemble large genomes, such as those of animals and plants, and it can also assemble bacterial and fungal genomes. Assembling a large genome requires substantial hardware resources; more than 150 GB of memory is recommended.

The installation process is as follows

wget https://github.com/aquaskyline/SOAPdenovo2/archive/r241.tar.gz
tar xzvf r241.tar.gz
cd SOAPdenovo2-r241/
make

After the compilation is successful, the following three executable files are generated

SOAPdenovo-63mer

SOAPdenovo-127mer

SOAPdenovo-fusion
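
A quick way to confirm the build is to check that all three files exist and are executable. The following is a minimal sketch; check_soapdenovo_build is a hypothetical helper name and is not part of SOAPdenovo2.

```shell
#!/usr/bin/env bash
# Report whether the three executables named above exist in a build directory.
# check_soapdenovo_build is a hypothetical helper, not part of SOAPdenovo2.
check_soapdenovo_build() {
    local dir="${1:-.}"
    local missing=0
    local name
    for name in SOAPdenovo-63mer SOAPdenovo-127mer SOAPdenovo-fusion; do
        if [ -x "$dir/$name" ]; then
            echo "ok       $name"
        else
            echo "missing  $name"
            missing=1
        fi
    done
    # Return 0 only when every executable was found.
    return "$missing"
}

# Example: check the current directory without aborting the shell on failure.
check_soapdenovo_build . || echo "build looks incomplete"
```

Run it from inside the SOAPdenovo2-r241 directory, or pass the build directory as the first argument.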

The first two executable files are used for assembly. SOAPdenovo-63mer supports k-mer lengths up to 63, and SOAPdenovo-127mer supports k-mer lengths up to 127. Apart from the supported k-mer length, the two are used in exactly the same way.
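
The choice between the two executables can be scripted from the k-mer length alone. This is a small sketch based only on the two maximums stated above; pick_soapdenovo_binary is a hypothetical helper name.

```shell
# pick_soapdenovo_binary is a hypothetical helper: it maps a k-mer length
# to the executable whose stated maximum covers it.
pick_soapdenovo_binary() {
    local k="$1"
    # Reject empty or non-numeric input before doing integer comparisons.
    case "$k" in
        ''|*[!0-9]*) echo "k-mer length must be an integer" >&2; return 1 ;;
    esac
    if [ "$k" -le 63 ]; then
        echo "SOAPdenovo-63mer"
    elif [ "$k" -le 127 ]; then
        echo "SOAPdenovo-127mer"
    else
        echo "k-mer length $k exceeds the maximum of 127" >&2
        return 1
    fi
}

pick_soapdenovo_binary 63    # prints SOAPdenovo-63mer
pick_soapdenovo_binary 91    # prints SOAPdenovo-127mer
```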

SOAPdenovo provides the following subcommands

pregraph

sparse_pregraph

contig

map

scaff

all

The first five subcommands correspond to the individual steps of a SOAPdenovo assembly, with sparse_pregraph serving as an alternative to pregraph. During assembly you can either run the steps one at a time or use the all subcommand to run them in a single invocation.
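
Running the steps one at a time might look like the sketch below, which prints the commands as a dry run unless DRY_RUN is set to 0. The -g (graph prefix) and -F (fill gaps) options are written here as recalled from the upstream README and are assumptions; check them against the help output of your own build. config_file and graph_prefix are placeholder names, and run_step and run_assembly_steps are hypothetical helpers.

```shell
# Stepwise alternative to the "all" subcommand, printed as a dry run by default.
# Assumption: the -g and -F options follow the upstream README; verify them
# against the help output of your build before running for real.
BIN="${BIN:-SOAPdenovo-63mer}"
CONFIG="${CONFIG:-config_file}"      # placeholder config path
PREFIX="${PREFIX:-graph_prefix}"     # placeholder output prefix
K="${K:-63}"
DRY_RUN="${DRY_RUN:-1}"

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"          # only show the command
    else
        "$@"               # actually execute it
    fi
}

run_assembly_steps() {
    # Each step runs only if the previous one succeeded.
    run_step "$BIN" pregraph -s "$CONFIG" -K "$K" -R -o "$PREFIX" &&
    run_step "$BIN" contig -g "$PREFIX" -R &&
    run_step "$BIN" map -s "$CONFIG" -g "$PREFIX" &&
    run_step "$BIN" scaff -g "$PREFIX" -F
}

run_assembly_steps
```

Running the steps separately makes it possible to rerun a later step, for example scaff, without rebuilding the graph.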

SOAPdenovo requires a configuration file, which has two parts: the global configuration and the configuration of each library. The global configuration currently has only one parameter, max_rd_len; any read longer than this value is truncated to this length before analysis.

The configuration of each library begins with [LIB] and mainly specifies the paths to the input files. Several input formats are supported and are distinguished by the key prefix: q for FASTQ, f for FASTA, and b for BAM. For paired-end data, the suffixes 1 and 2 mark the R1 and R2 read files, for example q1 and q2.

Besides the input file paths, each library section accepts the following parameters

avg_ins

The average insert size of the library. You can take the peak of the library's insert size distribution as the value.

reverse_seq

Whether the reads need to be reverse complemented. Paired-end data does not need it, so set this to 0; mate-pair data does, so set this to 1.

asm_flags

Which parts of the assembly the library is used for: 1 means contig assembly only, 2 means scaffold assembly only, 3 means both contig and scaffold assembly, and 4 means gap closure only.

rd_len_cutoff

A read length threshold with the same effect as max_rd_len, applied to this library: reads longer than this value are truncated to this length.

rank

The order in which libraries are used, given as an integer. Libraries with the same rank value are used together when assembling scaffolds.

pair_num_cutoff

The minimum number of read pairs required to support a connection between two contigs or scaffolds. The default is 3 for paired-end data and 5 for mate-pair data.

map_len

The minimum alignment length required when mapping reads. The default is 32 for paired-end data and 35 for mate-pair data.

Examples of configuration files are as follows

max_rd_len=100
[LIB]
avg_ins=200
reverse_seq=0
asm_flags=3
rd_len_cutoff=100
rank=1
q1=fastq1_read_1.fq
q2=fastq1_read_2.fq
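
A configuration with more than one library can be written the same way. The sketch below creates a file with one paired-end and one mate-pair library, using the parameter values and defaults described above. The file name soapdenovo.config, the fastq2_* read file names, and avg_ins=2000 are illustrative assumptions, as is the use of # comment lines, which the upstream example configuration appears to allow.

```shell
# Write a two-library config. File names are placeholders, and avg_ins=2000
# for the mate-pair library is only an illustrative value.
CONFIG="${CONFIG:-soapdenovo.config}"

cat > "$CONFIG" <<'EOF'
# global setting: reads longer than this are truncated
max_rd_len=100

# paired-end library, used for contigs and scaffolds
[LIB]
avg_ins=200
reverse_seq=0
asm_flags=3
rd_len_cutoff=100
rank=1
pair_num_cutoff=3
map_len=32
q1=fastq1_read_1.fq
q2=fastq1_read_2.fq

# mate-pair library, used for scaffolds only, after the rank=1 library
[LIB]
avg_ins=2000
reverse_seq=1
asm_flags=2
rank=2
pair_num_cutoff=5
map_len=35
q1=fastq2_read_1.fq
q2=fastq2_read_2.fq
EOF

grep -c '^\[LIB\]' "$CONFIG"    # counts the library sections; prints 2
```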

The basic usage of the software is as follows

SOAPdenovo-63mer all -s config_file -K 63 -R -o graph_prefix

After a successful run, several files are generated. Two of them hold the assembly results: the file with the .contig suffix contains the contigs, and the file with the .scafSeq suffix contains the scaffolds.
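
To confirm that both result files were written, a short check such as the following may help. It is only a sketch: summarize_assembly is a hypothetical helper, graph_prefix is the placeholder prefix from the command above, and the sequence count assumes the two files are FASTA, so that every sequence starts with a header line beginning with ">".

```shell
# summarize_assembly is a hypothetical helper. It assumes the two result
# files are FASTA, so each sequence starts with a '>' header line.
summarize_assembly() {
    local prefix="$1"
    local ext
    local file
    for ext in contig scafSeq; do
        file="$prefix.$ext"
        if [ -s "$file" ]; then
            # Count header lines to get the number of sequences.
            echo "$file: $(grep -c '^>' "$file") sequences"
        else
            echo "$file: missing or empty" >&2
            return 1
        fi
    done
}

summarize_assembly graph_prefix || echo "assembly output not found"
```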
