2025-03-26 Update. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces SpeedSeq, a software package for genome data analysis, and walks through its main modules and how each one is used.
SpeedSeq is open source software for analyzing variation in genomic data. Its main functions are as follows
Alignment: aligning sequencing reads to a reference genome
Variant detection: calling variants from the aligned reads
Functional annotation: annotating the function of the variant sites
The most important feature of the software is its speed. For 50x human genome data, going from the raw fastq files to a vcf file takes only about 13 hours. The corresponding article was published in Nature Methods. The link is as follows.
http://ucgd.genetics.utah.edu/wp-content/uploads/2015/08/nmeth.3505.pdf
The software is a complete pipeline that integrates several existing tools and can be used to detect the following types of genomic variation
Germline and somatic mutations, using freebayes to detect small variants (SNVs and indels)
Structural variants, using lumpy-sv to detect structural variation
The flow chart is as follows (figure not reproduced)
The source code is hosted on GitHub at the following link
https://github.com/hall-lab/speedseq
By function, the software is divided into the following five sub-modules
1. align
This module aligns paired-end fastq data to the reference genome and then marks duplicates, sorts, and indexes the result, which matches the data preprocessing steps in the GATK workflow.
speedseq align \
-R "@RG\tID:sample1\tSM:sample1\tLB:sample1" \
-t 10 \
-o sample1 \
hg19.fa \
sample1_R1.fastq.gz \
sample1_R2.fastq.gz
bwa aligns the reads to the reference genome, samblaster then marks duplicates, and sambamba sorts the bam files.
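To illustrate the -R argument used above, here is a minimal sketch that builds the read group string from a sample name and prints the matching align command without running it. The function names make_read_group and print_align_cmd, and the <sample>_R1/_R2.fastq.gz naming pattern, are assumptions made for this example and are not part of speedseq.

```shell
# Build a read group header from a sample name (hypothetical helper).
# The "\t" separators stay as literal backslash-t, as in the command above.
make_read_group() {
    printf '%s' "@RG\\tID:$1\\tSM:$1\\tLB:$1"
}

# Print, but do not execute, the align command for one sample.
# Arguments: sample name, reference fasta, thread count.
# Assumes the fastq files are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
print_align_cmd() {
    sample=$1
    ref=$2
    threads=$3
    printf 'speedseq align -R "%s" -t %s -o %s %s %s_R1.fastq.gz %s_R2.fastq.gz\n' \
        "$(make_read_group "$sample")" "$threads" "$sample" "$ref" "$sample" "$sample"
}

# Example: print_align_cmd sample1 hg19.fa 10
```

Printing the command first makes it easy to review the read group of each sample before starting a long alignment job.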
2. var
This module detects germline variants. Its input is the bam file generated by the align module. The usage is as follows
speedseq var \
-t 10 \
hg19.fa \
sample1.bam
freebayes is used to call the germline variants, and the output is a VCF file.
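A quick sanity check on the VCF output is to count its variant records, that is, the lines that do not start with "#". The sketch below is one way to do this. The helper name count_variants is hypothetical, and whether the output is plain or gzip-compressed is not stated here, so the function handles both.

```shell
# Count variant records (non-header lines) in a VCF file.
# Handles plain .vcf and gzip-compressed .vcf.gz files.
count_variants() {
    case $1 in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep -vc '^#'
}

# Example (the file name is an assumption): count_variants sample1.vcf.gz
```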
3. somatic
This module detects somatic mutations. Its input is the bam files generated by the align module. The usage is as follows
speedseq somatic \
-t 10 \
-o tumor \
hg19.fa \
normal.bam \
tumor.bam
freebayes is used to call the somatic mutations. Paired tumor and normal samples are required, and the output is a VCF file.
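When there are several tumor/normal pairs, the commands can be generated from a small table. The following sketch reads whitespace-separated "normal tumor" bam pairs from standard input and prints one somatic command per pair, keeping the normal-then-tumor order used above. The function name print_somatic_cmds and the choice of the tumor file name as the output prefix are assumptions for illustration.

```shell
# Read "normal.bam tumor.bam" pairs from stdin, one pair per line,
# and print (not execute) one speedseq somatic command per pair.
# Arguments: reference fasta, thread count.
print_somatic_cmds() {
    ref=$1
    threads=$2
    while read -r normal tumor; do
        # Skip blank or incomplete lines.
        [ -n "$normal" ] && [ -n "$tumor" ] || continue
        # Use the tumor file name without .bam as the output prefix.
        prefix=$(basename "$tumor" .bam)
        printf 'speedseq somatic -t %s -o %s %s %s %s\n' \
            "$threads" "$prefix" "$ref" "$normal" "$tumor"
    done
}

# Example: print_somatic_cmds hg19.fa 10 < pairs.txt
```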
4. sv
This module detects structural variation. The usage is as follows
speedseq sv \
-o sample \
-B sample.bam \
-D sample.discordants.bam \
-S sample.splitters.bam \
-R hg19.fa \
-t 10
lumpy-sv is used to detect the structural variation, and the output is a VCF file.
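The sv command takes three bam files: the full alignments, the discordant reads, and the split reads. Here is a minimal sketch of a pre-flight check, assuming all three share one prefix as in the command above. The helper name missing_sv_inputs is hypothetical.

```shell
# Print the sv input files that are missing for a given prefix.
# Prints nothing when <prefix>.bam, <prefix>.discordants.bam and
# <prefix>.splitters.bam all exist.
missing_sv_inputs() {
    for suffix in bam discordants.bam splitters.bam; do
        [ -f "$1.$suffix" ] || printf '%s\n' "$1.$suffix"
    done
}

# Example (not executed here):
#   if [ -z "$(missing_sv_inputs sample)" ]; then
#       speedseq sv -o sample -B sample.bam -D sample.discordants.bam \
#           -S sample.splitters.bam -R hg19.fa -t 10
#   fi
```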
5. realign
This module extracts the paired-end fastq reads from a bam file and then applies the same processing as the align module. The usage is as follows.
speedseq realign \
-t 10 \
-o sample \
hg19.fa \
sample.bam
The bam file must contain read group information, and the output files are the same as those of the align module. For whole-genome data analysis, speedseq can greatly speed up processing.
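Since realign needs read group information, it can help to check the bam header before starting. The sketch below tests whether a SAM header read from standard input contains an @RG line. The helper name has_read_group is hypothetical, and the example assumes samtools is installed to print the header.

```shell
# Succeed (exit status 0) if the SAM header on stdin has at least one @RG line.
has_read_group() {
    grep -q '^@RG'
}

# Example (not executed here; samtools view -H prints only the header):
#   if samtools view -H sample.bam | has_read_group; then
#       speedseq realign -t 10 -o sample hg19.fa sample.bam
#   else
#       echo "sample.bam has no read group information" >&2
#   fi
```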