Updated 2025-03-26. From: SLTechnology News&Howtos (shulou)
Shulou (Shulou.com), 06/02 report
Many newcomers are unsure how to use ANNOVAR correctly. This article explains the process step by step: downloading the software, preparing the input file, downloading the annotation databases, and running the annotation.
Download and install annovar
ANNOVAR is written in Perl, so it runs on any system with Perl installed and needs no installation step: download the archive, decompress it, and it is ready to use. Downloading requires registration with an email address from an educational or research institution. ANNOVAR supports three main types of annotation:
1. Gene-based annotation: determines, from the position of an SNP or CNV, whether the variant changes the protein-coding sequence and the resulting amino acids.
2. Region-based annotation: identifies variants that fall within specific genomic regions.
3. Filter-based annotation: identifies variants that are recorded in a specific database.
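As a rough sketch, each of the three annotation types corresponds to a mode flag of annotate_variation.pl (-geneanno, -regionanno, -filter). The script below only prints the commands and does not run them. The input name G.avinput and the cytoBand database are illustrative assumptions; refGene and avsnp147 are databases mentioned elsewhere in this article.

```shell
#!/bin/sh
# Dry run: build (but do not execute) one annotate_variation.pl call per annotation type.
# G.avinput and cytoBand are placeholders chosen for illustration.
build_cmd() {
    # $1 = mode flag, $2 = database name
    printf 'perl annotate_variation.pl %s -dbtype %s -buildver hg19 G.avinput humandb/\n' "$1" "$2"
}

gene_cmd=$(build_cmd -geneanno refGene)       # gene-based
region_cmd=$(build_cmd -regionanno cytoBand)  # region-based
filter_cmd=$(build_cmd -filter avsnp147)      # filter-based

printf '%s\n' "$gene_cmd" "$region_cmd" "$filter_cmd"
```

Each database named with -dbtype has to be present in humandb/ before the corresponding command can run.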
After downloading and decompressing ANNOVAR, you will find the following main files and directories:
example: contains sample files
humandb: holds the annotation database files; some ship with ANNOVAR, and others can be downloaded as your research requires
annotate_variation.pl: the main program, used to download databases and to run each type of annotation
coding_change.pl: infers whether and how the protein sequence changes
convert2annovar.pl: converts other formats into the input format ANNOVAR recognizes (for example, VCF files)
retrieve_seq_from_fasta.pl: builds transcript sequence files yourself, for example for other species
table_annovar.pl: runs the three types of annotation in a single pass
variants_reduction.pl: used to customize the variant filtering workflow
-Input file-
The ANNOVAR input file is a simple text file. Its first five columns must be, in order: the chromosome, the start position of the variant, the end position of the variant, the reference base(s), and the alternative (mutated) base(s). Any further columns are optional.
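The five-column layout can be checked before annotation with a few lines of shell. The sketch below writes a small made-up file (the variants are illustrative, not real results, and the third line is malformed on purpose) and counts the lines that do not fit the layout. It assumes tab-separated columns; the check itself is not part of ANNOVAR.

```shell
#!/bin/sh
# Write a tiny made-up input file (tab-separated; illustrative values only).
tmp=$(mktemp)
printf '1\t948921\t948921\tT\tC\tcomment: optional extra column\n' > "$tmp"
printf '1\t1404001\t1404001\tG\tT\n' >> "$tmp"
printf '2\t234183368\tA\tG\n' >> "$tmp"   # malformed on purpose: end position missing

# Count lines that break the layout: fewer than five columns,
# non-numeric start or end, or an end position before the start.
n_bad=$(awk -F '\t' 'NF < 5 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $3 + 0 < $2 + 0 { bad++ } END { print bad + 0 }' "$tmp")
echo "malformed lines: $n_bad"
rm -f "$tmp"
```

Running the same awk command on a real input file is a cheap way to catch shifted or missing columns early.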
If your variants are in a VCF file, use ANNOVAR's convert2annovar.pl program to convert it into the format ANNOVAR recognizes. The command is:
perl convert2annovar.pl -format vcf4 G-001.vcf -outfile G.avinput
The output file uses the five-column layout described above.
-Database download-
ANNOVAR's annotations depend on databases, so download the databases you need into the humandb folder before running the analysis:
perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar avsnp147 humandb/
-buildver: the version of the reference genome
-downdb -webfrom annovar: download the named database from the ANNOVAR repository. If you are not sure which database to download, the available databases and their purposes are listed at http://annovar.openbioinformatics.org/en/latest/user-guide/download/
avsnp147: the name of the database to download
humandb/: the folder to download into
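When several databases are needed, the same download command can be repeated in a loop. The sketch below prints the commands by default and runs them only when DRY_RUN=0 is set. The DRY_RUN variable and the database list are assumptions made for illustration; in particular, it assumes the 1000 Genomes population files are fetched as the single package 1000g2015aug.

```shell
#!/bin/sh
# Print the download command for each database; set DRY_RUN=0 to actually run them
# (that needs annotate_variation.pl in the current directory and network access).
DRY_RUN=${DRY_RUN:-1}
count=0
for db in refGene avsnp147 1000g2015aug; do
    # Store the command as positional parameters so it can be printed or executed.
    set -- perl annotate_variation.pl -buildver hg19 -downdb -webfrom annovar "$db" humandb/
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
    count=$((count + 1))
done
echo "databases listed: $count"
```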
-Annotation-
Once the input file is in the right format and the databases are downloaded, you can annotate the variants. The following table_annovar.pl command illustrates the annotation workflow:
perl table_annovar.pl GCK.avinput annovar/humandb/ -buildver hg19 -out GCK -remove -protocol refGene,1000g2015aug_eas,1000g2015aug_eur,1000g2015aug_sas,1000g2015aug_amr -operation g,f,f,f,f -nastring .
table_annovar.pl GCK.avinput: the script followed by the input file
annovar/humandb/: the folder containing the downloaded databases
-buildver: the reference genome version
-out: the prefix of the output file
-remove: delete the intermediate files generated during the run
-protocol: the names of the databases to use (for example 1000 Genomes or dbSNP), separated by commas
-operation: the annotation type for each database, in the same order as -protocol and corresponding to it one to one (g = gene-based, r = region-based, f = filter-based)
-nastring .: fill fields that have no annotation with "."
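Because -protocol and -operation must line up one to one, it can help to count both lists before launching a long run. The sketch below assumes the operation list for the five databases above is g,f,f,f,f (one gene-based entry followed by four filter-based ones) and prints the command rather than executing it. The check is an illustration, not something ANNOVAR provides.

```shell
#!/bin/sh
protocol="refGene,1000g2015aug_eas,1000g2015aug_eur,1000g2015aug_sas,1000g2015aug_amr"
operation="g,f,f,f,f"   # assumed: gene-based for refGene, filter-based for the rest

# Count the comma-separated entries in each list.
n_protocol=$(printf '%s\n' "$protocol" | awk -F ',' '{ print NF }')
n_operation=$(printf '%s\n' "$operation" | awk -F ',' '{ print NF }')

if [ "$n_protocol" -ne "$n_operation" ]; then
    echo "protocol has $n_protocol entries but operation has $n_operation" >&2
    status=mismatch
else
    status=ok
    # Dry run: print the command instead of executing it.
    echo "perl table_annovar.pl GCK.avinput annovar/humandb/ -buildver hg19 -out GCK -remove -protocol $protocol -operation $operation -nastring ."
fi
```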