What kind of software is ANNOVAR

2025-03-30 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

This article explains in detail what kind of software ANNOVAR is. It is a very practical tool, so this overview is shared here as a reference.

ANNOVAR is a variant annotation tool that offers several kinds of annotation and supports multiple species. It is one of the most popular annotation tools.

The following six species are supported:

Human

Mouse

Worm

Fly

Yeast

Others

ANNOVAR is free to download for academic and non-profit organizations; all you need to do is sign up for an account. After downloading, unzip the archive. The extracted files are as follows:

├── annotate_variation.pl
├── coding_change.pl
├── convert2annovar.pl
├── example
├── humandb
├── retrieve_seq_from_fasta.pl
├── table_annovar.pl
└── variants_reduction.pl

ANNOVAR is written in Perl and consists of a series of Perl scripts. annotate_variation.pl is the core script.

After installation, the first step is to download the relevant database with the following command:

annotate_variation.pl -downdb -buildver hg19 -webfrom annovar refGene humandb/

-downdb indicates that the purpose of the command is to download a database, -buildver specifies the genome build (default: hg18), -webfrom specifies the download source (default: ucsc), refGene is the name of the database to download, and humandb/ is the directory where the database is stored.
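When several databases have to be fetched, it can help to assemble the command from its parts. The following is a minimal sketch in Python, assuming annotate_variation.pl is on the PATH; the helper name build_downdb_command is hypothetical and only uses the options described above.

```python
from typing import List


def build_downdb_command(database: str,
                         outdir: str,
                         buildver: str = "hg18",
                         webfrom: str = "ucsc") -> List[str]:
    """Assemble the argument list for a database download.

    Hypothetical helper: the defaults mirror the ones stated in the text
    (hg18 for -buildver, ucsc for -webfrom).
    """
    return [
        "annotate_variation.pl",
        "-downdb",
        "-buildver", buildver,
        "-webfrom", webfrom,
        database,
        outdir,
    ]


cmd = build_downdb_command("refGene", "humandb/", buildver="hg19", webfrom="annovar")
print(" ".join(cmd))
# prints: annotate_variation.pl -downdb -buildver hg19 -webfrom annovar refGene humandb/
```

The list can be handed to subprocess.run(cmd, check=True) on a machine where ANNOVAR is installed; the sketch itself only builds and prints the command.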

For a given version of the reference genome, you can list all of its available databases with the following command:

annotate_variation.pl -downdb -buildver hg19 -webfrom annovar avdblist hg19_list/

The usage is the same as in the first example, except that the database name is given as avdblist. After a successful download, a corresponding avdblist.txt file is created. Part of its content is shown below:

hg19_abraom.txt.gz 20180312 23198051
hg19_abraom.txt.idx.gz 20180312 9837067
hg19_AFR.sites.2012_04.txt.gz 20140106 277370590
hg19_AFR.sites.2012_04.txt.idx.gz 20140106 22362560
hg19_ALL.sites.2010_11.txt.gz 20140106 179456415
hg19_ALL.sites.2010_11.txt.idx.gz 20140106 21944132
hg19_ALL.sites.2011_05.txt.gz 20140106 232127231

Each line represents one database file: the file name, followed by a date stamp and a file size.
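A listing like this is easy to inspect programmatically. The sketch below assumes every line has exactly three whitespace-separated columns and that the second and third are a date stamp and a size in bytes; the names DbFile and parse_avdblist are hypothetical.

```python
from typing import List, NamedTuple


class DbFile(NamedTuple):
    name: str
    date: str   # date stamp as written in the listing, e.g. "20180312"
    size: int   # third column, assumed here to be a size in bytes


def parse_avdblist(text: str) -> List[DbFile]:
    """Parse the text of an avdblist listing into DbFile records."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        name, date, size = fields
        entries.append(DbFile(name, date, int(size)))
    return entries


sample = """\
hg19_abraom.txt.gz 20180312 23198051
hg19_abraom.txt.idx.gz 20180312 9837067
hg19_AFR.sites.2012_04.txt.gz 20140106 277370590
"""

entries = parse_avdblist(sample)
# Keep only the data files, not their .idx companions.
data_files = [e.name for e in entries if not e.name.endswith(".idx.gz")]
print(data_files)
# prints: ['hg19_abraom.txt.gz', 'hg19_AFR.sites.2012_04.txt.gz']
```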

Once the databases for the reference genome are ready, you can start annotating variants.

The first step is to prepare the input file, which can be in one of two formats:

1. avinput

This is ANNOVAR's own format. Columns are separated by spaces or tabs, and at least 5 columns are required: the chromosome, the start position, the end position, the reference base(s), and the alternate base(s). Any further columns are treated as additional information. Examples are as follows:

1 948921 948921 T C comments: rs15842, a SNP in 5' UTR of ISG15
1 13211293 13211294 TC - comments: rs59770105, a 2-bp deletion
1 11403596 11403596 - AT comments: rs35561142, a 2-bp insertion

2. VCF

The VCF format was introduced in the previous article, so it is not repeated here. VCF is the standard format for variant analysis, and most software can output it.
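To make the five-column avinput layout from format 1 concrete, here is a small sketch that parses one line and labels the variant type. It assumes, as in the examples above, that a "-" in the reference column marks an insertion and a "-" in the alternate column marks a deletion. The names Variant, parse_avinput_line and variant_kind are hypothetical and not part of ANNOVAR.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    start: int
    end: int
    ref: str
    alt: str
    extra: str  # everything after the fifth column, kept as one string


def parse_avinput_line(line: str) -> Variant:
    """Split one avinput line on spaces or tabs into its five required columns."""
    fields = line.split(None, 5)  # at most 6 pieces; the remainder stays whole
    if len(fields) < 5:
        raise ValueError("avinput needs at least 5 columns: " + line)
    chrom, start, end, ref, alt = fields[:5]
    extra = fields[5].strip() if len(fields) > 5 else ""
    return Variant(chrom, int(start), int(end), ref, alt, extra)


def variant_kind(v: Variant) -> str:
    """Classify a variant from its ref/alt columns."""
    if v.ref == "-":
        return "insertion"
    if v.alt == "-":
        return "deletion"
    if len(v.ref) == 1 and len(v.alt) == 1:
        return "SNV"
    return "substitution"


lines = [
    "1 948921 948921 T C comments: rs15842, a SNP in 5' UTR of ISG15",
    "1 13211293 13211294 TC - comments: rs59770105, a 2-bp deletion",
    "1 11403596 11403596 - AT comments: rs35561142, a 2-bp insertion",
]
kinds = [variant_kind(parse_avinput_line(line)) for line in lines]
print(kinds)
# prints: ['SNV', 'deletion', 'insertion']
```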

These are the only two formats that ANNOVAR can recognize. When you have SNP calling results in another format, you can use convert2annovar.pl to convert them. For example, to convert files in pileup and VCF formats to the ANNOVAR input format:

convert2annovar.pl -format pileup variant.pileup -outfile variant.query
convert2annovar.pl -format vcf4 variantfile -outfile variant.avinput
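The coordinate change involved in such a conversion can be sketched in a few lines. This is not the logic of convert2annovar.pl itself, only an illustration under simplifying assumptions: one biallelic record at a time, no genotype columns, and indels written VCF-style with a shared leading base. The function name vcf_to_avinput is hypothetical, and the anchor bases in the usage lines are made up so that the results match the avinput examples shown earlier.

```python
from typing import Tuple


def vcf_to_avinput(chrom: str, pos: int, ref: str, alt: str) -> Tuple[str, int, int, str, str]:
    """Convert one simple biallelic VCF record to avinput-style columns."""
    # Count the bases REF and ALT share at their start (the VCF anchor base for indels).
    shared = 0
    while shared < min(len(ref), len(alt)) and ref[shared] == alt[shared]:
        shared += 1
    ref, alt = ref[shared:], alt[shared:]
    if not ref and not alt:
        raise ValueError("REF and ALT are identical")
    pos += shared  # first position after the shared prefix
    if not ref:
        # Insertion: both coordinates point at the base before the inserted sequence.
        return (chrom, pos - 1, pos - 1, "-", alt)
    if not alt:
        # Deletion: the interval covers exactly the deleted bases.
        return (chrom, pos, pos + len(ref) - 1, ref, "-")
    # SNV or block substitution.
    return (chrom, pos, pos + len(ref) - 1, ref, alt)


print(vcf_to_avinput("1", 948921, "T", "C"))        # SNV
print(vcf_to_avinput("1", 13211292, "GTC", "G"))    # 2-bp deletion, made-up anchor base G
print(vcf_to_avinput("1", 11403596, "T", "TAT"))    # 2-bp insertion, made-up anchor base T
```

Real VCF files also contain multi-allelic sites and sample columns, which this sketch ignores; for actual work the conversion script shipped with ANNOVAR is the right tool.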

The functions of ANNOVAR software can be divided into the following four categories

1. Gene-based annotation

Determines the effect of a variant on the protein. A variety of gene sets are supported, including RefSeq, UCSC, ENSEMBL, and GENCODE.

2. Region-based annotation

Determines whether a variant falls within a particular kind of genomic region, such as a transcription factor binding region or a histone modification region.

3. Filter-based annotation

Checks whether a variant is present in a specified database, such as dbSNP, 1000 Genomes, or ESP6500, and reports scores such as SIFT, PolyPhen, LRT, MutationTaster, MutationAssessor, FATHMM, MetaSVM, and MetaLR.

4. Other functionalities

Small utilities, such as extracting sequences from the genome by coordinates.

In practice, the annotation functions are what is mainly used. As described above, ANNOVAR provides three major types of annotation: gene-based, region-based, and filter-based.
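The difference between region-based and filter-based annotation can be illustrated with toy data: the first compares coordinates against intervals, while the second looks for an exact match on position and alleles. The sketch below is only a conceptual illustration and not how ANNOVAR is implemented. The function names, the region labels, and the region coordinates are made up; the one database entry reuses the rs15842 example from the avinput section.

```python
from typing import Dict, List, Optional, Tuple

Region = Tuple[str, int, int, str]            # chrom, start, end, label
VariantKey = Tuple[str, int, int, str, str]   # chrom, start, end, ref, alt


def region_annotation(chrom: str, start: int, end: int,
                      regions: List[Region]) -> List[str]:
    """Return labels of all regions overlapping a 1-based inclusive interval."""
    return [label for (c, s, e, label) in regions
            if c == chrom and s <= end and start <= e]


def filter_annotation(key: VariantKey,
                      known: Dict[VariantKey, str]) -> Optional[str]:
    """Return the database entry for an exact position-and-allele match, if any."""
    return known.get(key)


# Toy stand-ins for real annotation databases (coordinates and labels are invented).
toy_regions = [
    ("1", 948800, 949000, "toy_TF_binding_site"),
    ("1", 2000000, 2000500, "toy_histone_mark"),
]
toy_known = {("1", 948921, 948921, "T", "C"): "rs15842"}

print(region_annotation("1", 948921, 948921, toy_regions))
print(filter_annotation(("1", 948921, 948921, "T", "C"), toy_known))
print(filter_annotation(("1", 948921, 948921, "T", "G"), toy_known))
```

The last call returns None: the position lies inside a toy region, but the allele change T to G is not in the toy database, so only an exact-match lookup tells the two variants apart.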
