A brief introduction to dbSNP Database

2025-02-25 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article gives a brief introduction to the dbSNP database: the two kinds of ID it uses, what the page for a single SNP shows, and which VCF files to download.

dbSNP has been released in many versions (builds); at the time of writing, the latest was build 151. To use the database, you need to understand the following two types of ID:

NCBI Assay ID (ss)

Reference SNP ID (rs)

Each SNP locus submitted to dbSNP is first assigned a unique ss ID. Because different research institutions may submit the same SNP, the submissions are redundant. To resolve this, the sequences upstream and downstream of each SNP site are extracted and aligned to the reference genome. If multiple ss IDs align to the same position, those submissions are redundant and are grouped under a single reference SNP ID, which starts with rs.
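The grouping step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not dbSNP's actual pipeline: the function name `cluster_ss_ids`, the ss IDs, the positions, and the `rs_example_N` labels are all made up, and the alignment of flanking sequences is assumed to have been done already.

```python
from collections import defaultdict


def cluster_ss_ids(submissions):
    """Group ss IDs that map to the same genomic position.

    `submissions` is an iterable of (ss_id, chrom, pos) tuples, where
    (chrom, pos) is assumed to come from aligning the flanking sequence
    to the reference genome (the alignment itself is not shown here).
    """
    clusters = defaultdict(list)
    for ss_id, chrom, pos in submissions:
        clusters[(chrom, pos)].append(ss_id)
    return dict(clusters)


# Made-up submissions, for illustration only.
example = [
    ("ss1001", "chr1", 12345),
    ("ss1002", "chr1", 12345),  # same position as ss1001, so redundant
    ("ss1003", "chr2", 67890),
]

clusters = cluster_ss_ids(example)

# Give each cluster a placeholder label; real rs numbers are issued by NCBI.
rs_assignments = {
    f"rs_example_{i}": ss_ids
    for i, ss_ids in enumerate(clusters.values(), start=1)
}

for label, ss_ids in rs_assignments.items():
    print(label, ss_ids)
```

Here ss1001 and ss1002 end up under one label because they share a position, while ss1003 gets its own.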

For each rs ID, the summary page records the corresponding species, genotypes, allele frequencies, location, literature, and other related information. Taking rs1425711270 as an example, the link is as follows:

https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=1425711270
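If you need links for many rs IDs, a small helper can build them. The sketch below assumes the URL pattern of the example link above holds for other rs numbers; the function name `rs_url` is just a name chosen for this example.

```python
import re

# URL pattern copied from the example link in the text.
SNP_REF_URL = "https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs={number}"


def rs_url(rs_id):
    """Return the snp_ref.cgi link for an rs ID such as 'rs1425711270'."""
    match = re.fullmatch(r"rs(\d+)", rs_id.strip().lower())
    if match is None:
        raise ValueError(f"not a valid rs ID: {rs_id!r}")
    return SNP_REF_URL.format(number=match.group(1))


print(rs_url("rs1425711270"))
# prints https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=1425711270
```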

The top of the page gives summary information: RefSNP shows the species, the dbSNP build number, and so on; Allele shows the variant type and the base change; and HGVS Names gives the variant described according to the HGVS naming rules.

The remaining information is divided into several modules, each covering different content. The main modules are:

1. Map

This part gives the position of the SNP locus on different versions of the genome. Note that the position on hg19 can differ considerably from the position on hg38.

2. Fasta

This part gives the sequence of the SNP site.

3. ss ID

In this part, you can see the multiple ss IDs that correspond to the rs number.

4. GeneView

This section gives the chromosome and gene that the SNP falls in, as well as its effects on transcripts and proteins.

In practice, we often use the VCF files from dbSNP. Taking human as an example, the download address is:

ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/

Two sets are provided, common and All. All contains all the SNP sites, while common contains only germline variants whose MAF is greater than 0.01. Usually you download All.vcf.gz. When downloading, also fetch the corresponding md5 and tbi files. The md5 file is used to check that the downloaded file is complete: if the MD5 checksum of the vcf.gz does not match the one in the .md5 file, the download is incomplete or corrupted. The tbi file is the index of the VCF file, which makes it convenient for GATK and other programs to read.
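The MD5 check can be done with the `md5sum` command, or with a short Python script like the sketch below. It assumes the first whitespace-separated token in the .md5 file is the hex digest, as in `md5sum` output; if your .md5 file is laid out differently, adjust the parsing. The function names are just names chosen for this example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Return True if the file's MD5 equals the digest stored in the .md5 file.

    Assumption: the .md5 file starts with the hex digest, optionally
    followed by the file name (md5sum style).
    """
    expected = Path(md5_path).read_text().split()[0].lower()
    return md5_of_file(data_path) == expected
```

For example, `matches_md5_file("All.vcf.gz", "All.vcf.gz.md5")` would return False for an incomplete download; the file names here are placeholders for whatever you actually downloaded. Reading in chunks keeps memory use low, which matters because the All file is large.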
