2025-01-19 Update. From: SLTechnology News&Howtos
How bcftools csq analyzes the impact of gene mutations at the protein level
Many newcomers are unclear about how bcftools csq analyzes the impact of gene mutations at the protein level. This article explains it in detail for anyone who needs it.
The csq command not only analyzes the location of the SNP site on the genome, but also predicts the effect of gene mutations on the encoded proteins.
Unlike other tools that predict the effect of each mutation on the protein independently, bcftools csq groups mutation sites by region (similar to the concept of haplotype blocks). When analyzing protein changes, all mutation sites in the same region are taken into account together, as shown below.
In figure A, the region contains two SNP sites. If each site is considered separately, only an amino acid substitution is predicted: arginine to tryptophan, or arginine to glutamine. When the two SNP sites are considered together, the codon becomes a stop codon and the protein length changes.
In figure B, the region contains two indel sites. When each indel is considered separately, a frameshift mutation is predicted and the protein length changes. When the two indels are considered together, the predicted amino acid changes are quite different from those obtained by analyzing either site alone.
In figure C, two SNP sites fall in the same codon on either side of a splice site. When each SNP is considered separately, the predicted substitution is aspartic acid to asparagine, or aspartic acid to glutamic acid. When the two sites are considered together, the substitution is aspartic acid to lysine.
The figures show that considering the effect of each mutation site on the protein separately gives biased results. The predicted protein change is more reliable only when all mutation sites in the adjacent range are considered together.
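The difference between per-site and joint prediction can be reproduced with a few lines of Python. This is only an illustrative sketch, not how bcftools csq is implemented. The codons CGG and GAT are assumptions chosen because they reproduce the amino acid changes described for figures A and C; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def apply_snps(codon, snps):
    """Return the codon after applying SNPs given as (0-based position, alt base)."""
    bases = list(codon)
    for pos, alt in snps:
        bases[pos] = alt
    return "".join(bases)


def translate(codon):
    """Translate a single codon to its one-letter amino acid code."""
    return CODON_TABLE[codon]


# Figure A (assumed codon CGG = arginine): two SNPs in one codon.
ref_a = "CGG"
print(translate(apply_snps(ref_a, [(0, "T")])))            # W (tryptophan)
print(translate(apply_snps(ref_a, [(1, "A")])))            # Q (glutamine)
print(translate(apply_snps(ref_a, [(0, "T"), (1, "A")])))  # * (stop codon)

# Figure C (assumed codon GAT = aspartic acid): SNPs at the first and third base.
ref_c = "GAT"
print(translate(apply_snps(ref_c, [(0, "A")])))            # N (asparagine)
print(translate(apply_snps(ref_c, [(2, "A")])))            # E (glutamic acid)
print(translate(apply_snps(ref_c, [(0, "A"), (2, "A")])))  # K (lysine)
```

In both cases the joint result cannot be derived from the two single-site predictions, which is why the sites have to be evaluated together.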
The csq command is run as follows:
bcftools csq -f csq.fa -g csq.gff3 csq.vcf > csq.out
The -f parameter specifies the FASTA file of the reference genome, the -g parameter specifies the GFF3 annotation file for the reference genome, csq.vcf is the input VCF file, and csq.out is the output file.
The output file is also in VCF format. A BCSQ field is added to the INFO column to describe where the mutation site falls on the genome and how the protein sequence changes, as in the following example:
BCSQ=synonymous|XYZ|ENST00000000001|protein_coding|+|1Y|102C>T
The BCSQ value consists of multiple fields separated by the | character. The fields are as follows:
Consequence type: the type of effect the mutation has on the protein, such as synonymous, missense, or inframe_deletion.
Gene: the gene name.
Transcript: the transcript name.
Biotype: the biotype of the transcript, such as protein_coding.
Strand: the coding strand, + for forward or - for reverse.
Amino acid position: the position of the affected amino acid.
Variants list: the set of mutation sites considered together when predicting the amino acid change.
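To work with these fields programmatically, the BCSQ value can be split on the | character. The following is a minimal sketch that follows the seven-field layout of the example above. The function and key names are hypothetical, and treating a comma as the separator between multiple consequences on one record is an assumption; entries that do not have seven fields are skipped rather than interpreted.

```python
# Field names follow the layout described above; the key names are illustrative.
BCSQ_FIELDS = [
    "consequence",
    "gene",
    "transcript",
    "biotype",
    "strand",
    "amino_acid_change",
    "dna_change",
]


def parse_bcsq(value):
    """Split a BCSQ value into one dict per consequence entry."""
    records = []
    for entry in value.split(","):  # assumption: comma-separated entries
        parts = entry.split("|")
        if len(parts) != len(BCSQ_FIELDS):
            continue  # skip entries that do not match the seven-field layout
        records.append(dict(zip(BCSQ_FIELDS, parts)))
    return records


def extract_bcsq(info_column):
    """Find the BCSQ tag in a VCF INFO column and parse it."""
    for item in info_column.split(";"):
        if item.startswith("BCSQ="):
            return parse_bcsq(item[len("BCSQ="):])
    return []


info = "DP=10;BCSQ=synonymous|XYZ|ENST00000000001|protein_coding|+|1Y|102C>T"
for record in extract_bcsq(info):
    print(record["consequence"], record["gene"], record["dna_change"])
```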
Because bcftools csq considers the joint effect of multiple mutation sites on the protein, a single false positive site can change the predicted consequence of its neighbors. In practice, filter out false positive mutation sites as thoroughly as possible before analyzing the protein-level impact, so that the results are more reliable.
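One simple way to pre-filter a VCF is to keep only records that passed the caller's filters and meet a minimum quality. The sketch below is an illustration under assumptions: the function name, the QUAL threshold of 30, and the choice to keep records whose FILTER is "." are all hypothetical and should be adapted to the actual variant-calling pipeline.

```python
def keep_confident_records(vcf_lines, min_qual=30.0):
    """Keep header lines and records with FILTER PASS (or '.') and QUAL >= min_qual."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # header lines are always kept
            continue
        cols = line.rstrip("\n").split("\t")
        qual, filt = cols[5], cols[6]  # VCF columns: QUAL is 6th, FILTER is 7th
        if filt not in ("PASS", "."):
            continue  # failed at least one caller filter
        if qual != "." and float(qual) < min_qual:
            continue  # below the (assumed) quality threshold
        kept.append(line)
    return kept


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\t.\tC\tT\t50\tPASS\t.\n",
    "1\t101\t.\tG\tA\t10\tPASS\t.\n",
    "1\t102\t.\tA\tG\t60\tLowQual\t.\n",
]
for line in keep_confident_records(example):
    print(line, end="")
```

The filtered lines can then be written to a new VCF file and passed to the csq command shown earlier.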