2025-01-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
How to Use ASprofile to Analyze Alternative Splicing Events
Many newcomers are unsure how to use ASprofile to analyze alternative splicing events. This article walks through the process step by step.
ASprofile is a program for identifying alternative splicing events. It directly compares the multiple transcripts of the same gene to find the events that distinguish them. Installation is simple: download the package and decompress it. The basic usage is as follows:
extract-as transcript.gtf ref.fa.hdrs > as_events.txt
extract-as takes two arguments. The first is the GTF file describing the transcripts. In a real analysis, transcripts are first assembled from the sequencing data with Cufflinks or StringTie, the assembled transcripts are then merged with the known transcripts to remove redundancy, and the merged non-redundant transcript set is used as input. The second is a genome length statistics file with the suffix hdrs, which looks like this:
>chr1 /len=249250621 /nonNlen=225280621 /org=Homo_Sapiens (hg19)
>chr2 /len=243199373 /nonNlen=238204518 /org=Homo_Sapiens (hg19)
>chr3 /len=198022430 /nonNlen=194797135 /org=Homo_Sapiens (hg19)
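If no hdrs file is available for your genome, one could be derived from the genome FASTA. The following is a minimal Python sketch, not part of ASprofile: the function name fasta_to_hdrs and its org parameter are hypothetical, and the exact spacing of the fields is an assumption that should be checked against a file the tool is known to accept.

```python
import io

def fasta_to_hdrs(fasta_handle, org="unknown"):
    """Yield one hdrs-style line per sequence in a FASTA stream.

    Hypothetical helper: the field layout (/len, /nonNlen, /org) follows
    the example in the article; the spacing is an assumption.
    """
    name, total, non_n = None, 0, 0
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one first.
            if name is not None:
                yield f">{name} /len={total} /nonNlen={non_n} /org={org}"
            name = line[1:].split()[0]  # keep only the sequence ID
            total, non_n = 0, 0
        elif name is not None:
            total += len(line)
            # Count every base that is not N (soft-masked 'n' included).
            non_n += len(line) - line.upper().count("N")
    if name is not None:
        yield f">{name} /len={total} /nonNlen={non_n} /org={org}"

# Tiny in-memory FASTA used only for demonstration.
demo = io.StringIO(">chr1\nACGTNNNN\nACGT\n>chr2\nNNAC\n")
hdrs_lines = list(fasta_to_hdrs(demo, org="demo"))
for row in hdrs_lines:
    print(row)
```

For a real genome, pass an open file handle instead of the in-memory demo and write the yielded lines to a file ending in .hdrs.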
Each row of the hdrs file represents one chromosome and gives its total length, its length after removing N bases, and the species information. The output file lists the detailed results for each alternative splicing event. ASprofile defines the alternative splicing types as follows.
1. Exon skipping
The two transcripts involved in an exon skipping event, before and after the skip, are labeled on and off respectively. An X prefix indicates that the exon boundaries do not match exactly and differ by a few bp.
The skipping of a single exon is called exon skipping and is represented by SKIP.
The skipping of multiple exons is called cassette exons and is represented by MSKIP.
2. Intron retention
The two transcripts involved in an intron retention event, before and after the retention, are labeled off and on respectively. An X prefix indicates that the exon boundaries do not match exactly and differ by a few bp.
The retention of a single intron is called retention of single intron and is represented by IR.
The retention of multiple introns is called retention of multiple introns and is represented by MIR.
3. Alternative exon
The replacement of one exon by another is called alternative exon and is represented by AE. It covers several cases:
The 5' end of the exon stays the same while the 3' end changes.
The 3' end of the exon stays the same while the 5' end changes.
Both the 3' and 5' ends of the exon change.
4. Alternative transcription start site
The use of a different transcription start site is called alternative transcript start and is represented by TSS.
5. Alternative transcription termination site
The use of a different transcription termination site is called alternative transcript termination and is represented by TTS. It is similar to TSS, except that it is the position of the 3' end that changes.
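The naming scheme described above (a base type, an optional X prefix, and an on/off label) can be decoded mechanically. The sketch below is an illustration only: it assumes the labels appear as strings such as XSKIP_ON or MIR_OFF, with the on/off label attached as an underscore suffix, which should be verified against your own output file. The names parse_event_code and EventCode are hypothetical.

```python
from typing import NamedTuple, Optional

# Base event types listed in the article.
BASE_TYPES = {"SKIP", "MSKIP", "IR", "MIR", "AE", "TSS", "TTS"}

class EventCode(NamedTuple):
    base: str             # e.g. "SKIP"
    state: Optional[str]  # "ON", "OFF", or None when no label is present
    exact: bool           # False when the X prefix marks inexact boundaries

def parse_event_code(code: str) -> EventCode:
    """Split a label such as 'XSKIP_ON' into its parts (assumed format)."""
    label = code.strip().upper()
    state = None
    for suffix in ("_ON", "_OFF"):
        if label.endswith(suffix):
            state = suffix[1:]
            label = label[: -len(suffix)]
            break
    exact = True
    # Treat a leading X as the inexact-boundary prefix only when the
    # remainder is a known base type.
    if label not in BASE_TYPES and label.startswith("X") and label[1:] in BASE_TYPES:
        exact = False
        label = label[1:]
    if label not in BASE_TYPES:
        raise ValueError(f"unrecognized event code: {code!r}")
    return EventCode(label, state, exact)

for example in ("XSKIP_ON", "MIR_OFF", "TSS", "XAE"):
    print(example, parse_event_code(example))
```

Separating the base type from the prefix and label makes it easier to group exact and inexact variants of the same event type when counting.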
The alternative splicing events in the file above are reported per transcript: each line corresponds to one transcript, so the same event can appear more than once. This makes the file hard to read when you want to know which types of alternative splicing occur on a gene and how many of each. The official summarize_as.pl script produces a non-redundant list of events and a per-gene summary:
perl summarize_as.pl transcript.gtf as_events.txt -p prefix
The script generates two files. The file with the suffix nr contains the non-redundant alternative splicing events, and the file with the suffix summary contains the counts of each event type for every gene.
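To make the idea of collapsing redundant events and counting types per gene concrete, here is a minimal Python sketch. It is not the logic of summarize_as.pl. It assumes a tab-separated events file whose first six columns are event ID, event type, gene ID, chromosome, start, and end; this column layout, the event labels, and the function name summarize_events are assumptions made for illustration.

```python
import csv
import io
from collections import Counter, defaultdict

def summarize_events(handle):
    """Count distinct events per gene and event type.

    Assumed columns (tab-separated): event_id, event_type, gene_id,
    chrom, start, end, followed by any further fields.
    """
    seen = set()
    per_gene = defaultdict(Counter)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip short lines, comments, and a possible header row.
        if len(row) < 6 or row[0].startswith("#") or row[0] == "event_id":
            continue
        _event_id, event_type, gene_id, chrom, start, end = row[:6]
        key = (event_type, gene_id, chrom, start, end)
        if key in seen:
            continue  # same event reported for another transcript
        seen.add(key)
        per_gene[gene_id][event_type] += 1
    return per_gene

# Made-up rows used only for demonstration.
demo = io.StringIO(
    "1\tSKIP_ON\tgeneA\tchr1\t100\t200\ttxA1\n"
    "2\tSKIP_ON\tgeneA\tchr1\t100\t200\ttxA2\n"
    "3\tIR_ON\tgeneA\tchr1\t300\t400\ttxA1\n"
    "4\tTSS\tgeneB\tchr2\t50\t60\ttxB1\n"
)
summary = summarize_events(demo)
for gene, counts in sorted(summary.items()):
    print(gene, dict(counts))
```

In the demo, the two SKIP_ON rows for geneA share the same coordinates and are counted once, which is the kind of redundancy the nr file is meant to remove.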
In short, ASprofile directly compares the multiple transcripts of the same gene to identify the different alternative splicing events. It also provides a quantification function that calculates FPKM values. The collect_fpkm.pl script summarizes the alternative splicing results of multiple samples:
perl collect_fpkm.pl sampleA.AS,sampleB.AS -s txt
Multiple samples are joined by commas, and -s specifies the file suffix; each input file is located by its sample name plus that suffix. The script reports the proportion of each alternative splicing event in each sample, which can then be used for differential analysis. For more usage details, refer to the official instructions and the help documentation of each script.
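As a simple illustration of comparing these proportions between two samples, the Python sketch below ranks events by how much their proportion changes. The input dictionaries, the event IDs, and the function name rank_by_difference are hypothetical; parsing the actual collect_fpkm.pl output is left out because its layout is not described here, and a real differential analysis would also need replicates and a statistical test.

```python
def rank_by_difference(props_a, props_b):
    """Rank events present in both samples by absolute change in proportion.

    props_a and props_b map an event ID to its proportion in one sample.
    Returns (event_id, proportion_b - proportion_a) pairs, largest change first.
    """
    shared = props_a.keys() & props_b.keys()
    deltas = {event: props_b[event] - props_a[event] for event in shared}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)

# Made-up proportions used only for demonstration.
sample_a = {"ev1": 0.25, "ev2": 0.5, "ev3": 0.75}
sample_b = {"ev1": 0.75, "ev2": 0.5, "ev4": 0.125}

ranking = rank_by_difference(sample_a, sample_b)
for event, delta in ranking:
    print(event, delta)
```

Events found in only one sample (ev3 and ev4 here) are dropped by this sketch; whether to treat them as a proportion of zero instead is a design choice that depends on the analysis.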