2025-04-05 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/02 Report--
This article explains how to use ballgown for transcript-level differential expression analysis.
There are two common strategies for differential expression analysis of transcriptome data. One quantifies expression as raw counts, as in DESeq2 and edgeR; the other quantifies expression as FPKM/RPKM, as in cuffdiff.
As mentioned in previous articles, the FPKM-based pipeline has been upgraded from tophat + cufflinks + cuffdiff to hisat + StringTie + ballgown. The ballgown R package also tests for differential expression on FPKM values. There are two ways to obtain FPKM values at the transcript level.
1. StringTie
To simplify downstream analysis, StringTie can write the ballgown input files directly when the -b parameter is added. The basic usage is as follows
stringtie -p 10 \
    -G hg19.gtf \
    -o output.gtf \
    -b ballgown_out_dir \
    -e \
    align.sorted.bam
2. Tablemaker
Tablemaker can also generate input files for ballgown; it does so by calling cufflinks. It can be downloaded from the following link
https://figshare.com/articles/Tablemaker_Linux_Binary/1053137
The basic usage is as follows
tablemaker -p 4 \
    -q -W \
    -G hg19.gtf \
    -o out_dir \
    align.sorted.bam
For each sample, a folder is generated containing the following five files
e_data.ctab
e2t.ctab
i2t.ctab
i_data.ctab
t_data.ctab
Here e stands for exon, i for intron, and t for transcript. The *_data files hold the expression values at each level. i2t.ctab records which introns belong to which transcripts, and e2t.ctab records which exons belong to which transcripts.
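Before importing, it can help to confirm that every sample folder actually contains all five files. The following is a minimal base R sketch; the helper name missing_ctab and the temporary demo folder are hypothetical and exist only for illustration.

```r
# The five files ballgown expects in each sample folder (from the list above).
expected_ctab <- c("e_data.ctab", "e2t.ctab", "i2t.ctab",
                   "i_data.ctab", "t_data.ctab")

# Hypothetical helper: return the expected files that are absent from a folder.
missing_ctab <- function(sample_dir) {
  setdiff(expected_ctab, list.files(sample_dir))
}

# Demo on a throwaway folder that deliberately lacks t_data.ctab.
demo_dir <- file.path(tempdir(), "sampleA_ctab_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, expected_ctab[1:4])))

missing_files <- missing_ctab(demo_dir)
print(missing_files)
```

For real data, the same helper could be applied to each sample folder, for example with lapply(c("sampleA.dir", "sampleB.dir"), missing_ctab).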
Once the input files are ready, you can run the differential analysis. Modern R packages are highly encapsulated, so a few function calls complete the whole analysis. The first step is to read the input files for all samples, as follows
library(ballgown)
bg = ballgown(samples = c("sampleA.dir", "sampleB.dir"), meas = 'all')
The samples argument specifies the ballgown input folders for all samples. After a successful import, the *expr functions return the expression values at each level, where * is one of i, e, t, or g (intron, exon, transcript, or gene).
An example of code to view expression at the transcript level is as follows
transcript_fpkm = texpr(bg, 'FPKM')
Note that intron, exon, and transcript expression levels are read directly from the original ctab files, whereas gene expression has to be calculated from the expression of each gene's transcripts, so retrieving it takes longer.
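The four accessors can be tried without any data of your own. The sketch below assumes the ballgown package is installed and uses the small example dataset bundled with it (loaded through the dataDir and samplePattern arguments rather than samples); the variable names are illustrative.

```r
library(ballgown)

# Example data shipped with the package; not the samples from this article.
data_directory <- system.file("extdata", package = "ballgown")
bg <- ballgown(dataDir = data_directory, samplePattern = "sample", meas = "all")

# Transcript, exon, and intron values come straight from the ctab files.
transcript_fpkm <- texpr(bg, "FPKM")
exon_counts     <- eexpr(bg, "rcount")
intron_counts   <- iexpr(bg, "rcount")

# Gene-level FPKM is derived from the transcripts, so this call is slower.
gene_fpkm <- gexpr(bg)

# Each result is a matrix with one column per sample.
print(dim(transcript_fpkm))
print(dim(gene_fpkm))
```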
After reading the data, you need to set the sample grouping as follows
pData(bg)
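The grouping is stored as a data frame with one row per sample, in the same order as sampleNames(bg). The sketch below is an assumption-laden illustration rather than the original author's code: it uses the example data bundled with ballgown, an arbitrary two-group assignment, and a column named group chosen here for illustration. It then runs stattest, which is the package's function for the differential test, on transcript-level FPKM.

```r
library(ballgown)

# Example data shipped with the package; replace with your own sample folders.
data_directory <- system.file("extdata", package = "ballgown")
bg <- ballgown(dataDir = data_directory, samplePattern = "sample", meas = "all")

# One row per sample, in the same order as sampleNames(bg).
# The alternating 1/0 grouping is arbitrary and only for demonstration.
n_samples <- length(sampleNames(bg))
pData(bg) <- data.frame(id = sampleNames(bg),
                        group = rep(c(1, 0), length.out = n_samples))

# Test transcripts for differential expression between the two groups.
results <- stattest(bg, feature = "transcript", meas = "FPKM",
                    covariate = "group")
print(head(results))
```

The returned data frame has one row per transcript with its p-value and q-value, which can then be sorted or filtered.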