
How to use rMATS to analyze alternative splicing

Updated 2025-02-24. Source: SLTechnology News&Howtos (Shulou)


This article explains how to use rMATS for alternative splicing analysis. It covers basic usage from BAM or FASTQ input, how splicing events are quantified, how differences between groups are tested, and how to read the output files.

rMATS is among the most widely used tools for alternative splicing analysis. It not only identifies alternative splicing events but also quantifies them and tests for differences between groups. The software has gone through several versions; at the time of writing the latest is v4.0.2. Compared with earlier releases, versions from v4.0 onward have been optimized for running speed, memory consumption, and disk footprint. The most noticeable gain is in speed, which is reported to be on the order of 100 times faster than before.

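Installation is simple: download the package, decompress it, and it is ready to use, so the details are not repeated here. rMATS recognizes the following five types of alternative splicing events: skipped exon (SE), alternative 5' splice site (A5SS), alternative 3' splice site (A3SS), mutually exclusive exons (MXE), and retained intron (RI).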

The basic usage of the software is as follows:

    python rmats.py \
        --b1 b1.txt --b2 b2.txt \
        --gtf ref.transcript.gtf \
        --od out_dir \
        -t paired \
        --readLength 101 \
        --cstat 0.1 \
        --libType fr-unstranded

b1.txt holds the paths of the BAM files for the samples in the first group, that is, each sample's alignment to the reference genome, separated by commas on a single line (b2.txt does the same for the second group):

    /bams/rep1.bam,/bams/rep2.bam
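The sample list file and the command above can be assembled from a script. The following is a minimal sketch, not part of rMATS itself: the helper names `write_sample_list` and `build_rmats_command` are hypothetical, and the flags are simply the ones shown in the usage example above. The command is only built as an argument list and is not executed here.

```python
from pathlib import Path


def write_sample_list(bam_paths, list_file):
    """Write BAM paths as one comma-separated line, the layout shown for b1.txt."""
    Path(list_file).write_text(",".join(bam_paths) + "\n")


def build_rmats_command(b1, b2, gtf, out_dir, read_length, cstat=0.1):
    """Return the argument list for the BAM-based usage shown above.

    Hypothetical helper: it only mirrors the flags from the example command.
    """
    return [
        "python", "rmats.py",
        "--b1", b1, "--b2", b2,
        "--gtf", gtf,
        "--od", out_dir,
        "-t", "paired",
        "--readLength", str(read_length),
        "--cstat", str(cstat),
        "--libType", "fr-unstranded",
    ]


if __name__ == "__main__":
    write_sample_list(["/bams/rep1.bam", "/bams/rep2.bam"], "b1.txt")
    print(" ".join(build_rmats_command(
        "b1.txt", "b2.txt", "ref.transcript.gtf", "out_dir", 101)))
```

Passing the resulting list to `subprocess.run` would launch the analysis; keeping construction and execution separate makes the command easy to inspect first.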

Starting from BAM files is the more practical approach. Starting from FASTQ files is also supported, with the following usage:

    python rmats.py \
        --s1 s1.txt --s2 s2.txt \
        --gtf ref.transcript.gtf \
        --bi /STARindex/hg19 \
        --od out_dir \
        -t paired \
        --nthread 6 \
        --readLength 151

s1.txt holds the paths of the FASTQ files for each sample. rMATS automatically calls STAR to align the reads, and the --bi parameter specifies the STAR index of the reference genome. For more parameters and details, refer to the official documentation.

The core functions of rMATS are quantification and differential analysis, which are explained below.

1. Quantitative analysis

rMATS uses the exon inclusion level to quantify an alternative splicing event in a sample. Taking exon skipping as an example, the isoform that retains the exon is called the exon inclusion isoform, and the transcript in which the exon is skipped is called the exon skipping isoform.

If the reads on the inclusion isoform are denoted I and the reads on the skipping isoform are denoted S, and l_I and l_S are the effective lengths of the two isoforms, the inclusion level of the exon skipping event is

    psi = (I / l_I) / (I / l_I + S / l_S)

The exon inclusion level is thus the proportion of the inclusion isoform, with the raw read counts normalized by isoform length. Other types of alternative splicing events can be decomposed into the same two kinds of isoform.

rMATS provides two ways to calculate the effective isoform length. The difference between them is whether the skipped exon itself is taken into account: JC uses only reads that span splice junctions, while JCEC also counts reads that fall within the target exon.
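The inclusion level calculation described in this section can be sketched in a few lines. This is an illustration only: the function name `inclusion_level` is hypothetical, and the read counts and effective lengths in the usage line are made-up numbers rather than rMATS output.

```python
def inclusion_level(inc_reads, skip_reads, inc_len, skip_len):
    """Length-normalized proportion of the inclusion isoform.

    inc_reads / skip_reads: read counts for the inclusion / skipping isoform.
    inc_len / skip_len: effective lengths of the two isoforms.
    Returns None when neither isoform has any reads.
    """
    inc_density = inc_reads / inc_len
    skip_density = skip_reads / skip_len
    total = inc_density + skip_density
    if total == 0:
        return None
    return inc_density / total


if __name__ == "__main__":
    # Made-up counts: twice the reads on an isoform that is twice as long
    # gives equal densities, so the two isoforms are equally abundant.
    print(inclusion_level(inc_reads=20, skip_reads=10, inc_len=200, skip_len=100))
```

Dividing by length first matters because the inclusion isoform usually offers more positions for a read to land on, so raw counts alone would overstate it.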

2. Differential analysis

In the differential analysis, rMATS compares the inclusion level between the two groups of samples against a user-defined threshold c, testing whether the absolute difference in inclusion level between the groups, |psi_1 - psi_2|, exceeds c.

The threshold c is set with the --cstat parameter and takes a value between 0 and 1. It represents the difference in inclusion level between the two groups: 0.1 means testing whether the inclusion level of the event differs by more than 10% between them. The actual calculation is considerably more involved. It takes into account the distribution of the data and the corresponding statistical model, and finally reports a p value for each alternative splicing event together with an FDR value after multiple testing correction.
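The role of the threshold can be illustrated with a small sketch. It only compares the difference in mean inclusion level against c and deliberately leaves out the statistical model, so it is not a substitute for the p values and FDR that rMATS reports. The function names and the replicate values are hypothetical.

```python
from statistics import mean


def inc_level_difference(group1_levels, group2_levels):
    """Mean inclusion level of group 1 minus mean inclusion level of group 2."""
    return mean(group1_levels) - mean(group2_levels)


def exceeds_cutoff(group1_levels, group2_levels, cstat=0.1):
    """True if the absolute difference in mean inclusion level is above cstat.

    This checks effect size only; it performs no statistical test.
    """
    return abs(inc_level_difference(group1_levels, group2_levels)) > cstat


if __name__ == "__main__":
    # Made-up per-replicate inclusion levels for two groups.
    print(exceeds_cutoff([0.80, 0.90], [0.50, 0.60], cstat=0.1))
    print(exceeds_cutoff([0.50, 0.52], [0.50, 0.50], cstat=0.1))
```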

The output directory contains many files; the two kinds to focus on are the following.

AS_Event.MATS.JC.txt

AS_Event.MATS.JCEC.txt

AS_Event here stands for one of the five types of alternative splicing events, each of which is written to its own file, while JC and JCEC correspond to the two ways of calculating the effective isoform length. Since neither method is strictly better than the other, results for both are provided. These files contain both the quantification and the differential analysis results.
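With five event types and two counting methods, there are ten result files of this form. A minimal sketch for listing them, assuming the event abbreviations SE, A5SS, A3SS, MXE and RI are used as the AS_Event prefix; the function name `expected_result_files` is hypothetical.

```python
from pathlib import Path

EVENT_TYPES = ["SE", "A5SS", "A3SS", "MXE", "RI"]
COUNT_METHODS = ["JC", "JCEC"]


def expected_result_files(out_dir):
    """Return the AS_Event.MATS.{JC,JCEC}.txt paths for every event type."""
    return [
        Path(out_dir) / f"{event}.MATS.{method}.txt"
        for event in EVENT_TYPES
        for method in COUNT_METHODS
    ]


if __name__ == "__main__":
    for path in expected_result_files("out_dir"):
        # Report which of the expected files are actually present.
        status = "found" if path.exists() else "missing"
        print(f"{path}\t{status}")
```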

In these files, IJC stands for inclusion isoform counts and SJC for skipping isoform counts, with biological replicates separated by commas. IncFormLen is the effective length of the inclusion isoform and SkipFormLen is the effective length of the skipping isoform. IncLevel holds the quantification results, and IncLevelDifference is the difference in inclusion level between the two groups of samples. The results can be filtered by PValue and FDR.
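Filtering on these columns can be done with the standard library. The sketch below assumes the result file is tab-separated with a header row containing the FDR and IncLevelDifference columns described above. The thresholds, the function names, and the three example rows are made up for illustration, and the example table is reduced to a handful of columns.

```python
import csv
import io


def parse_levels(field):
    """Split a comma-separated replicate field, skipping NA entries."""
    return [float(value) for value in field.split(",") if value not in ("NA", "")]


def significant_events(handle, max_fdr=0.05, min_abs_diff=0.1):
    """Return rows with FDR <= max_fdr and |IncLevelDifference| >= min_abs_diff."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["FDR"] == "NA" or row["IncLevelDifference"] == "NA":
            continue  # event could not be tested
        if (float(row["FDR"]) <= max_fdr
                and abs(float(row["IncLevelDifference"])) >= min_abs_diff):
            hits.append(row)
    return hits


# Made-up rows with a reduced set of columns, for illustration only.
EXAMPLE_TSV = (
    "GeneID\tIncLevel1\tIncLevel2\tPValue\tFDR\tIncLevelDifference\n"
    "geneA\t0.80,0.85\t0.40,0.45\t0.0001\t0.002\t0.4\n"
    "geneB\t0.50,0.52\t0.49,0.50\t0.6\t0.9\t0.015\n"
    "geneC\tNA,NA\t0.30,0.35\tNA\tNA\tNA\n"
)

if __name__ == "__main__":
    for row in significant_events(io.StringIO(EXAMPLE_TSV)):
        print(row["GeneID"], parse_levels(row["IncLevel1"]), row["FDR"])
```

For a real run, replace the `io.StringIO` object with an open file handle on one of the result files.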

In addition to the quantification and differential results, each file also gives the coordinates of the exons involved in each alternative splicing event.

For exon skipping events, these are the coordinates of the skipped exon and of its upstream and downstream exons. The column headers differ for other event types, but their meaning is the same.

rMATS identifies alternative splicing at the level of exons: comparing the expression of three or four neighboring exons is enough to decide whether an alternative splicing event has occurred. This approach goes straight to the core of alternative splicing, namely changes in exon usage, and is direct and effective. However, because it abstracts and simplifies the problem so heavily, the results are not very intuitive to read.

That covers how to use rMATS for alternative splicing analysis, from running the software to interpreting its quantification and differential results.
