2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to quantify aligned reads with featureCounts. The tool is very practical in day-to-day work, so the main points are collected here.
featureCounts is distributed as part of the Subread package, much as Word is part of Office. Subread also has a corresponding R package, Rsubread.
featureCounts requires two kinds of input files:
BAM/SAM files produced by read alignment
An annotation file describing genomic intervals
For the annotation file, the following two formats are supported:
GTF format
SAF format
The GTF format was described in detail in the previous article, so here we look at the SAF format. An example is as follows:
GeneID  Chr   Start    End      Strand
497097  chr1  3204563  3207049  -
497097  chr1  3411783  3411982  -
497097  chr1  3660633  3661579  -
It is a tab-separated file with five columns. Each row records the gene an interval belongs to, its chromosome, its start and end positions, and its strand (+ or -).
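As a rough illustration of this layout, the following Python sketch parses SAF text into records. The names `SafRecord` and `parse_saf` are made up for this example and are not part of featureCounts; only the five column names come from the format described above.

```python
import csv
import io
from typing import NamedTuple


class SafRecord(NamedTuple):
    """One interval from an SAF file (hypothetical helper type)."""
    gene_id: str
    chrom: str
    start: int
    end: int
    strand: str


def parse_saf(text: str) -> list[SafRecord]:
    """Parse tab-separated SAF text (with a header line) into records."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    if len(header) != 5:
        raise ValueError(f"expected 5 columns, got {len(header)}")
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_id, chrom, start, end, strand = row
        records.append(SafRecord(gene_id, chrom, int(start), int(end), strand))
    return records


# The same three intervals as in the example above.
SAF_TEXT = (
    "GeneID\tChr\tStart\tEnd\tStrand\n"
    "497097\tchr1\t3204563\t3207049\t-\n"
    "497097\tchr1\t3411783\t3411982\t-\n"
    "497097\tchr1\t3660633\t3661579\t-\n"
)

records = parse_saf(SAF_TEXT)
for rec in records:
    print(rec.gene_id, rec.chrom, rec.start, rec.end, rec.strand)
```

Converting the coordinates to integers early makes it easy to catch malformed rows before they reach the counting step.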
In featureCounts software, there are two core concepts:
Feature
Meta-feature
A feature is the smallest annotated unit on the genome, such as an exon, while a meta-feature is a set of features, such as all the exons that belong to the same gene.
featureCounts can quantify at the level of individual features (exon-level counts) or at the level of meta-features (gene-level counts).
When a read overlaps two or more features, featureCounts by default leaves that read out of the counts. To count such reads, add the -O parameter. In that case, a read that overlaps multiple features adds 1 to the count of each of them. At the meta-feature level, if the overlapped features all belong to the same meta-feature (for example, a read overlaps several exons, but those exons belong to the same gene), the read is counted only once for that gene.
In short, for both features and meta-features, a read is counted more than once only when it overlaps several different counting units.
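The counting rules above can be sketched with a toy model in Python. This is not how featureCounts is implemented; it is a simplified illustration that treats any shared base as an overlap, and the annotation, read coordinates, and the names `overlapping` and `count_reads` are all invented for the example.

```python
from collections import Counter

# Hypothetical annotation: (feature_id, gene_id, start, end), 1-based inclusive.
FEATURES = [
    ("geneA_exon1", "geneA", 100, 200),
    ("geneA_exon2", "geneA", 180, 300),
    ("geneB_exon1", "geneB", 290, 400),
]


def overlapping(read_start, read_end):
    """Return the features that share at least one base with the read."""
    return [f for f in FEATURES if f[2] <= read_end and read_start <= f[3]]


def count_reads(reads, level="meta", allow_multi_overlap=False):
    """Count reads per feature or per meta-feature (gene).

    allow_multi_overlap models the effect of -O as described in the text.
    """
    counts = Counter()
    for start, end in reads:
        hits = overlapping(start, end)
        if level == "meta":
            # Exons of the same gene collapse into a single counting unit.
            targets = {gene for _, gene, _, _ in hits}
        else:
            targets = {fid for fid, _, _, _ in hits}
        if len(targets) > 1 and not allow_multi_overlap:
            continue  # ambiguous read is left out by default
        for target in targets:
            counts[target] += 1
    return counts


READS = [
    (110, 150),  # overlaps geneA_exon1 only
    (185, 195),  # overlaps geneA_exon1 and geneA_exon2 (same gene)
    (295, 299),  # overlaps geneA_exon2 and geneB_exon1 (different genes)
]

for level in ("feature", "meta"):
    for multi in (False, True):
        print(level, multi, dict(count_reads(READS, level, multi)))
```

The second read shows the difference between the two levels: it is ambiguous between two exons, but at the gene level both exons belong to geneA, so it is counted once without needing -O.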
featureCounts can quantify a single sample, or several samples together in one run. Usage for a single sample is as follows:
featureCounts -T 5 \
-t exon \
-g gene_id \
-a annotation.gtf \
-o counts.txt \
mapping.sam
Usage for several samples in one run is as follows; the output contains one count column per input file:
featureCounts \
-t exon \
-g gene_id \
-a annotation.gtf \
-o counts.txt \
library1.bam library2.bam library3.bam
The -a parameter specifies the annotation file, which is expected to be in GTF format by default (use -F SAF for an SAF file). The -T parameter specifies the number of threads; the default is 1. The -t parameter specifies the feature type to count; valid values are those found in column 3 of the GTF file, and the default is exon. The -g parameter specifies the attribute used to group features into meta-features. It refers to the attributes in column 9 of the GTF file, which are written as key "value"; pairs. Any of these keys can be given to -g, and the default is gene_id.
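To see which values are available for -t and -g in a given annotation, one can inspect columns 3 and 9 of the GTF file. The Python sketch below does this for a single record. The GTF line and the helper `parse_gtf_line` are illustrative assumptions, and the regular expression only handles quoted attribute values.

```python
import re

# Illustrative GTF record (made up for this example); fields are tab-separated.
GTF_LINE = (
    'chr1\tunknown\texon\t11874\t12227\t.\t+\t.\t'
    'gene_id "DDX11L1"; transcript_id "tx1"; gene_name "DDX11L1";'
)

# Matches attribute pairs of the form: key "value";
ATTR_PATTERN = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_gtf_line(line):
    """Return (feature_type, attributes) for one GTF record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    feature_type = fields[2]  # column 3: candidate value for -t
    attributes = dict(ATTR_PATTERN.findall(fields[8]))  # column 9: keys for -g
    return feature_type, attributes


feature_type, attributes = parse_gtf_line(GTF_LINE)
print("feature type (-t):", feature_type)
print("attribute keys (-g):", sorted(attributes))
```

Running this over every line of an annotation and collecting the distinct feature types and attribute keys gives the full set of choices for the two parameters.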
The content of the output count file is as follows:
# Program:featureCounts v1.6.0; Command:"./featureCounts" "-T" "20" "-t" "exon" "-g" "gene_id" "-a" "hg19.gtf" "-o" "gene" "accepted_hits.bam"
Geneid   Chr             Start              End                Strand  Length  accepted_hits.bam
DDX11L1  chr1;chr1;chr1  11874;12613;13221  12227;12721;14409  +;+;+   1652    0
The comment line beginning with # records the program version and the command that was run. The line beginning with Geneid is the header. Geneid is the name of the meta-feature being counted, Chr, Start and End give its positions on the chromosome, and Strand gives the strand (+ or -). Because a gene consists of multiple exons, these columns hold several semicolon-separated values, one per exon.
Length is the total length of the meta-feature. The header of the last column is the name of the input file, and its values are the number of reads assigned to each meta-feature.
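A minimal Python sketch for reading this kind of table is shown below. It assumes the layout described above (one # comment line, a tab-separated header, six fixed columns followed by one column per input file). The function name `read_counts` and the embedded sample text are illustrative only.

```python
import csv

# Sample text in the layout described above (tab-separated).
OUTPUT_TEXT = (
    '# Program:featureCounts v1.6.0; Command:"./featureCounts" "-T" "20" '
    '"-t" "exon" "-g" "gene_id" "-a" "hg19.gtf" "-o" "gene" "accepted_hits.bam"\n'
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\taccepted_hits.bam\n"
    "DDX11L1\tchr1;chr1;chr1\t11874;12613;13221\t12227;12721;14409\t+;+;+\t1652\t0\n"
)

FIXED_COLUMNS = ["Geneid", "Chr", "Start", "End", "Strand", "Length"]


def read_counts(text):
    """Return (sample_names, rows) parsed from count-table text."""
    # Drop the leading comment line and any blank lines.
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    # Every column after the fixed ones is an input file (sample).
    samples = [name for name in reader.fieldnames if name not in FIXED_COLUMNS]
    rows = []
    for row in reader:
        rows.append({
            "gene": row["Geneid"],
            "n_exons": len(row["Start"].split(";")),
            "length": int(row["Length"]),
            "counts": {s: int(row[s]) for s in samples},
        })
    return samples, rows


samples, rows = read_counts(OUTPUT_TEXT)
for row in rows:
    print(row["gene"], row["n_exons"], row["length"], row["counts"])
```

Deriving the sample names from the header rather than hard-coding them means the same function works for a run with several BAM files.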
That is how to perform read quantification with featureCounts. These points come up often in day-to-day analysis work, and I hope this article helps you get more out of the tool.