Updated 2025-01-21. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces a method for running GSVA enrichment analysis with the R package GSVA. It walks through the procedure with a worked example; the method is simple, quick, and practical. I hope it helps you solve your problem.
Usage:

Rscript ../scripts/ssgsea_enrich_diff.r -h
usage: ../scripts/ssgsea_enrich_diff.r [-h] -g GMTFILE -I EXPR -m META
                                       [-n GROUP_NAME] [--log2] [-t method]
                                       [-k kcdf] [--group1 GROUP1]
                                       [--group2 GROUP2] [-p PVALUECUTOFF]
                                       [--no_diff] [-o OUTDIR] [-f PREFIX]

Gene set variation analysis (GSVA): https://www..com/article/1586

optional arguments:
  -h, --help            show this help message and exit
  -g GMTFILE, --gmtfile GMTFILE
                        GSEA gmtfile function class file [required]
  -I EXPR, --expr EXPR  Input gene expression file path [required]
  -m META, --meta META  Input the clinical information file path that
                        contains the grouping [required]
  -n GROUP_NAME, --group_name GROUP_NAME
                        Specifies the column name that contains grouping
                        information [optional, default: m6acluster]
  --log2                Whether to perform log2 processing
                        [optional, default: False]
  -t method, --method method
                        Method to employ in the estimation of gene-set
                        enrichment scores per sample. By default this is set
                        to gsva (Hänzelmann et al, 2013) and other options
                        are ssgsea (Barbie et al, 2009), zscore (Lee et al,
                        2008) or plage (Tomfohr et al, 2005). The latter two
                        standardize first expression profiles into z-scores
                        over the samples and, in the case of zscore, it
                        combines them together as their sum divided by the
                        square-root of the size of the gene set, while in
                        the case of plage they are used to calculate the
                        singular value decomposition (SVD) over the genes in
                        the gene set and use the coefficients of the first
                        right-singular vector as pathway activity profile
                        [default gsva]
  -k kcdf, --kcdf kcdf  Character string denoting the kernel to use during
                        the non-parametric estimation of the cumulative
                        distribution function of expression levels across
                        samples when method="gsva". By default,
                        kcdf="Gaussian" which is suitable when input
                        expression values are continuous, such as microarray
                        fluorescent units in logarithmic scale, RNA-seq
                        log-CPMs, log-RPKMs or log-TPMs. When input
                        expression values are integer counts, such as those
                        derived from RNA-seq experiments, then this argument
                        should be set to kcdf="Poisson" [default Gaussian]
  --group1 GROUP1       Designate the first group [optional, default C1]
  --group2 GROUP2       Designate the second group [optional, default C2]
  -p PVALUECUTOFF, --pvalueCutoff PVALUECUTOFF
                        pvalue cutoff on enrichment tests to report
                        [optional, default: 0.05]
  --no_diff             No screening was performed based on the difference
                        analysis results [optional, default: False]
  -o OUTDIR, --outdir OUTDIR
                        output file directory [optional, default cwd]
  -f PREFIX, --prefix PREFIX
                        out file name prefix [optional, default kegg]

Parameter description:
-g reference gene set, such as the KEGG gene set c2.cp.kegg.v6.2.symbols.gmt downloaded from MSigDB

KEGG_GLYCOLYSIS_GLUCONEOGENESIS    http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_GLYCOLYSIS_GLUCONEOGENESIS    ACSS2    GCK
KEGG_CITRATE_CYCLE_TCA_CYCLE       http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_CITRATE_CYCLE_TCA_CYCLE       IDH3B    DLST
KEGG_PENTOSE_PHOSPHATE_PATHWAY     http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_PENTOSE_PHOSPHATE_PATHWAY     RPE      RPIA
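A GMT file stores one gene set per tab-separated line: the set name, a description field (here a URL), and then the member gene symbols. As a rough illustration of that layout, here is a minimal Python sketch that reads it into a dictionary. It is not part of ssgsea_enrich_diff.r, which is an R script; the function name `parse_gmt` is hypothetical, and the sample text holds only the two genes per set shown in the example.

```python
import io


def parse_gmt(handle):
    """Map each gene set name to its list of gene symbols.

    Each GMT line is tab-separated: name, description (here a URL), genes...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Excerpt copied from the example; real gene sets list many more genes.
BASE = "http://www.broadinstitute.org/gsea/msigdb/cards/"
example = "".join(
    f"{name}\t{BASE}{name}\t" + "\t".join(genes) + "\n"
    for name, genes in [
        ("KEGG_GLYCOLYSIS_GLUCONEOGENESIS", ["ACSS2", "GCK"]),
        ("KEGG_CITRATE_CYCLE_TCA_CYCLE", ["IDH3B", "DLST"]),
        ("KEGG_PENTOSE_PHOSPHATE_PATHWAY", ["RPE", "RPIA"]),
    ]
)

gene_sets = parse_gmt(io.StringIO(example))
for name, genes in gene_sets.items():
    print(name, genes)
```

To read a real file, pass an open file handle, for example `parse_gmt(open("c2.cp.kegg.v6.2.symbols.gmt"))`.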
-I gene expression matrix

ID        TCGA-A3-3319-01A-02R-1325-07    TCGA-A3-3323-01A-02R-1325-07
YTHDC2    16.5128725081007                20.6535652352011
ELAVL1    44.3876796198438                31.8729000784291
-m sample information file containing sample grouping information

Barcode                         Patient         Sample
TCGA-BP-4766-01A-01R-1289-07    TCGA-BP-4766    TCGA-BP-4766-01A
TCGA-A3-3352-01A-01R-0864-07    TCGA-A3-3352    TCGA-A3-3352-01A
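The grouping column itself is not shown in the excerpt above. The following Python sketch illustrates how sample barcodes might be collected per group from such a file. It rests on several assumptions: the file is tab-separated, the grouping column is named m6acluster (the script's documented default for -n), and the Barcode column identifies each sample. The group values C1 and C2 assigned to the two samples are hypothetical, as is the function name `samples_by_group`; this is not the script's own R code.

```python
import csv
import io


def samples_by_group(handle, group_column, id_column="Barcode"):
    """Collect sample identifiers per group from a tab-separated metadata file."""
    groups = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        groups.setdefault(row[group_column], []).append(row[id_column])
    return groups


# Barcode/Patient/Sample values come from the example;
# the m6acluster values are hypothetical, added for illustration only.
meta_text = (
    "Barcode\tPatient\tSample\tm6acluster\n"
    "TCGA-BP-4766-01A-01R-1289-07\tTCGA-BP-4766\tTCGA-BP-4766-01A\tC1\n"
    "TCGA-A3-3352-01A-01R-0864-07\tTCGA-A3-3352\tTCGA-A3-3352-01A\tC2\n"
)

groups = samples_by_group(io.StringIO(meta_text), "m6acluster")
print(groups)
```

Lists like `groups["C1"]` and `groups["C2"]` are the kind of sample selection that --group1 and --group2 refer to.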
--log2 specifies whether to log2-transform the gene expression matrix
-n specifies the column name of the grouping information in the sample information file
--group1, --group2 specify the names of the two groups to compare
-t specifies the method used to estimate gene set enrichment scores; it defaults to gsva
-k specifies the kcdf parameter of the gsva function. Set it to "Poisson" for read count data; keep the default "Gaussian" for log-transformed data such as log-TPM.
-p specifies the p-value cutoff
--no_diff By default, GSVA first transforms the expression matrix into an enrichment score matrix, and the enrichment results are then screened by differential analysis. Setting this flag skips the screening.
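To make the idea of an enrichment score matrix concrete, here is a small Python sketch of the zscore option as the help text describes it: each gene is standardized into z-scores across samples, and a gene set's score per sample is the sum of its genes' z-scores divided by the square root of the gene set size. This is an illustration only, not the GSVA package's code; details such as the standard deviation estimator and the handling of missing genes are assumptions, and the function name is hypothetical. The expression values are the two-sample excerpt from the -I example, and the single-gene and two-gene sets are made up for the demonstration.

```python
import math
import statistics


def zscore_enrichment(expr, gene_set):
    """Combined z-score per sample for one gene set.

    expr maps gene symbol -> list of expression values, one per sample.
    Genes absent from expr are ignored (an assumption of this sketch).
    """
    genes = [g for g in gene_set if g in expr]
    if not genes:
        raise ValueError("no gene of the set is present in the expression data")
    n_samples = len(expr[genes[0]])
    totals = [0.0] * n_samples
    for gene in genes:
        values = expr[gene]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation (assumed)
        for j, value in enumerate(values):
            totals[j] += (value - mean) / sd
    # Sum of z-scores divided by the square root of the gene set size
    return [total / math.sqrt(len(genes)) for total in totals]


# Values from the expression matrix example (two samples).
expr = {
    "YTHDC2": [16.5128725081007, 20.6535652352011],
    "ELAVL1": [44.3876796198438, 31.8729000784291],
}

single = zscore_enrichment(expr, ["YTHDC2"])
both = zscore_enrichment(expr, ["YTHDC2", "ELAVL1"])
print(single)
print(both)
```

With only two samples every z-score has the same magnitude, so the two genes cancel out in the combined set; real analyses use many samples, and a gene with zero variance would need special handling that this sketch omits.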
Usage example:

Rscript ./scripts/ssgsea_enrich_diff.r -g enrich/c2.cp.kegg.v6.2.symbols.gmt \
    -I ./02.sample_select/TCGA-KIRC_gene_expression_TPM_immu.tsv -m metadata_group.tsv \
    -n m6acluster --log2 --group1 C1 --group2 C2 -p 0.05 -o enrich/C1_vs_C2

That concludes this introduction to the method of GSVA enrichment analysis with the R package. Thank you for reading.