What is the MSigDB database?

Updated 2025-03-26. From: SLTechnology News&Howtos (shulou)


This article explains what the MSigDB database is: how it relates to GSEA, how its gene sets are organized, and how to browse and download them.

Gene Set Enrichment Analysis (GSEA) is an enrichment method proposed by scientists at the Broad Institute. The method is accompanied by the GSEA analysis software and by a gene set database, MSigDB. This article introduces the database, whose official website is:

http://software.broadinstitute.org/gsea/msigdb/index.jsp

For human genes, many gene sets have been constructed from different points of view, such as chromosomal location, function, metabolic pathway, and target binding. Each gene set groups genes that share a similar location or a similar function, and the gene sets constructed by the Broad Institute are stored in the MSigDB database.

The database is continually updated and improved. At the time of writing, the latest version was v6.2, updated in July 2018, with a total of 17,810 gene sets.

[Figure: number of gene sets included in each MSigDB version]

With this many gene sets, some organization is needed. MSigDB divides all gene sets into the following eight categories.

1. H: hallmark gene sets

This category contains summary gene sets, each built from multiple known gene sets: every H gene set corresponds to several underlying gene sets from the other categories. For example, HALLMARK_ADIPOGENESIS corresponds to 36 gene sets.

2. C1: positional gene sets

This category contains one gene set for each cytoband region of each human chromosome. The gene sets are further grouped by chromosome.

3. C2: curated gene sets

This category contains gene sets curated from known databases, the literature, and expert knowledge. It is divided into five subcategories, one of which is KEGG.

The KEGG subcategory contains 186 gene sets, each of which essentially corresponds to one pathway in the KEGG pathway database. For example, the gene set KEGG_ABC_TRANSPORTERS corresponds to pathway hsa02010.

4. C3: motif gene sets

This category contains gene sets defined by shared regulatory motifs, namely miRNA target genes and transcription factor binding regions.

Both transcription factors and miRNAs recognize their binding regions through specific motif sequences, so each of these gene sets is essentially a set of genes that share the same motif. For example, the genes in AAACCAC_MIR140 all carry the AAACCAC motif. Because hsa-miR-140 recognizes and binds this motif, AAACCAC_MIR140 is the set of hsa-miR-140 target genes.

5. C4: computational gene sets

This category contains gene sets predicted computationally, mainly sets of cancer-related genes.

6. C5: GO gene sets

This category contains gene sets derived from Gene Ontology (GO), divided into the three GO domains: biological process (BP), cellular component (CC), and molecular function (MF).

Each gene set corresponds to one GO term. For example, the gene set GO_MOLTING_CYCLE corresponds to GO:0042303.

7. C6: oncogenic signatures

This category contains sets of genes whose expression changes after a known perturbation or treatment. For example, AKT_UP.V1_DN is given here as the set of genes whose expression is down-regulated after treatment with the reagent RAD001.

8. C7: immunologic signatures

This category contains gene sets related to the function of the immune system.
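The eight categories and the naming examples above can be captured in a small lookup, which may be handy when sorting gene set names by category. This is only a sketch: the category codes and names come from the list above, but the prefix rules in `guess_category` are hypothetical, inferred solely from the example names in this article (HALLMARK_ADIPOGENESIS, KEGG_ABC_TRANSPORTERS, GO_MOLTING_CYCLE, chr2p), and real MSigDB names follow many more patterns. The helper names `guess_category` and `split_motif_name` are made up for illustration.

```python
# Category codes and names as listed in this article (MSigDB v6.2).
MSIGDB_CATEGORIES = {
    "H": "hallmark gene sets",
    "C1": "positional gene sets",
    "C2": "curated gene sets",
    "C3": "motif gene sets",
    "C4": "computational gene sets",
    "C5": "GO gene sets",
    "C6": "oncogenic signatures",
    "C7": "immunologic signatures",
}

# Hypothetical prefix rules, inferred only from the example names in the text.
# They are an assumption for illustration, not an official naming scheme.
_PREFIX_TO_CATEGORY = (
    ("HALLMARK_", "H"),
    ("KEGG_", "C2"),
    ("GO_", "C5"),
    ("CHR", "C1"),
)


def guess_category(gene_set_name):
    """Return a category code for a gene set name, or None if no rule matches."""
    upper = gene_set_name.upper()
    for prefix, code in _PREFIX_TO_CATEGORY:
        if upper.startswith(prefix):
            return code
    return None


def split_motif_name(gene_set_name):
    """Split a C3-style name such as 'AAACCAC_MIR140' into (motif, regulator)."""
    motif, _, regulator = gene_set_name.partition("_")
    return motif, regulator


if __name__ == "__main__":
    for name in ("HALLMARK_ADIPOGENESIS", "KEGG_ABC_TRANSPORTERS", "chr2p"):
        code = guess_category(name)
        print(name, code, MSIGDB_CATEGORIES.get(code))
    print(split_motif_name("AAACCAC_MIR140"))
```

Names that match none of the rules return `None` rather than a guess, so unknown patterns are not silently misfiled.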

These gene sets can be searched on the official website at the following link:

http://software.broadinstitute.org/gsea/msigdb/genesets.jsp

Select the category you are interested in, and all gene sets in that category are listed at the bottom of the page.

[Figure: gene set listing for a selected category]

For example, selecting the C1 category lists the positional gene sets, including those on chromosome 2; chr2p is the name of one such gene set. Click a name to view the details of that gene set.

[Figure: details page for a single gene set]

The details page shows the name and description of the gene set, and the set can be downloaded directly in a choice of formats. The official website also offers a bulk download of all gene sets at once; registration is required to use it.
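Once a gene set file has been downloaded, it can be read with a few lines of code. The sketch below assumes the file is in GMT format, a tab-delimited text format commonly used with GSEA in which each line holds a gene set name, a description, and then the member genes. The article does not say which formats are offered, so treat the choice of GMT as an assumption. The function name `parse_gmt` and the example content (URLs and gene symbols) are made up for illustration.

```python
import io


def parse_gmt(handle):
    """Parse GMT text into {gene_set_name: (description, [genes])}.

    Assumed layout per line: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, description, *genes = fields
        # Drop empty strings left behind by trailing tabs.
        gene_sets[name] = (description, [g for g in genes if g])
    return gene_sets


# Made-up example content; the gene symbols are placeholders, not real members.
example = io.StringIO(
    "chr2p\thttp://example.org/chr2p\tGENE_A\tGENE_B\tGENE_C\n"
    "chr2q\thttp://example.org/chr2q\tGENE_D\tGENE_E\n"
)
sets = parse_gmt(example)

if __name__ == "__main__":
    for name, (description, genes) in sets.items():
        print(name, len(genes), genes)
```

For a real download, the same function can be called on an open file, for example `with open(path) as fh: parse_gmt(fh)`, where `path` is wherever the file was saved.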

GSEA is not only an improvement on the enrichment analysis algorithm but also a broadening of the research perspective. Traditional enrichment analysis is limited to functional databases such as GO and pathway databases, whereas MSigDB supports several angles of investigation: gene sets can be explored not only by function but also by chromosomal location and by expression trend, which greatly extends the range of questions that enrichment analysis can address.
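To make the point about gene set collections concrete, the sketch below runs a simple over-representation test against an arbitrary collection of gene sets, so the same test works for positional, functional, or expression-based sets. Note the assumptions: this is a one-sided hypergeometric test, a common choice for traditional enrichment analysis, and it is not the GSEA algorithm itself. The function names and all gene names are hypothetical.

```python
from math import comb


def overrepresentation_p(universe_size, set_size, hits_size, overlap):
    """One-sided hypergeometric tail: P(X >= overlap).

    X counts how many of `hits_size` genes drawn without replacement from a
    universe of `universe_size` genes fall into a gene set of `set_size` genes.
    """
    total = comb(universe_size, hits_size)
    upper = min(set_size, hits_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(hits, gene_sets, universe):
    """Return {gene_set_name: p_value} for every gene set in the collection."""
    universe = set(universe)
    hits = set(hits) & universe
    results = {}
    for name, members in gene_sets.items():
        members = set(members) & universe
        overlap = len(hits & members)
        results[name] = overrepresentation_p(
            len(universe), len(members), len(hits), overlap
        )
    return results


if __name__ == "__main__":
    # Made-up data: ten placeholder genes, one "positional" and one "functional" set.
    universe = {f"G{i}" for i in range(10)}
    gene_sets = {
        "positional_example": ["G0", "G1", "G2", "G3"],
        "functional_example": ["G7", "G8", "G9"],
    }
    hits = ["G0", "G1", "G2"]
    for name, p in enrich(hits, gene_sets, universe).items():
        print(name, round(p, 4))
```

Because the test only needs set membership, swapping in a different category of gene sets changes the biological question without changing the code.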

