A case study of pharmacogenomic consistency between the GDSC and CCLE databases

Updated 2025-03-28. Source: SLTechnology News&Howtos (shulou).

This article walks through a case analysis of pharmacogenomic consistency between the GDSC and CCLE databases.

Drug sensitivity databases of tumor cell lines, such as GDSC (Genomics of Drug Sensitivity in Cancer) and CCLE (Cancer Cell Line Encyclopedia), provide drug sensitivity data together with the corresponding genomic data for tumor cell lines. Some studies have suggested that the pharmacological data in the two databases are inconsistent, while others have confirmed reasonable consistency between the two large data sets.

Results 1. Comparison of pharmacological data sets of cell lines

Fig 1a: CCLE and GDSC drug screening data.

CCLE and GDSC share 471 cell lines with associated genomic data, but only some of these have overlapping drug screening data: the overlap per compound ranges from 82 to 256 cell lines (median = 94; mean = 157).

Fig 1b-1c: Pearson correlations between the CCLE and GDSC drug sensitivity measures were computed for the 50% inhibitory concentration (IC50) and the area under the curve (AUC; 1-AUC is called the activity area in CCLE). Full results are shown in the supplementary figures.

In the direct GDSC-CCLE comparison, the AUC and IC50 distributions of almost all compounds (13/15) were dominated by drug-insensitive cell lines, with far fewer drug-sensitive lines. The violin plots show the complete distribution of all CCLE and GDSC AUC values for each compound, while the scatter plots show the distribution for the overlapping cell lines. The results for IC50 are similar to those for AUC, as shown in the supplementary figures. For several targeted drugs, the overlapping cell lines contained few or no sensitive lines (for example, 2 for crizotinib, 3 for nilotinib, 2 for TAE684, and 0 for erlotinib or sorafenib). This relative lack of sensitive lines among the overlapping cell lines limits the level of correlation that the two data sets can achieve.

Fig 1d-1e: Pearson correlation (y-axis) plotted against Spearman correlation (x-axis); the Pearson correlations are generally stronger.

Statistical power of the correlation analysis: in most cases, the correlation analysis shows good consistency despite the imbalance between the numbers of sensitive and insensitive cell lines and the differences between the original assay methods. Haibe-Kains et al. calculated the correlation between the two data sets using the Spearman correlation coefficient. When Pearson correlation is used instead of Spearman correlation, the correlation coefficients of most drugs improve significantly. However, some correlation values remain poor, which may be due to biological differences between the cell lines, differences in the actual pharmacological measurements (for example nutlin-3, paclitaxel and PHA665752), or the presence of only one sensitive line for a drug (for example erlotinib and sorafenib).
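The difference between the two coefficients can be illustrated with a small sketch. It is not the analysis from the study: the AUC values below are invented toy numbers, and the function names are hypothetical. It only shows why a handful of sensitive lines among many insensitive ones yields a high Pearson correlation but a low Spearman correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: eight insensitive lines whose small AUC values are
# essentially noise, plus two sensitive lines that respond in both screens.
ccle_auc = [0.10, 0.12, 0.08, 0.11, 0.09, 0.13, 0.14, 0.07, 0.80, 0.90]
gdsc_auc = [0.12, 0.08, 0.11, 0.07, 0.13, 0.09, 0.10, 0.14, 0.85, 0.75]

r_pearson = pearson(ccle_auc, gdsc_auc)
r_spearman = spearman(ccle_auc, gdsc_auc)
print(f"Pearson = {r_pearson:.3f}, Spearman = {r_spearman:.3f}")
```

In this toy case the rank-based coefficient is dominated by the uninformative ordering of the insensitive lines, while Pearson is driven by the two sensitive lines on which both screens agree.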

Fig 1f: waterfall plots were used to classify cell lines as resistant or sensitive, and the Cohen's kappa coefficient (a measure of agreement between two classifications) from this study (y-axis) was compared with that reported by Haibe-Kains et al.
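Cohen's kappa, the agreement measure named in the caption above, can be sketched as follows. This is a minimal illustration, not the study's waterfall procedure: it assumes a simple fixed IC50 cutoff of 1 μM for the sensitive/resistant call, and the IC50 values and function names are hypothetical.

```python
def classify(ic50_um, threshold_um=1.0):
    """Call a cell line sensitive when its IC50 (in μM) is below the cutoff."""
    return "sensitive" if ic50_um < threshold_um else "resistant"


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Hypothetical IC50 values (μM) for ten overlapping cell lines.
ccle_ic50 = [8.0, 8.0, 0.2, 5.5, 8.0, 0.4, 8.0, 3.0, 0.9, 8.0]
gdsc_ic50 = [10.0, 6.0, 0.1, 7.0, 0.8, 0.3, 9.0, 4.0, 2.0, 12.0]

ccle_calls = [classify(v) for v in ccle_ic50]
gdsc_calls = [classify(v) for v in gdsc_ic50]
kappa = cohen_kappa(ccle_calls, gdsc_calls)
print(f"kappa = {kappa:.3f}")
```

Because kappa subtracts the agreement expected by chance, it stays informative even when most lines fall into the resistant class in both data sets.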

For the 13 compounds in question, an average of 94% of the cell lines (CCLE = 94%, range = 77-100%; GDSC = 96%, range = 86-100%) clustered in the drug-insensitive range (for example, the IC50 values of most compounds were > 1 μM). Waterfall analysis also showed high consistency between the CCLE and GDSC data in which cell lines were classified as "sensitive" or "resistant" (as reflected by the Cohen's kappa coefficient). Extended Data Figure 3: this consistency is also evident for all tested drugs when a simple drug sensitivity threshold (1 μM) is used. Both the waterfall method and the simple threshold method show higher consistency than reported by Haibe-Kains et al., indicating that the pharmacological screening data of the CCLE and GDSC cell lines are suitable for modeling studies aimed at distinguishing the rare drug-sensitive cell lines.

Results 2. Comparison of predictors of drug sensitivity

This part explores the extent to which the CCLE and GDSC cell line collections can reveal a common genetic or molecular basis for the efficacy of anticancer drugs.

2.1 Analysis of Variance (ANOVA)

Analysis of variance (ANOVA) was performed on the cell lines shared by CCLE and GDSC to determine whether the molecular correlates of drug response were consistent between the two data sets. Two models were used, with either IC50 or activity area (1-AUC) as the response variable. Both models treated tissue of origin as a covariate and the mutation status of 71 oncogenes as independent variables.
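One way to picture such a model is a nested-model F test for a single gene: does mutation status explain drug response once tissue of origin has been accounted for? The sketch below is an assumption about the model form, not the study's implementation. It handles one gene at a time, uses invented log IC50 values, and all names are hypothetical. Centering both the response and the mutation indicator within each tissue is equivalent to including tissue as a categorical covariate in a linear model.

```python
from collections import defaultdict


def center_within_groups(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def mutation_f_statistic(response, mutated, tissue):
    """F statistic (1 numerator df) for a mutation effect adjusted for tissue.

    Returns (effect, f_statistic, residual_df), where effect is the adjusted
    difference in response between mutant and wild-type lines.
    """
    ry = center_within_groups(response, tissue)
    rx = center_within_groups([float(m) for m in mutated], tissue)
    sxx = sum(x * x for x in rx)
    if sxx == 0:
        raise ValueError("mutation status does not vary within any tissue")
    sxy = sum(x * y for x, y in zip(rx, ry))
    effect = sxy / sxx
    rss_reduced = sum(y * y for y in ry)   # model with tissue only
    rss_full = rss_reduced - effect * sxy  # model with tissue + mutation
    residual_df = len(response) - len(set(tissue)) - 1
    if residual_df <= 0:
        raise ValueError("not enough cell lines for the number of tissues")
    f_statistic = (rss_reduced - rss_full) / (rss_full / residual_df)
    return effect, f_statistic, residual_df


# Hypothetical log IC50 values for eight cell lines from two tissues.
log_ic50 = [2.0, 2.2, 0.5, 0.7, 3.0, 3.2, 1.4, 1.6]
is_mutant = [0, 0, 1, 1, 0, 0, 1, 1]
tissue = ["skin", "skin", "skin", "skin", "lung", "lung", "lung", "lung"]

effect, f_stat, df = mutation_f_statistic(log_ic50, is_mutant, tissue)
print(f"effect = {effect:.2f}, F(1, {df}) = {f_stat:.1f}")
```

A negative effect means mutant lines have a lower IC50, that is, they are more sensitive. Turning the F statistic into a P value requires the F distribution, which is left out here to keep the sketch free of third-party dependencies.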

Fig 2A: ANOVA identified the known genetic markers most strongly associated with drug resistance or sensitivity for 13 compounds (in GDSC or CCLE) and 8 compounds (in both GDSC and CCLE).

In the ANOVA model based on IC50 values, the genetic markers found in both data sets include NRAS mutation (sensitivity to the MEK inhibitor PD0325901), BRAF mutation (sensitivity to the BRAF inhibitor PLX4720), the BCR-ABL1 fusion gene (sensitivity to a variety of ABL1 inhibitors, such as nilotinib and AZD0530) and ERBB2 amplification (sensitivity to the ERBB2 inhibitor lapatinib). The ANOVA models based on activity area and on IC50 also yielded consistent resistance associations across the two data sets, such as TP53 mutation and resistance to nutlin-3. ANOVA based on activity area showed that 14 drugs from GDSC and 15 drugs from CCLE had tissue-of-origin-specific associations, which were consistent across the data sets (post hoc Welch t-test, see Extended Data Figure 5).

2.2 Elastic net regression and ridge regression analysis

Extended Data Figure 6: a multivariate analysis of 21013 genomic features (including expression, copy number changes and mutations) was carried out to assess the consistency of genomic predictors more comprehensively.

Elastic net regression was run either on the complete data set available for each study or on the overlapping data only. The analysis produced strong molecular predictors, and the overlap of predictors between the two data sets was highly significant (χ² test; the P value is missing from the source text). Of these features, 80% had a consistent direction of effect (the standardized effects were both positive or both negative). In some cases the initial elastic net regression could not identify any predictors, which is usually due in part to the small number of drug-sensitive cell lines. On the other hand, some drugs that show low correlation based on AUC or IC50 can still yield consistent predictors (for example, nutlin-3).
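The directional consistency check described above can be sketched as follows. The sketch assumes that each study's regression has already been fitted and reduced to a mapping from feature name to standardized coefficient. The feature names, the coefficient values and the function name are all hypothetical, and the fraction printed by this toy example is not a result from the study.

```python
def direction_concordance(coefs_a, coefs_b):
    """Compare the signs of predictors selected in both studies.

    A feature counts as selected when its coefficient is non-zero.
    Returns (shared_features, fraction_with_same_sign).
    """
    shared = sorted(
        f for f in coefs_a
        if f in coefs_b and coefs_a[f] != 0 and coefs_b[f] != 0
    )
    if not shared:
        return shared, float("nan")
    same_sign = sum((coefs_a[f] > 0) == (coefs_b[f] > 0) for f in shared)
    return shared, same_sign / len(shared)


# Hypothetical standardized coefficients for one drug in each data set.
ccle_coefs = {"feat_A": -0.9, "feat_B": -0.4, "feat_C": 0.3,
              "feat_D": 0.2, "feat_E": -0.1, "feat_F": 0.5}
gdsc_coefs = {"feat_A": -0.7, "feat_B": -0.2, "feat_C": 0.1,
              "feat_D": -0.3, "feat_F": 0.0, "feat_G": 0.6}

shared, fraction = direction_concordance(ccle_coefs, gdsc_coefs)
print(shared, fraction)
```

Features with a zero coefficient in either study are excluded because the elastic net penalty has dropped them from that model, so they carry no direction to compare.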
