Predicting tumor drug sensitivity with multi-omics data
Funding:

National Natural Science Foundation of China (31301092, 31800700); Shanghai Municipal Health Commission Collaborative Innovation Cluster Project (2019CXJQ02)

    Abstract:

    Predicting tumor drug sensitivity is important for guiding clinical medication. This study built a multi-omics cancer drug sensitivity prediction model with Stacking ensemble learning, using cell-line sensitivity (IC50) data for 198 drugs from the Genomics of Drug Sensitivity in Cancer (GDSC) database together with gene expression, gene mutation, and copy number variation data. Multiple feature selection methods were applied to reduce the dimensionality of the gene features. The Stacking model integrated six primary learners and one secondary learner and was validated with 5-fold cross-validation. Of the per-drug models, 36.4% achieved an AUC above 0.9, 49.0% achieved an AUC between 0.8 and 0.9, and the lowest AUC was 0.682. The Stacking-based multi-omics model was more accurate and more stable than existing single-omics and multi-omics models, and integrating multiple omics predicted drug sensitivity better than a single omics type. Functional annotation and enrichment analysis of the feature genes pointed to potential mechanisms of tumor resistance to sorafenib, which gives the model biological interpretability and supports its potential use in guiding clinical medication.
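
    The modeling workflow summarized in the abstract (feature selection, a Stacking ensemble of six primary learners and one secondary learner, 5-fold cross-validation scored by AUC) can be sketched roughly as follows. This is a minimal illustration with scikit-learn and not the authors' implementation: the abstract does not name the learners, the feature selection methods, or the rule used to turn IC50 values into sensitive/resistant labels, so the six primary learners, the logistic-regression secondary learner, the `SelectKBest` step, and the synthetic data below are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one drug: rows are cell lines, columns are
# concatenated omics features, y is a binary sensitive/resistant label.
# Real GDSC data and the IC50 binarization rule are NOT reproduced here.
X, y = make_classification(
    n_samples=200, n_features=300, n_informative=10,
    class_sep=1.5, random_state=0,
)

# Six primary (level-0) learners; this particular choice is an assumption.
primary_learners = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("svm", SVC(probability=True, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("nb", GaussianNB()),
]

# One secondary (level-1) learner trained on the out-of-fold predictions
# of the primary learners (cv=5 inside the stack).
stack = StackingClassifier(
    estimators=primary_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Scaling and feature selection sit inside the pipeline so that they are
# refit on each training fold and never see the held-out fold.
model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=30), stack)

# 5-fold cross-validation, scored by area under the ROC curve.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=outer_cv, scoring="roc_auc")

print("AUC per fold:", auc_scores.round(3))
print("Mean AUC:", auc_scores.mean().round(3))
```

    In a per-drug setting such as the one the abstract describes, a loop over drugs would fit one such model per drug and collect the resulting AUC values, from which the proportions above 0.9 or between 0.8 and 0.9 could be tallied.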

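Cite this article

Yang Chenyu, Liu Zhenhao, Dai Peibin, Zhang Yu, Huang Pengjie, Lin Yong, Xie Lu. Predicting tumor drug sensitivity with multi-omics data[J]. Chinese Journal of Biotechnology, 2022, 38(6): 2201-2212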

History
  • Received: 2021-09-04
  • Published online: 2022-06-28
  • Publication date: 2022-06-25