Combining multidimensional genomic measurements for predicting cancer prognosis

Observations from TCGA

Qing Zhao, Xingjie Shi, Yang Xie, Jian Huang, Chang BenShia, Shuangge Ma

Research output: Contribution to journal, Article

59 Citations (Scopus)

Abstract

With accumulating research on the interconnections among different types of genomic regulations, researchers have found that multidimensional genomic studies outperform one-dimensional studies in multiple aspects. Among many sources of multidimensional genomic data, The Cancer Genome Atlas (TCGA) provides the public with comprehensive profiling data on >30 cancer types, making it an ideal test bed for conducting and comparing different analyses. In this article, the analysis goal is to apply several existing methods and associate multidimensional genomic measurements with cancer outcomes, in particular prognosis, with special focus on the predictive power of genomic signatures. We exploit clinical data and four types of genomic measurement, including mRNA gene expression, DNA methylation, microRNA and copy number alterations, for breast invasive carcinoma, glioblastoma multiforme, acute myeloid leukemia and lung squamous cell carcinoma collected by TCGA. To accommodate the high dimensionality, we extract important features using Principal Component Analysis, Partial Least Squares and Least Absolute Shrinkage and Selection Operator (Lasso), which are representative of dimension reduction and variable selection techniques and have been extensively adopted, and fit Cox survival models with combined important features. We calibrate the predictive power of each type of genomic measurement for the prognosis of four cancer types and find that the results vary across cancers. Our analysis also suggests that for most of the cancers in our study and the adopted methods, there is no substantial improvement in prediction when adding other genomic measurements after gene expression and clinical covariates have been included in the model. This is consistent with the findings that molecular features measured at the transcription level affect clinical outcomes more directly than those measured at the DNA/epigenetic level.
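The "extract features, then fit a Cox model on the combined features" step can be illustrated with a small sketch. This is not the authors' code and uses no TCGA data: the data are simulated, only the first principal component of a single expression-like matrix is used (PLS and Lasso are not shown), and all function names (`first_pc_scores`, `cox_loglik_grad_info`, `solve`, `fit_cox`) are hypothetical helpers written for this illustration. The Cox model is fitted by Newton-Raphson on the partial likelihood, assuming no tied event times.

```python
import math
import random

def first_pc_scores(rows, iters=200):
    """Sample scores on the first principal component, found by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(p)]
    centered = [[r[k] - means[k] for k in range(p)] for r in rows]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(c[k] * v[k] for k in range(p)) for c in centered]
        v = [sum(s * c[k] for s, c in zip(scores, centered)) for k in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return [sum(c[k] * v[k] for k in range(p)) for c in centered]

def cox_loglik_grad_info(beta, X, time, event):
    """Cox partial log-likelihood with its gradient and information matrix."""
    p = len(beta)
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik, grad = 0.0, [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for i in range(len(time)):
        if not event[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(len(time)) if time[j] >= time[i]]
        w = [math.exp(eta[j]) for j in risk]
        s0 = sum(w)
        mean = [sum(wj * X[j][a] for wj, j in zip(w, risk)) / s0 for a in range(p)]
        loglik += eta[i] - math.log(s0)
        for a in range(p):
            grad[a] += X[i][a] - mean[a]
            for b in range(p):
                second = sum(wj * X[j][a] * X[j][b] for wj, j in zip(w, risk)) / s0
                info[a][b] += second - mean[a] * mean[b]
    return loglik, grad, info

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_cox(X, time, event, max_iter=25, tol=1e-8):
    """Newton-Raphson with step halving; returns (coefficients, log-likelihood)."""
    beta = [0.0] * len(X[0])
    loglik, grad, info = cox_loglik_grad_info(beta, X, time, event)
    for _ in range(max_iter):
        step, scale = solve(info, grad), 1.0
        trial = [b + s for b, s in zip(beta, step)]
        new = cox_loglik_grad_info(trial, X, time, event)
        while new[0] < loglik and scale > 1e-4:
            scale /= 2
            trial = [b + scale * s for b, s in zip(beta, step)]
            new = cox_loglik_grad_info(trial, X, time, event)
        if new[0] < loglik:
            break  # no improving step found; keep the current estimate
        gain = new[0] - loglik
        beta, (loglik, grad, info) = trial, new
        if gain < tol:
            break
    return beta, loglik

# Simulated data (an assumption for illustration): one clinical covariate and
# 15 expression-like features that share a single latent factor.
rng = random.Random(0)
n, n_genes = 80, 15
latent = [rng.gauss(0, 1) for _ in range(n)]
age = [rng.gauss(0, 1) for _ in range(n)]
expression = [[z + rng.gauss(0, 1) for _ in range(n_genes)] for z in latent]
death = [rng.expovariate(math.exp(0.7 * z + 0.4 * a)) for z, a in zip(latent, age)]
censor = [rng.expovariate(0.5) for _ in range(n)]
time = [min(d, c) for d, c in zip(death, censor)]
event = [d <= c for d, c in zip(death, censor)]

pc1 = first_pc_scores(expression)
sd = math.sqrt(sum(s * s for s in pc1) / (n - 1))
X = [[a, s / sd] for a, s in zip(age, pc1)]  # clinical covariate + genomic score
beta, loglik = fit_cox(X, time, event)
null_loglik = cox_loglik_grad_info([0.0, 0.0], X, time, event)[0]
print("coefficients (age, PC1):", [round(b, 3) for b in beta])
print("partial log-likelihood gain over null:", round(loglik - null_loglik, 3))
```

Adding a second data type in this sketch would mean computing its own component scores and appending them as further columns of `X`; comparing the fitted models with and without those columns mirrors, in a very simplified way, the question the abstract asks about added predictive value.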

Original language: English (US)
Article number: bbu003
Pages (from-to): 291-303
Number of pages: 13
Journal: Briefings in Bioinformatics
Volume: 16
Issue number: 2
DOI: 10.1093/bib/bbu003
State: Published - Mar 1 2015

Fingerprint

Atlases
Genes
Genome
Gene expression
Neoplasms
Transcription
Principal component analysis
DNA
DNA Methylation
Glioblastoma
Least-Squares Analysis
MicroRNAs
Proportional Hazards Models
Acute Myeloid Leukemia
Epigenomics
Squamous Cell Carcinoma
Research Personnel
Breast Neoplasms

Keywords

  • Cancer prognosis
  • Multidimensional genomic study
  • Prediction
  • The cancer genome atlas (TCGA)

ASJC Scopus subject areas

  • Molecular Biology
  • Information Systems

Cite this

Combining multidimensional genomic measurements for predicting cancer prognosis: Observations from TCGA. / Zhao, Qing; Shi, Xingjie; Xie, Yang; Huang, Jian; BenShia, Chang; Ma, Shuangge.

In: Briefings in Bioinformatics, Vol. 16, No. 2, bbu003, 01.03.2015, p. 291-303.


@article{13105599e69a4dd7a3a4e3409e76a120,
title = "Combining multidimensional genomic measurements for predicting cancer prognosis: Observations from TCGA",
abstract = "With accumulating research on the interconnections among different types of genomic regulations, researchers have found that multidimensional genomic studies outperform one-dimensional studies in multiple aspects. Among many sources of multidimensional genomic data, The Cancer Genome Atlas (TCGA) provides the public with comprehensive profiling data on >30 cancer types, making it an ideal test bed for conducting and comparing different analyses. In this article, the analysis goal is to apply several existing methods and associate multidimensional genomic measurements with cancer outcomes, in particular prognosis, with special focus on the predictive power of genomic signatures. We exploit clinical data and four types of genomic measurement, including mRNA gene expression, DNA methylation, microRNA and copy number alterations, for breast invasive carcinoma, glioblastoma multiforme, acute myeloid leukemia and lung squamous cell carcinoma collected by TCGA. To accommodate the high dimensionality, we extract important features using Principal Component Analysis, Partial Least Squares and Least Absolute Shrinkage and Selection Operator (Lasso), which are representative of dimension reduction and variable selection techniques and have been extensively adopted, and fit Cox survival models with combined important features. We calibrate the predictive power of each type of genomic measurement for the prognosis of four cancer types and find that the results vary across cancers. Our analysis also suggests that for most of the cancers in our study and the adopted methods, there is no substantial improvement in prediction when adding other genomic measurements after gene expression and clinical covariates have been included in the model. This is consistent with the findings that molecular features measured at the transcription level affect clinical outcomes more directly than those measured at the DNA/epigenetic level.",
keywords = "Cancer prognosis, Multidimensional genomic study, Prediction, The cancer genome atlas (TCGA)",
author = "Qing Zhao and Xingjie Shi and Yang Xie and Jian Huang and Chang BenShia and Shuangge Ma",
year = "2015",
month = "3",
day = "1",
doi = "10.1093/bib/bbu003",
language = "English (US)",
volume = "16",
pages = "291--303",
journal = "Briefings in Bioinformatics",
issn = "1467-5463",
publisher = "Oxford University Press",
number = "2"
}

TY - JOUR

T1 - Combining multidimensional genomic measurements for predicting cancer prognosis

T2 - Observations from TCGA

AU - Zhao, Qing

AU - Shi, Xingjie

AU - Xie, Yang

AU - Huang, Jian

AU - BenShia, Chang

AU - Ma, Shuangge

PY - 2015/3/1

Y1 - 2015/3/1

AB - With accumulating research on the interconnections among different types of genomic regulations, researchers have found that multidimensional genomic studies outperform one-dimensional studies in multiple aspects. Among many sources of multidimensional genomic data, The Cancer Genome Atlas (TCGA) provides the public with comprehensive profiling data on >30 cancer types, making it an ideal test bed for conducting and comparing different analyses. In this article, the analysis goal is to apply several existing methods and associate multidimensional genomic measurements with cancer outcomes, in particular prognosis, with special focus on the predictive power of genomic signatures. We exploit clinical data and four types of genomic measurement, including mRNA gene expression, DNA methylation, microRNA and copy number alterations, for breast invasive carcinoma, glioblastoma multiforme, acute myeloid leukemia and lung squamous cell carcinoma collected by TCGA. To accommodate the high dimensionality, we extract important features using Principal Component Analysis, Partial Least Squares and Least Absolute Shrinkage and Selection Operator (Lasso), which are representative of dimension reduction and variable selection techniques and have been extensively adopted, and fit Cox survival models with combined important features. We calibrate the predictive power of each type of genomic measurement for the prognosis of four cancer types and find that the results vary across cancers. Our analysis also suggests that for most of the cancers in our study and the adopted methods, there is no substantial improvement in prediction when adding other genomic measurements after gene expression and clinical covariates have been included in the model. This is consistent with the findings that molecular features measured at the transcription level affect clinical outcomes more directly than those measured at the DNA/epigenetic level.

KW - Cancer prognosis

KW - Multidimensional genomic study

KW - Prediction

KW - The cancer genome atlas (TCGA)

UR - http://www.scopus.com/inward/record.url?scp=84925442200&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84925442200&partnerID=8YFLogxK

U2 - 10.1093/bib/bbu003

DO - 10.1093/bib/bbu003

M3 - Article

VL - 16

SP - 291

EP - 303

JO - Briefings in Bioinformatics

JF - Briefings in Bioinformatics

SN - 1467-5463

IS - 2

M1 - bbu003

ER -