RCRdiff: A fully integrated Bayesian method for differential expression analysis using raw NanoString nCounter data

Can Xu, Xinlei Wang, Johan Lim, Guanghua Xiao, Yang Xie

Research output: Contribution to journal › Article › peer-review

Abstract

The medium-throughput mRNA abundance platform NanoString nCounter has gained great popularity in the past decade, owing to its high sensitivity and technical reproducibility as well as its remarkable applicability to ubiquitous formalin-fixed paraffin-embedded (FFPE) tissue samples. Building on RCRnorm, developed for normalizing NanoString nCounter data, and the Bayesian LASSO for variable selection, we propose a fully integrated Bayesian method, called RCRdiff, to detect differentially expressed (DE) genes between different groups of tissue samples (e.g., normal and cancer). Unlike existing methods, which often require normalization to be performed beforehand, RCRdiff directly handles raw read counts and jointly models the behaviors of different types of internal controls along with DE and non-DE gene patterns. Doing so avoids the efficiency loss caused by ignoring estimation uncertainty from the normalization step in a sequential approach and can thus offer more reliable statistical inference. We also propose clustering-based strategies for DE gene selection, which do not require any external dataset and are free of any arbitrary cutoff. The advantages of RCRdiff are demonstrated empirically via extensive simulation and data examples.

Original language: English (US)
Journal: Statistics in Medicine
State: Accepted/In press - 2021

Keywords

  • Bayesian LASSO
  • FFPE
  • gene expression
  • gene selection
  • normalization
  • random-coefficient regression

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability
