Identification of potential genomic biomarkers for Sjögren’s syndrome using data pooling of gene expression microarrays

Sadik A. Khuder, Ibtisam Al-Hashimi, Anand B. Mutgi, Nezam Altorok

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Sjögren’s syndrome (SS) is an autoimmune disease characterized by lymphocytic infiltration and destruction of salivary and lacrimal glands. The diagnosis of SS can be challenging because no specific test for the disease exists. This study examines how accurately a gene expression profile can diagnose SS. We identified nine publicly available datasets that included gene expression data from saliva and salivary gland biopsy samples of 52 patients with SS and 51 controls. From these, we pooled three datasets comprising 37 samples from SS patients and 29 from healthy controls, designated the “training set.” We then cross-listed results across a group of independent gene expression datasets from patients with SS to identify a consensus list of differentially expressed genes. We used Linear Discriminant Analysis (LDA) to quantify how accurately the discriminating genes predict SS in both the “training set” and an independent group of datasets designated the “test set.” We identified 55 genes as potential classifier genes to differentiate SS from healthy controls. LDA with leave-one-out cross-validation identified 19 genes (EPSTI1, IFI44, IFI44L, IFIT1, IFIT2, IFIT3, MX1, OAS1, SAMD9L, PSMB9, STAT1, HERC5, EVI2B, CD53, SELL, HLA-DQA1, PTPRC, B2M, and TAP2) with the highest classification accuracy (95.7%). We validated these results by applying the same gene expression profile as a discriminatory test in the “test set,” which comprised salivary gland samples from 15 patients with SS and 22 controls, with 94.6% accuracy. We propose that the gene expression profile in saliva or salivary glands could be a promising, simple, and reproducible diagnostic biomarker for SS.
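
The abstract's classification step, LDA scored by leave-one-out cross-validation, can be illustrated with a minimal sketch. This is not the authors' code and uses no data from the study: the expression values are synthetic, the three features (`N_GENES`) and their effect size are hypothetical, and only the group sizes (37 cases, 29 controls) are taken from the training set described above. All function names are illustrative.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups, means):
    """Within-class covariance shared by both classes (the LDA assumption)."""
    p = len(means[0])
    cov = [[0.0] * p for _ in range(p)]
    for rows, mu in zip(groups, means):
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(p):
                for j in range(p):
                    cov[i][j] += d[i] * d[j]
    dof = sum(len(rows) for rows in groups) - len(groups)
    return [[v / dof for v in row] for row in cov]


def solve(a, b):
    """Solve the linear system a * w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][k] * w[k] for k in range(r + 1, n))) / m[r][r]
    return w


def fit_lda(X, y):
    """Two-class LDA: weights w = S^-1 (mu1 - mu0), bias includes the log prior ratio."""
    groups = [[x for x, label in zip(X, y) if label == c] for c in (0, 1)]
    means = [mean_vector(g) for g in groups]
    diff = [m1 - m0 for m0, m1 in zip(*means)]
    w = solve(pooled_covariance(groups, means), diff)
    midpoint = [(m0 + m1) / 2 for m0, m1 in zip(*means)]
    bias = -dot(w, midpoint) + math.log(len(groups[1]) / len(groups[0]))
    return w, bias


def predict(model, x):
    w, bias = model
    return 1 if dot(w, x) + bias > 0 else 0


def loocv_accuracy(X, y):
    """Hold out each sample once, refit on the rest, and score the held-out sample."""
    hits = 0
    for i in range(len(X)):
        model = fit_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


# Synthetic stand-in data (assumption): 29 controls, 37 cases, 3 hypothetical genes.
random.seed(0)
N_GENES = 3
controls = [[random.gauss(0.0, 1.0) for _ in range(N_GENES)] for _ in range(29)]
cases = [[random.gauss(2.0, 1.0) for _ in range(N_GENES)] for _ in range(37)]
X = controls + cases
y = [0] * 29 + [1] * 37
acc = loocv_accuracy(X, y)
print(f"LOOCV accuracy on synthetic data: {acc:.3f}")
```

In a real analysis, any gene selection would generally need to be repeated inside each cross-validation fold; selecting genes on the full dataset first tends to make the cross-validated accuracy look better than it is.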

Original language: English (US)
Pages (from-to): 829-836
Number of pages: 8
Journal: Rheumatology International
Volume: 35
Issue number: 5
DOI: 10.1007/s00296-014-3152-6
State: Published - May 1 2015

Fingerprint

Meta-Analysis
Biomarkers
Gene Expression
Salivary Glands
Transcriptome
Genes
Discriminant Analysis
Saliva
Lacrimal Apparatus
Autoimmune Diseases
Datasets
Biopsy

Keywords

  • Biomarkers
  • Gene expression
  • Sjogren’s syndrome

ASJC Scopus subject areas

  • Immunology and Allergy
  • Rheumatology
  • Immunology

Cite this

Identification of potential genomic biomarkers for Sjögren’s syndrome using data pooling of gene expression microarrays. / Khuder, Sadik A.; Al-Hashimi, Ibtisam; Mutgi, Anand B.; Altorok, Nezam.

In: Rheumatology International, Vol. 35, No. 5, 01.05.2015, p. 829-836.

Khuder, Sadik A. ; Al-Hashimi, Ibtisam ; Mutgi, Anand B. ; Altorok, Nezam. / Identification of potential genomic biomarkers for Sjögren’s syndrome using data pooling of gene expression microarrays. In: Rheumatology International. 2015 ; Vol. 35, No. 5. pp. 829-836.
@article{028c31005d404ca1bcd36a759c057266,
title = "Identification of potential genomic biomarkers for Sj{\"o}gren’s syndrome using data pooling of gene expression microarrays",
keywords = "Biomarkers, Gene expression, Sjogren’s syndrome",
author = "Khuder, {Sadik A.} and Ibtisam Al-Hashimi and Mutgi, {Anand B.} and Nezam Altorok",
year = "2015",
month = "5",
day = "1",
doi = "10.1007/s00296-014-3152-6",
language = "English (US)",
volume = "35",
pages = "829--836",
journal = "Rheumatology International",
issn = "0172-8172",
publisher = "Springer Verlag",
number = "5",

}

TY - JOUR

T1 - Identification of potential genomic biomarkers for Sjögren’s syndrome using data pooling of gene expression microarrays

AU - Khuder, Sadik A.

AU - Al-Hashimi, Ibtisam

AU - Mutgi, Anand B.

AU - Altorok, Nezam

PY - 2015/5/1

Y1 - 2015/5/1

KW - Biomarkers

KW - Gene expression

KW - Sjogren’s syndrome

UR - http://www.scopus.com/inward/record.url?scp=84939953131&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84939953131&partnerID=8YFLogxK

U2 - 10.1007/s00296-014-3152-6

DO - 10.1007/s00296-014-3152-6

M3 - Article

C2 - 25327574

AN - SCOPUS:84939953131

VL - 35

SP - 829

EP - 836

JO - Rheumatology International

JF - Rheumatology International

SN - 0172-8172

IS - 5

ER -