A heterogeneous label propagation algorithm for disease gene discovery

TaeHyun Hwang, Rui Kuang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

Label propagation is an effective and efficient technique for exploiting local and global features of a network in semi-supervised learning. One open challenge in the literature is how to propagate information in heterogeneous networks comprising several subnetworks, each of which has its own cluster structure that needs to be explored independently. In this paper, we introduce an intuitive algorithm, MINProp (Mutual Interaction-based Network Propagation), and a simple regularization framework for propagating information between subnetworks in a heterogeneous network. MINProp sequentially performs label propagation on each individual subnetwork, using the current label information derived from the other subnetworks, and repeats this step until it converges to the global optimum of the framework's convex objective function. The independent label propagation on each subnetwork explores the cluster structure in that subnetwork. The label information from the other subnetworks is used to capture mutual interactions (bicluster structures) between the vertices in each pair of subnetworks. The MINProp algorithm is applied to disease gene discovery from a heterogeneous network of disease phenotypes and genes. In the experiments, MINProp significantly outperformed the original label propagation algorithm on a single network and the state-of-the-art methods for discovering disease genes. The results also suggest that MINProp is more effective in utilizing the modular structures in a heterogeneous network. Finally, MINProp discovered new disease-gene associations that were reported only recently.
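The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact formulation: the update rule f <- alpha*S*f + (1-alpha)*y, the mixing weights `alpha` and `beta`, the normalization, the function names, and the toy gene and disease networks are all assumptions made for illustration.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def normalize(W):
    """Scale W[i][j] by 1/sqrt(rowsum_i * colsum_j).
    For a symmetric W this equals D^{-1/2} W D^{-1/2}.
    Assumes non-negative weights."""
    rows = [sum(r) for r in W]
    cols = [sum(c) for c in zip(*W)]
    return [[w / math.sqrt(rows[i] * cols[j]) if w else 0.0
             for j, w in enumerate(r)] for i, r in enumerate(W)]

def propagate(S, y, alpha=0.5, tol=1e-9, max_iter=1000):
    """Single-network label propagation: iterate f <- alpha*S*f + (1-alpha)*y."""
    f = list(y)
    for _ in range(max_iter):
        Sf = matvec(S, f)
        new = [alpha * s + (1 - alpha) * yi for s, yi in zip(Sf, y)]
        if max(abs(a - b) for a, b in zip(new, f)) < tol:
            return new
        f = new
    return f

def alternate(S_a, S_b, S_ab, y_a, y_b, alpha=0.5, beta=0.5,
              tol=1e-9, max_rounds=1000):
    """Hypothetical two-subnetwork variant: propagate on A using labels
    pulled from B, then on B using labels pulled from A, until stable."""
    S_ba = [list(c) for c in zip(*S_ab)]  # transpose of the bipartite block
    f_a, f_b = list(y_a), list(y_b)
    for _ in range(max_rounds):
        # Pseudo-labels for A: own seeds mixed with information from B.
        cross_a = matvec(S_ab, f_b)
        new_a = propagate(
            S_a, [(1 - beta) * y + beta * c for y, c in zip(y_a, cross_a)], alpha)
        # Pseudo-labels for B: own seeds mixed with the updated scores on A.
        cross_b = matvec(S_ba, new_a)
        new_b = propagate(
            S_b, [(1 - beta) * y + beta * c for y, c in zip(y_b, cross_b)], alpha)
        delta = max(abs(p - q) for p, q in zip(new_a + new_b, f_a + f_b))
        f_a, f_b = new_a, new_b
        if delta < tol:
            break
    return f_a, f_b

# Toy data (made up): genes form a chain g0-g1-g2, diseases d0-d1 are similar,
# and the only known gene-disease association is g0-d0.
S_genes = normalize([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
S_diseases = normalize([[0, 1], [1, 0]])
S_cross = normalize([[1, 0], [0, 0], [0, 0]])

# Query disease d1; no gene is labeled in advance.
f_genes, f_diseases = alternate(S_genes, S_diseases, S_cross,
                                y_a=[0.0, 0.0, 0.0], y_b=[0.0, 1.0])
```

In this toy setting the query disease d1 has no known gene, so any gene score must flow through the similar disease d0 and its associated gene g0. Genes closer to g0 in the gene network therefore receive higher scores, which is the kind of cross-network ranking the abstract describes.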

Original language: English (US)
Title of host publication: Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010
Pages: 583-594
Number of pages: 12
State: Published - 2010
Event: 10th SIAM International Conference on Data Mining, SDM 2010 - Columbus, OH, United States
Duration: Apr 29 2010 to May 1 2010

Other

Other: 10th SIAM International Conference on Data Mining, SDM 2010
Country: United States
City: Columbus, OH
Period: 4/29/10 to 5/1/10

Fingerprint

Labels
Genes
Heterogeneous networks
Supervised learning
Experiments

Keywords

  • Data integration
  • Disease gene prioritization
  • Label propagation
  • Random walk
  • Semi-supervised learning

ASJC Scopus subject areas

  • Software

Cite this

Hwang, T., & Kuang, R. (2010). A heterogeneous label propagation algorithm for disease gene discovery. In Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010 (pp. 583-594)

A heterogeneous label propagation algorithm for disease gene discovery. / Hwang, TaeHyun; Kuang, Rui.

Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010. 2010. p. 583-594.


Hwang, T & Kuang, R 2010, A heterogeneous label propagation algorithm for disease gene discovery. in Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010. pp. 583-594, 10th SIAM International Conference on Data Mining, SDM 2010, Columbus, OH, United States, 4/29/10.
Hwang T, Kuang R. A heterogeneous label propagation algorithm for disease gene discovery. In Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010. 2010. p. 583-594
Hwang, TaeHyun ; Kuang, Rui. / A heterogeneous label propagation algorithm for disease gene discovery. Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010. 2010. pp. 583-594
@inproceedings{1981c3b3f87a4828984613e65a7040ba,
title = "A heterogeneous label propagation algorithm for disease gene discovery",
abstract = "Label propagation is an effective and efficient technique for exploiting local and global features of a network in semi-supervised learning. One open challenge in the literature is how to propagate information in heterogeneous networks comprising several subnetworks, each of which has its own cluster structure that needs to be explored independently. In this paper, we introduce an intuitive algorithm, MINProp (Mutual Interaction-based Network Propagation), and a simple regularization framework for propagating information between subnetworks in a heterogeneous network. MINProp sequentially performs label propagation on each individual subnetwork, using the current label information derived from the other subnetworks, and repeats this step until it converges to the global optimum of the framework's convex objective function. The independent label propagation on each subnetwork explores the cluster structure in that subnetwork. The label information from the other subnetworks is used to capture mutual interactions (bicluster structures) between the vertices in each pair of subnetworks. The MINProp algorithm is applied to disease gene discovery from a heterogeneous network of disease phenotypes and genes. In the experiments, MINProp significantly outperformed the original label propagation algorithm on a single network and the state-of-the-art methods for discovering disease genes. The results also suggest that MINProp is more effective in utilizing the modular structures in a heterogeneous network. Finally, MINProp discovered new disease-gene associations that were reported only recently.",
keywords = "Data integration, Disease gene prioritization, Label propagation, Random walk, Semi-supervised learning",
author = "TaeHyun Hwang and Rui Kuang",
year = "2010",
language = "English (US)",
pages = "583--594",
booktitle = "Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010",

}

TY - GEN

T1 - A heterogeneous label propagation algorithm for disease gene discovery

AU - Hwang, TaeHyun

AU - Kuang, Rui

PY - 2010

Y1 - 2010

N2 - Label propagation is an effective and efficient technique for exploiting local and global features of a network in semi-supervised learning. One open challenge in the literature is how to propagate information in heterogeneous networks comprising several subnetworks, each of which has its own cluster structure that needs to be explored independently. In this paper, we introduce an intuitive algorithm, MINProp (Mutual Interaction-based Network Propagation), and a simple regularization framework for propagating information between subnetworks in a heterogeneous network. MINProp sequentially performs label propagation on each individual subnetwork, using the current label information derived from the other subnetworks, and repeats this step until it converges to the global optimum of the framework's convex objective function. The independent label propagation on each subnetwork explores the cluster structure in that subnetwork. The label information from the other subnetworks is used to capture mutual interactions (bicluster structures) between the vertices in each pair of subnetworks. The MINProp algorithm is applied to disease gene discovery from a heterogeneous network of disease phenotypes and genes. In the experiments, MINProp significantly outperformed the original label propagation algorithm on a single network and the state-of-the-art methods for discovering disease genes. The results also suggest that MINProp is more effective in utilizing the modular structures in a heterogeneous network. Finally, MINProp discovered new disease-gene associations that were reported only recently.

KW - Data integration

KW - Disease gene prioritization

KW - Label propagation

KW - Random walk

KW - Semi-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=80053439692&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053439692&partnerID=8YFLogxK

M3 - Conference contribution

SP - 583

EP - 594

BT - Proceedings of the 10th SIAM International Conference on Data Mining, SDM 2010

ER -