A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications

Xue Li, Xinlei Wang, Guanghua Xiao

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. In previous research, attention has been focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and limited resources for follow-up investigation. To be able to handle such lists, some ad hoc adjustments have been suggested in the past, but how RA methods perform on them (after the adjustments) has never been fully evaluated. In this article, a systematic framework is proposed to define different situations that may occur based on the nature of individually ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods that are suitable for genomic applications under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines about which methods perform the best/worst, and under what conditions, are provided. Also, we discuss key factors that substantially affect the performance of the different methods.
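As a rough illustration of what combining ranked lists of unequal length can look like, the sketch below aggregates lists by mean rank (a Borda-style rule). It is not necessarily one of the methods evaluated in the article. The function name, the gene lists, and the handling of missing items (each absent item receives rank `len(list) + 1`) are assumptions chosen for illustration, the last being just one possible ad hoc adjustment for partial and top ranked lists.

```python
from collections import defaultdict


def aggregate_mean_rank(ranked_lists):
    """Combine ranked lists (best item first) into one ranking by mean rank.

    Assumption: an item missing from a list gets rank len(list) + 1,
    i.e. it is treated as tied just below that list's last entry.
    """
    # Union of every item seen in any list.
    items = {item for lst in ranked_lists for item in lst}
    totals = defaultdict(float)
    for lst in ranked_lists:
        position = {item: i + 1 for i, item in enumerate(lst)}
        missing_rank = len(lst) + 1
        for item in items:
            totals[item] += position.get(item, missing_rank)
    n = len(ranked_lists)
    mean_rank = {item: totals[item] / n for item in items}
    # Sort by mean rank; break ties alphabetically so the result is deterministic.
    return sorted(items, key=lambda it: (mean_rank[it], it))


# Hypothetical top-ranked gene lists from three studies (illustrative only).
studies = [
    ["TP53", "EGFR", "KRAS"],
    ["EGFR", "TP53"],
    ["EGFR", "KRAS", "ALK", "TP53"],
]
consensus = aggregate_mean_rank(studies)
print(consensus)
```

Changing the rank assigned to missing items can change the consensus ordering, which is one reason the handling of partial and top ranked lists deserves explicit evaluation.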

Original language: English (US)
Pages (from-to): 178-189
Number of pages: 12
Journal: Briefings in Bioinformatics
Volume: 20
Issue number: 1
DOI: 10.1093/bib/bbx101
State: Published - Jan 18 2019

Fingerprint

Agglomeration
Cells
Non-Small Cell Lung Carcinoma
Guidelines
Research

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology

Cite this

A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications. / Li, Xue; Wang, Xinlei; Xiao, Guanghua.

In: Briefings in Bioinformatics, Vol. 20, No. 1, 18.01.2019, p. 178-189.

@article{3761827a1511438d82641f1c3b4c0b40,
title = "A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications",
abstract = "Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. In previous research, attention has been focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and limited resources for follow-up investigation. To be able to handle such lists, some ad hoc adjustments have been suggested in the past, but how RA methods perform on them (after the adjustments) has never been fully evaluated. In this article, a systematic framework is proposed to define different situations that may occur based on the nature of individually ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods that are suitable for genomic applications under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines about which methods perform the best/worst, and under what conditions, are provided. Also, we discuss key factors that substantially affect the performance of the different methods.",
author = "Xue Li and Xinlei Wang and Guanghua Xiao",
year = "2019",
month = "1",
day = "18",
doi = "10.1093/bib/bbx101",
language = "English (US)",
volume = "20",
pages = "178--189",
journal = "Briefings in Bioinformatics",
issn = "1467-5463",
publisher = "Oxford University Press",
number = "1",
}

TY - JOUR

T1 - A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications

AU - Li, Xue

AU - Wang, Xinlei

AU - Xiao, Guanghua

PY - 2019/1/18

Y1 - 2019/1/18

N2 - Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. In previous research, attention has been focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and limited resources for follow-up investigation. To be able to handle such lists, some ad hoc adjustments have been suggested in the past, but how RA methods perform on them (after the adjustments) has never been fully evaluated. In this article, a systematic framework is proposed to define different situations that may occur based on the nature of individually ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods that are suitable for genomic applications under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines about which methods perform the best/worst, and under what conditions, are provided. Also, we discuss key factors that substantially affect the performance of the different methods.

UR - http://www.scopus.com/inward/record.url?scp=85060135194&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060135194&partnerID=8YFLogxK

U2 - 10.1093/bib/bbx101

DO - 10.1093/bib/bbx101

M3 - Article

VL - 20

SP - 178

EP - 189

JO - Briefings in Bioinformatics

T2 - Briefings in Bioinformatics

JF - Briefings in Bioinformatics

SN - 1467-5463

IS - 1

ER -