HISAT: A fast spliced aligner with low memory requirements

Daehwan Kim, Ben Langmead, Steven L. Salzberg

Research output: Contribution to journal › Article

2214 Citations (Scopus)

Abstract

HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ~64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.
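The two-level indexing idea in the abstract can be illustrated with a toy example. The sketch below is not HISAT's implementation: it builds a naive FM index (suffix sorting plus full occurrence tables, with none of the compression a real aligner needs), uses one index over the whole sequence to anchor the first part of a read, and then looks up the rest of the read in a small local index around the anchor. The names `FMIndex`, `build_local_indexes`, and `align_spliced`, the toy genome, and the `window` and `overlap` values are all assumptions made for illustration.

```python
class FMIndex:
    """Toy FM index: BWT plus naive occurrence tables (not memory efficient)."""

    def __init__(self, text, offset=0):
        self.offset = offset              # start of this text within the full genome
        text = text + "$"                 # terminator, sorts before A/C/G/T
        self.sa = sorted(range(len(text)), key=lambda i: text[i:])
        bwt = "".join(text[i - 1] for i in self.sa)
        alphabet = sorted(set(text))
        # C[c] = number of characters in the text strictly smaller than c
        self.C, total = {}, 0
        for c in alphabet:
            self.C[c] = total
            total += text.count(c)
        # occ[c][i] = number of occurrences of c in bwt[:i]
        self.occ = {c: [0] for c in alphabet}
        for ch in bwt:
            for c in alphabet:
                self.occ[c].append(self.occ[c][-1] + (ch == c))

    def find(self, pattern):
        """Return sorted genome positions of exact matches (backward search)."""
        lo, hi = 0, len(self.sa)
        for c in reversed(pattern):
            if c not in self.C:
                return []
            lo = self.C[c] + self.occ[c][lo]
            hi = self.C[c] + self.occ[c][hi]
            if lo >= hi:
                return []
        return sorted(self.sa[i] + self.offset for i in range(lo, hi))


def build_local_indexes(genome, window, overlap):
    """Split the genome into overlapping windows, one small FM index each."""
    step = window - overlap
    return {start: FMIndex(genome[start:start + window], offset=start)
            for start in range(0, len(genome), step)}


def align_spliced(read, anchor_len, global_idx, local_idxs, window, overlap):
    """Anchor read[:anchor_len] globally, then find the rest in a local index."""
    step = window - overlap
    anchor, rest = read[:anchor_len], read[anchor_len:]
    hits = []
    for pos in global_idx.find(anchor):
        anchor_end = pos + anchor_len
        local = local_idxs.get((anchor_end // step) * step)  # window holding the anchor end
        if local is None:
            continue
        for rest_pos in local.find(rest):
            if rest_pos >= anchor_end:    # remainder must lie downstream of the anchor
                hits.append((pos, rest_pos))
    return hits


# Hypothetical toy genome: exon, intron, exon, trailing sequence.
genome = "ACGTTGCA" + "GTAAGTCCCTTTAG" + "GATTACAC" + "TTGG"
window, overlap = 32, 8                   # illustrative sizes only
global_idx = FMIndex(genome)
local_idxs = build_local_indexes(genome, window, overlap)

read = "TTGCA" + "GATTA"                  # spans the exon-exon junction
hits = align_spliced(read, 5, global_idx, local_idxs, window, overlap)
print(hits)
```

Each hit pairs the anchor position with the position of the remaining read segment, so the gap between the end of the anchor and the second position corresponds to the skipped intron. Searching the short remainder in a small local index rather than the whole genome is what keeps the extension step cheap in this sketch; a short segment that would match many places genome-wide tends to match only a few within one window.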

Original language: English (US)
Pages (from-to): 357-360
Number of pages: 4
Journal: Nature Methods
Volume: 12
Issue number: 4
DOI: 10.1038/nmeth.3317
State: Published - Jan 1 2015

Fingerprint

RNA Sequence Analysis
Genome Size
Human Genome
Genome
Data storage equipment
Genes
Datasets
Anchors
RNA

ASJC Scopus subject areas

  • Biotechnology
  • Biochemistry
  • Molecular Biology
  • Cell Biology

Cite this

HISAT: A fast spliced aligner with low memory requirements. / Kim, Daehwan; Langmead, Ben; Salzberg, Steven L.

In: Nature Methods, Vol. 12, No. 4, 01.01.2015, p. 357-360.

Kim, Daehwan; Langmead, Ben; Salzberg, Steven L. / HISAT: A fast spliced aligner with low memory requirements. In: Nature Methods. 2015; Vol. 12, No. 4. pp. 357-360.

@article{4aa15645e82b4194a4c68d3ac7753249,
title = "HISAT: A fast spliced aligner with low memory requirements",
abstract = "HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ~64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.",
author = "Kim, Daehwan and Langmead, Ben and Salzberg, {Steven L.}",
year = "2015",
month = "1",
day = "1",
doi = "10.1038/nmeth.3317",
language = "English (US)",
volume = "12",
pages = "357--360",
journal = "Nature Methods",
issn = "1548-7091",
publisher = "Nature Publishing Group",
number = "4",
}

TY - JOUR

T1 - HISAT

T2 - A fast spliced aligner with low memory requirements

AU - Kim, Daehwan

AU - Langmead, Ben

AU - Salzberg, Steven L.

PY - 2015/1/1

Y1 - 2015/1/1

N2 - HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ~64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.

AB - HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ~64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.

UR - http://www.scopus.com/inward/record.url?scp=84926519013&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84926519013&partnerID=8YFLogxK

U2 - 10.1038/nmeth.3317

DO - 10.1038/nmeth.3317

M3 - Article

C2 - 25751142

AN - SCOPUS:84926519013

VL - 12

SP - 357

EP - 360

JO - Nature Methods

JF - Nature Methods

SN - 1548-7091

IS - 4

ER -