Immune Repertoire Analysis on High-Performance Computing Using VDJServer V1: A Method by the AIRR Community

on behalf of the AIRR Community

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

AIRR-seq data sets are usually large and require specialized analysis methods and software tools. A typical Illumina MiSeq sequencing run generates 20–30 million 2 × 300 bp paired-end sequence reads, which roughly corresponds to 15 GB of sequence data to be processed. Other platforms like NextSeq, which is useful in projects where the full V gene is not needed, create about 400 million 2 × 150 bp paired-end reads. Because of the size of the data sets, the analysis can be computationally expensive, particularly the early analysis steps like preprocessing and gene annotation that process the majority of the sequence data. A standard desktop PC may take 3–5 days of constant processing for a single MiSeq run, so dedicated high-performance computational resources may be required. VDJServer provides free access to high-performance computing (HPC) at the Texas Advanced Computing Center (TACC) through a graphical user interface (Christley et al. Front Immunol 9:976, 2018). VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provides access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene assignment, repertoire characterization, and repertoire comparison. Furthermore, VDJServer has parallelized execution for tools such as IgBLAST, so more compute resources are utilized as the size of the input data grows. Analysis that takes days on a desktop PC might take only a few hours on VDJServer. VDJServer is a free, publicly available, and open-source licensed resource. Here, we describe the workflow for performing immune repertoire analysis on VDJServer’s high-performance computing.
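The data-volume figures in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not part of VDJServer; it assumes the quoted read counts are read pairs, that each base takes one byte (sequence only, ignoring FASTQ headers and quality strings), and that 1 GB is 10^9 bytes. The function names are hypothetical.

```python
def total_bases(read_pairs: int, read_length: int) -> int:
    """Bases in a paired-end run: two reads per pair, read_length bases each."""
    return read_pairs * 2 * read_length


def approx_gigabytes(bases: int) -> float:
    """Assumption: one byte per base (sequence only) and 1 GB = 10**9 bytes."""
    return bases / 1e9


# MiSeq: 20-30 million pairs of 2 x 300 bp reads (figures from the abstract)
miseq_low = approx_gigabytes(total_bases(20_000_000, 300))
miseq_high = approx_gigabytes(total_bases(30_000_000, 300))

# NextSeq: about 400 million pairs of 2 x 150 bp reads (figure from the abstract)
nextseq = approx_gigabytes(total_bases(400_000_000, 150))

print(f"MiSeq:   {miseq_low:.0f}-{miseq_high:.0f} GB of sequence")
print(f"NextSeq: {nextseq:.0f} GB of sequence")
```

Under these assumptions a MiSeq run carries about 12 to 18 GB of sequence, which is consistent with the roughly 15 GB quoted above, and a NextSeq run carries several times more. Actual FASTQ files are larger because they also store quality scores and read headers.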

Original language: English (US)
Title of host publication: Methods in Molecular Biology
Publisher: Humana Press Inc.
Pages: 439-446
Number of pages: 8
State: Published - 2022

Publication series

Name: Methods in Molecular Biology
Volume: 2453
ISSN (Print): 1064-3745
ISSN (Electronic): 1940-6029

Keywords

  • AIRR-Seq
  • B-cell receptor
  • Cloud computing
  • High-performance computing
  • T-cell receptor

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
