Ultrafast convolution/superposition using tabulated and exponential kernels on GPU

Quan Chen, Mingli Chen, Weiguo Lu

Research output: Contribution to journal (Article)

27 Citations (Scopus)

Abstract

Purpose: Collapsed-cone convolution/superposition (CCCS) is the workhorse for IMRT dose calculation. The authors present a novel algorithm for computing CCCS dose on the modern graphics processing unit (GPU). Methods: The GPU algorithm includes a novel TERMA calculation that has no write conflicts and has linear computational complexity. The CCCS algorithm uses either tabulated or exponential cumulative-cumulative kernels (CCKs) as reported in the literature. The authors demonstrate that the use of exponential kernels can reduce the computational complexity by the order of one dimension while achieving excellent accuracy. Special attention is paid to the unique architecture of the GPU, especially the memory access pattern, which increases performance more than tenfold. Results: The tabulated-kernel implementation on the GPU is two to three times faster than other GPU implementations reported in the literature. The CCCS implementation showed significant speedup on the GPU over a single-core CPU: speedups as high as 70 are observed with the tabulated CCK and as high as 90 with the exponential CCK. Conclusions: Overall, the GPU algorithm using the exponential CCK is 1000-3000 times faster than a highly optimized single-threaded CPU implementation using the tabulated CCK, while the dose differences are within 0.5% and 0.5 mm. This ultrafast CCCS algorithm will allow many time-sensitive applications to use accurate dose calculation.
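
The abstract does not spell out why an exponential kernel lowers the complexity, so the following is only a toy illustration of one common reading, not the authors' GPU code. It assumes a single ray through a homogeneous medium with uniform step `dr`, a plain one-term kernel `amp * exp(-a * d)` rather than a cumulative-cumulative kernel, and hypothetical function and parameter names. Under those assumptions, the sum over all upstream sources can be carried forward as a running value, so the work per ray drops from quadratic to linear in the number of voxels.

```python
import math
import random


def ray_dose_direct(terma, a, amp, dr):
    """O(N^2) reference: each voxel sums contributions from every upstream voxel."""
    n = len(terma)
    dose = [0.0] * n
    for j in range(n):
        for i in range(j + 1):
            dose[j] += terma[i] * amp * math.exp(-a * (j - i) * dr)
    return dose


def ray_dose_recursive(terma, a, amp, dr):
    """O(N): attenuate the running sum by one step, then add the local source."""
    decay = math.exp(-a * dr)  # attenuation over one voxel step
    dose = []
    running = 0.0
    for t in terma:
        running = running * decay + amp * t
        dose.append(running)
    return dose


# Hypothetical inputs: random TERMA samples along a 64-voxel ray.
random.seed(0)
terma = [random.random() for _ in range(64)]
direct = ray_dose_direct(terma, a=0.3, amp=1.0, dr=0.25)
recursive = ray_dose_recursive(terma, a=0.3, amp=1.0, dr=0.25)
max_abs_diff = max(abs(d - r) for d, r in zip(direct, recursive))
```

Both functions return the same values up to floating-point rounding. A tabulated kernel has no such recurrence in general, which is why each source and deposition pair would need its own table lookup.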

Original language: English (US)
Pages (from-to): 1150-1161
Number of pages: 12
Journal: Medical Physics
Volume: 38
Issue number: 3
DOIs: 10.1118/1.3551996
State: Published - Mar 2011

Keywords

  • convolution superposition
  • dose calculation
  • exponential kernels
  • GPU
  • tabulated kernels
  • treatment planning

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

Cite this

Ultrafast convolution/superposition using tabulated and exponential kernels on GPU. / Chen, Quan; Chen, Mingli; Lu, Weiguo.

In: Medical Physics, Vol. 38, No. 3, 03.2011, p. 1150-1161.

@article{bb17a0fd94ad40768d3cdc2452505ba5,
title = "Ultrafast convolution/superposition using tabulated and exponential kernels on GPU",
keywords = "convolution superposition, dose calculation, exponential kernels, GPU, tabulated kernels, treatment planning",
author = "Quan Chen and Mingli Chen and Weiguo Lu",
year = "2011",
month = "3",
doi = "10.1118/1.3551996",
language = "English (US)",
volume = "38",
pages = "1150--1161",
journal = "Medical Physics",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "3",

}

TY - JOUR

T1 - Ultrafast convolution/superposition using tabulated and exponential kernels on GPU

AU - Chen, Quan

AU - Chen, Mingli

AU - Lu, Weiguo

PY - 2011/3

Y1 - 2011/3

KW - convolution superposition

KW - dose calculation

KW - exponential kernels

KW - GPU

KW - tabulated kernels

KW - treatment planning

UR - http://www.scopus.com/inward/record.url?scp=79952124976&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952124976&partnerID=8YFLogxK

U2 - 10.1118/1.3551996

DO - 10.1118/1.3551996

M3 - Article

C2 - 21520827

AN - SCOPUS:79952124976

VL - 38

SP - 1150

EP - 1161

JO - Medical Physics

JF - Medical Physics

SN - 0094-2405

IS - 3

ER -