A high-performance accelerator for floating-point matrix multiplication

Xun Jia, Guiming Wu, Xianghui Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Matrix multiplication is a widely used routine in science and engineering applications. Accelerating this routine is important because applications with large-scale matrix multiplication are increasingly common, especially in high-performance computing (HPC). However, existing computing platforms, including CPUs, GPGPUs and FPGAs, deliver unsatisfactory performance or efficiency for this routine. In this paper, we propose a high-performance accelerator for double-precision floating-point matrix multiplication and build a performance model for design space exploration based on memory access scheduling. The impact of architecture parameters on accelerator performance and efficiency is evaluated and analyzed. Experimental results show that our proposed accelerator with 256 processing elements (PEs) can achieve a maximum performance of 767.99 GFLOPS and an efficiency of 99.99% for large-scale matrix multiplication, which makes it well suited to the requirements of HPC applications.
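The relation between the reported performance and efficiency figures can be illustrated with a short calculation. The following is a minimal sketch, assuming that efficiency means achieved performance divided by theoretical peak performance. The abstract does not state this definition, and the function names are hypothetical. The per-PE clock frequency and the number of operations per cycle are not given in the abstract, so `peak_gflops` takes them as caller-supplied assumptions.

```python
# Sketch only: relates the abstract's reported figures under the ASSUMPTION
# that efficiency = achieved performance / theoretical peak performance.

REPORTED_GFLOPS = 767.99      # from the abstract
REPORTED_EFFICIENCY = 0.9999  # 99.99%, from the abstract
NUM_PES = 256                 # from the abstract


def implied_peak_gflops(achieved_gflops: float, efficiency: float) -> float:
    """Peak performance implied by an achieved figure and an efficiency ratio."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return achieved_gflops / efficiency


def peak_gflops(num_pes: int, flops_per_pe_per_cycle: float, clock_ghz: float) -> float:
    """Theoretical peak for a PE array.

    flops_per_pe_per_cycle and clock_ghz are hypothetical parameters;
    the abstract does not report either value.
    """
    return num_pes * flops_per_pe_per_cycle * clock_ghz


peak = implied_peak_gflops(REPORTED_GFLOPS, REPORTED_EFFICIENCY)
per_pe = peak / NUM_PES

print(f"implied peak: {peak:.2f} GFLOPS")
print(f"implied per-PE throughput: {per_pe:.3f} GFLOPS")
```

Under this assumed definition, the reported figures imply a peak of roughly 768 GFLOPS, or about 3 GFLOPS per PE. One hypothetical configuration that reaches this peak is `peak_gflops(256, 2, 1.5)`, that is, one fused multiply-add per cycle at 1.5 GHz, but the abstract does not confirm either value.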

Original language: English (US)
Title of host publication: Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 396-402
Number of pages: 7
ISBN (Electronic): 9781538637906
DOI: https://doi.org/10.1109/ISPA/IUCC.2017.00063
State: Published - May 25 2018
Event: 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017 - Guangzhou, China
Duration: Dec 12 2017 → Dec 15 2017

Keywords

  • Accelerator
  • Architecture
  • High-performance
  • Linear array
  • Matrix multiplication

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems
  • Control and Optimization
  • Computer Networks and Communications

  • Cite this

    Jia, X., Wu, G., & Xie, X. (2018). A high-performance accelerator for floating-point matrix multiplication. In Proceedings - 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 16th IEEE International Conference on Ubiquitous Computing and Communications, ISPA/IUCC 2017 (pp. 396-402). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISPA/IUCC.2017.00063