Exploiting Decoupled OpenCL Work-Items with Data Dependencies on FPGAs: A Case Study

2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW)

Abstract: In the field of high performance heterogeneous computing systems, field programmable gate arrays (FPGAs) have shown great advantages in terms of acceleration and energy efficiency. With the inclusion of the OpenCL framework for parallel programming, the design complexity has also been greatly reduced. However, the parallel implementation of applications containing data-dependent branches usually experiences an important loss in performance, which affects all platforms alike. This data dependency leads the execution of parallel threads, also called work-items in OpenCL, to diverge. Whereas fixed architectures like CPU, GPU and Xeon Phi cannot efficiently cope with this divergent execution, the flexibility offered by FPGAs in terms of architecture can be exploited to tackle this problem. In this work, we present a new approach for FPGA implementations that decouples the parallel OpenCL work-items, avoiding the interference of data-dependent branches between them. We also demonstrate the necessary workarounds to obtain the maximum performance in a pipelined design, when unpredictable for-loop exit conditions are caused by the data dependency. Furthermore, we show how to efficiently interleave computation with transfers to device global memory in each work-item. This approach is then evaluated with a real-life case study from Finance, with four different configurations implemented on FPGA with Xilinx SDAccel, and compared to the optimized implementation on CPU, GPU, and Xeon Phi. Our results show that FPGAs can deliver up to 5.5x speedup, whereas the system-level energy efficiency increases between 2x and 9.5x in all cases.

Author(s): Varela, JA; Wehn, N; Liang, Q; Tang, SY

Pages: 124-131 Published: 2017

DOI: 10.1109/IPDPSW.2017.34
