OpenCL

Programming FPGAs follows a heterogeneous approach involving code development for both the host and kernel. Within XRT, the host code can be written in C, C++, or Python, and it interacts with kernels using the XRT library functions or OpenCL. This workshop will use the OpenCL library and C++ to write the host program.

OpenCL (Open Computing Language) is a framework for programming heterogeneous computing systems, allowing developers to write code that can execute across different types of processing units, such as CPUs, GPUs, and FPGAs. OpenCL and HLS enable the development of FPGA kernels using a C-based language, allowing for a more accessible and portable approach to harnessing the performance benefits of FPGA acceleration in a wide range of applications, from scientific computing to machine learning.

OpenCL concepts

Platform and devices

The OpenCL programming model is structured around a platform, which encapsulates the hardware and software environment, and devices, the specific compute units available within that platform. These devices span a spectrum of hardware types, such as Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs). This inherent versatility makes OpenCL a powerful tool for developers harnessing parallel computing across diverse architectures. The platform-device paradigm allows for the creation of applications that can seamlessly adapt to and leverage the unique capabilities of various computing devices. The following image illustrates the OpenCL platform-device paradigm.

[Figure: the OpenCL platform-device model]
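In host code, the platform-device hierarchy is discovered at runtime. The following sketch enumerates every platform and the devices it exposes, assuming the OpenCL C++ bindings header (CL/cl2.hpp) and an installed OpenCL runtime; the names printed will vary per system (an XRT installation typically reports a platform named "Xilinx").

```cpp
// Sketch: enumerating OpenCL platforms and their devices.
// Assumes CL/cl2.hpp and an OpenCL runtime are available.
#define CL_HPP_TARGET_OPENCL_VERSION 120
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#include <CL/cl2.hpp>
#include <iostream>
#include <vector>

int main() {
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);  // query all available platforms

    for (const auto& platform : platforms) {
        std::cout << "Platform: "
                  << platform.getInfo<CL_PLATFORM_NAME>() << "\n";

        std::vector<cl::Device> devices;
        platform.getDevices(CL_DEVICE_TYPE_ALL, &devices);
        for (const auto& device : devices) {
            std::cout << "  Device: "
                      << device.getInfo<CL_DEVICE_NAME>() << "\n";
        }
    }
    return 0;
}
```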

Context and queues

In the OpenCL programming model, a context is a crucial abstraction that serves as a container for various resources, including devices, memory objects, and program objects. The context defines the execution environment for kernels, serving as the environment in which computation tasks are coordinated and managed. It ensures that data and operations are correctly synchronized among different devices within the context.

In tandem with the context, OpenCL command queues provide a mechanism for submitting and managing commands for execution on devices associated with the context. A queue represents a sequence of commands that are scheduled to be processed by a specific device within the context. Commands include tasks such as memory transfers, kernel executions, and synchronization points. Using queues, developers can orchestrate the execution flow of commands, enabling efficient parallel processing across multiple devices. The context and queues collectively form the backbone of the OpenCL runtime system, facilitating the seamless coordination of computations across diverse hardware resources.
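A typical XRT host program creates one context and one command queue for the target FPGA. The sketch below shows this setup under the assumption that the platform is named "Xilinx" and the FPGA appears as an accelerator device; error handling is abbreviated.

```cpp
// Sketch: creating a context and command queue for an FPGA device.
// Platform name "Xilinx" and the device choice are illustrative.
#define CL_HPP_TARGET_OPENCL_VERSION 120
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#include <CL/cl2.hpp>
#include <vector>

int main() {
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);

    // Pick the Xilinx platform (fall back to the first one found).
    cl::Platform platform = platforms.front();
    for (auto& p : platforms)
        if (p.getInfo<CL_PLATFORM_NAME>() == "Xilinx") platform = p;

    // FPGAs are exposed as accelerator devices.
    std::vector<cl::Device> devices;
    platform.getDevices(CL_DEVICE_TYPE_ACCELERATOR, &devices);
    cl::Device device = devices.front();

    // The context owns devices, memory objects, and programs; the queue
    // schedules commands (transfers, kernel launches) on one device.
    cl::Context context(device);
    cl::CommandQueue queue(context, device, CL_QUEUE_PROFILING_ENABLE);
    return 0;
}
```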

Kernel

In OpenCL, a kernel is a fundamental unit of computation, representing a parallelizable task designed to run on OpenCL devices. In the context of GPUs, a kernel is a code segment executed by a processing element, defining the parallelized task to be performed. In FPGA development with XRT, a kernel represents the functionality of a compute unit synthesized on the FPGA fabric. Unlike traditional methods requiring separate design of compute units and peripherals/memory communication units, XRT simplifies this by providing interfaces to peripherals and memory. Host code adjustments enable the instantiation of multiple compute units with a single kernel or the use of different kernels representing diverse computational units within a unified host program.
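On FPGAs, the kernel is not compiled from OpenCL C source at runtime as on CPUs and GPUs; instead, the host loads a precompiled FPGA binary (.xclbin) and looks the kernel up by name. A minimal sketch of that step follows, where the file path and kernel name ("vadd") are hypothetical placeholders for whatever your build produces.

```cpp
// Sketch: loading a compiled FPGA binary (.xclbin) and creating a kernel.
// "vadd.xclbin" and the kernel name "vadd" are illustrative placeholders.
#define CL_HPP_TARGET_OPENCL_VERSION 120
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#include <CL/cl2.hpp>
#include <fstream>
#include <vector>

cl::Kernel load_kernel(const cl::Context& context, const cl::Device& device) {
    // Read the whole xclbin file into memory.
    std::ifstream bin_file("vadd.xclbin", std::ios::binary | std::ios::ate);
    std::streamsize size = bin_file.tellg();
    bin_file.seekg(0, std::ios::beg);
    std::vector<unsigned char> binary(size);
    bin_file.read(reinterpret_cast<char*>(binary.data()), size);

    // For FPGAs the program is created from the precompiled binary,
    // not built from OpenCL C source.
    cl::Program::Binaries bins{binary};
    cl::Program program(context, {device}, bins);

    return cl::Kernel(program, "vadd");
}
```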

Buffers

Last but not least, buffers are essential for managing data transfers between the host and compute devices. Buffers serve as containers for the input and output data that kernels use during computation. These memory spaces are allocated and managed by the host, providing a means for efficient data sharing between the host and the device. The host can write data to buffers, which are then passed to the device for processing by kernels. Once the computation is complete, the results are transferred back to the host through these buffers. Efficient buffer management is crucial for optimizing data movement and overall performance in heterogeneous computing environments.
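The full host-side data flow can be sketched as follows. It assumes a context, queue, and kernel have already been created as above, and that the kernel takes two input buffers, one output buffer, and an element count; the sizes and argument order are illustrative.

```cpp
// Sketch: moving data between host and device through OpenCL buffers.
// Assumes `context`, `queue`, and `kernel` already exist; sizes and
// argument indices are illustrative.
#define CL_HPP_TARGET_OPENCL_VERSION 120
#define CL_HPP_MINIMUM_OPENCL_VERSION 120
#include <CL/cl2.hpp>
#include <vector>

void run(cl::Context& context, cl::CommandQueue& queue, cl::Kernel& kernel) {
    const size_t N = 1024;
    std::vector<int> a(N, 1), b(N, 2), out(N, 0);
    const size_t bytes = N * sizeof(int);

    // CL_MEM_USE_HOST_PTR lets the runtime work from the host allocations.
    cl::Buffer buf_a(context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                     bytes, a.data());
    cl::Buffer buf_b(context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                     bytes, b.data());
    cl::Buffer buf_out(context, CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR,
                       bytes, out.data());

    kernel.setArg(0, buf_a);
    kernel.setArg(1, buf_b);
    kernel.setArg(2, buf_out);
    kernel.setArg(3, static_cast<int>(N));

    // Host -> device transfer, kernel launch, device -> host transfer.
    queue.enqueueMigrateMemObjects({buf_a, buf_b}, 0 /* to device */);
    queue.enqueueTask(kernel);
    queue.enqueueMigrateMemObjects({buf_out}, CL_MIGRATE_MEM_OBJECT_HOST);
    queue.finish();  // block until the results are back in `out`
}
```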