CLUtils

Library that offers utilities which help setup and manage an OpenCL environment

Published on January 2, 2015

Categories: software, parallel, opencl

Tags: c++, library, utilities, profiling, opencl

Website: https://github.com/nlamprian/CLUtils

CLUtils is a library that aims to allow rapid prototyping of OpenCL applications. It hides away all the boilerplate code necessary for establishing an OpenCL environment. It offers utilities that cover procedures like compiling source files and extracting kernels, setting up OpenCL-OpenGL interoperability, and profiling host and device code.

The simplest case handled by the library is one where a context is created for the first platform returned by the OpenCL API and a command queue is created for the first device in that platform. Programs are built for all devices in the associated context. This case is realized automatically when a kernel filename is provided on the definition of a CLEnv instance. Omitting the kernel filename in the CLEnv constructor allows to later specify exactly the OpenCL configuration to set up.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <CLUtils.hpp>

using namespace clutils;

int main() {
  CLEnv clEnv("myKernels.cl");
  cl::Context &context(clEnv.getContext());
  cl::CommandQueue &queue(clEnv.getQueue());
  cl::Kernel &kernel(clEnv.getKernel("myKernel"));
  cl::NDRange global(1024), local(256);

  cl::Buffer dBuf(context, CL_MEM_READ_WRITE, 1024 * sizeof(int));
  kernel.setArg(0, dBuf);

  queue.enqueueNDRangeKernel(kernel, cl::NullRange, global, local);

  return 0;
}

Profiling is enabled with the use of the GPUTimer and ProfilingInfo classes. GPUTimer does the actual timekeeping, and ProfilingInfo offers summarizing results on the profiling data.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include <CLUtils.hpp>

using namespace clutils;

int main() {
  CLEnv clEnv;
  cl::Context &context(clEnv.addContext(0));
  cl::CommandQueue &queue(
      clEnv.addQueue(0, 0, CL_QUEUE_PROFILING_ENABLE));
  cl::Kernel &kernel(clEnv.addProgram(0, "myKernels.cl", "myKernel"));
  cl::NDRange global(1024), local(256);

  cl::Buffer dBuf(context, CL_MEM_READ_WRITE, 1024 * sizeof(int));
  kernel.setArg(0, dBuf);

  GPUTimer<std::milli> gTimer(clEnv.devices[0][0]);
  ProfilingInfo<10> pGPU("GPU");

  for (int i = 0; i < 10; ++i) {
    queue.enqueueNDRangeKernel(
        kernel, cl::NullRange, global, local, nullptr, &gTimer.event());
    gTimer.wait();
    pGPU[i] = gTimer.duration();
  }

  pGPU.print("myKernel");

  return 0;
}

Having defined all the necessary OpenCL entities in CLEnv, an OpenCL configuration can now be specified with CLEnvInfo and communicated to a class that manages and executes some kind of algorithm.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <CLUtils.hpp>

using namespace clutils;

class PrefixSum;

int main() {
  CLEnv clEnv("myKernels.cl");
  cl::Context &context(clEnv.getContext());
  cl::CommandQueue &queue(clEnv.getQueue());

  CLEnvInfo<1> info (0, 0, 0, { 0 }, 0);
  PrefixSum scan(clEnv, info);

  scan.run();

  return 0;
}

You can find the complete documentation at clutils.nlamprian.me. The source code is available on GitHub.