CLUtils is a library that aims to allow rapid prototyping of OpenCL
applications. It hides away all the boilerplate code necessary for establishing an OpenCL environment. It offers utilities that cover procedures like compiling source files and extracting kernels, setting up OpenCL-OpenGL interoperability, and profiling host and device code.
The simplest case handled by the library is one where a context is created for the first platform returned by the OpenCL API and a command queue is created for the first device in that platform. Programs are built for all devices in the associated context. This case is realized automatically when a kernel filename is provided on the definition of a CLEnv
instance. Omitting the kernel filename in the CLEnv constructor allows to later specify exactly the OpenCL configuration to set up.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <CLUtils.hpp>
using namespace clutils;
int main() {
CLEnv clEnv("myKernels.cl");
cl::Context &context(clEnv.getContext());
cl::CommandQueue &queue(clEnv.getQueue());
cl::Kernel &kernel(clEnv.getKernel("myKernel"));
cl::NDRange global(1024), local(256);
cl::Buffer dBuf(context, CL_MEM_READ_WRITE, 1024 * sizeof(int));
kernel.setArg(0, dBuf);
queue.enqueueNDRangeKernel(kernel, cl::NullRange, global, local);
return 0;
}
Profiling is enabled with the use of the GPUTimer
and ProfilingInfo
classes. GPUTimer does the actual timekeeping, and ProfilingInfo offers summarizing results on the profiling data.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include <CLUtils.hpp>
using namespace clutils;
int main() {
CLEnv clEnv;
cl::Context &context(clEnv.addContext(0));
cl::CommandQueue &queue(
clEnv.addQueue(0, 0, CL_QUEUE_PROFILING_ENABLE));
cl::Kernel &kernel(clEnv.addProgram(0, "myKernels.cl", "myKernel"));
cl::NDRange global(1024), local(256);
cl::Buffer dBuf(context, CL_MEM_READ_WRITE, 1024 * sizeof(int));
kernel.setArg(0, dBuf);
GPUTimer<std::milli> gTimer(clEnv.devices[0][0]);
ProfilingInfo<10> pGPU("GPU");
for (int i = 0; i < 10; ++i) {
queue.enqueueNDRangeKernel(
kernel, cl::NullRange, global, local, nullptr, &gTimer.event());
gTimer.wait();
pGPU[i] = gTimer.duration();
}
pGPU.print("myKernel");
return 0;
}
Having defined all the necessary OpenCL entities in CLEnv, an OpenCL configuration can now be specified with CLEnvInfo
and communicated to a class that manages and executes some kind of algorithm.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <CLUtils.hpp>
using namespace clutils;
class PrefixSum;
int main() {
CLEnv clEnv("myKernels.cl");
cl::Context &context(clEnv.getContext());
cl::CommandQueue &queue(clEnv.getQueue());
CLEnvInfo<1> info (0, 0, 0, { 0 }, 0);
PrefixSum scan(clEnv, info);
scan.run();
return 0;
}
You can find the complete documentation at clutils.nlamprian.me. The source code is available on GitHub.