Hello everyone,

I’m moving into the SYCL world, but I’ve been having some compilation issues that are making me think about returning to traditional OpenCL.

I tried with this code:

#include <iostream>
#include <CL/sycl.hpp>

class vector_addition;

int main(int, char**) {

try
{
   cl::sycl::float4 a = { 1.0, 2.0, 3.0, 4.0 };
   cl::sycl::float4 b = { 4.0, 3.0, 2.0, 1.0 };
   cl::sycl::float4 c = { 0.0, 0.0, 0.0, 0.0 };

   cl::sycl::default_selector device_selector;

   cl::sycl::queue queue(device_selector);
   std::cout << "Running on "
             << queue.get_device().get_info<cl::sycl::info::device::name>()
             << "\n";
   {
      cl::sycl::buffer<cl::sycl::float4, 1> a_sycl(&a, cl::sycl::range<1>(1));
      cl::sycl::buffer<cl::sycl::float4, 1> b_sycl(&b, cl::sycl::range<1>(1));
      cl::sycl::buffer<cl::sycl::float4, 1> c_sycl(&c, cl::sycl::range<1>(1));

      queue.submit([&] (cl::sycl::handler& cgh) {
         auto a_acc = a_sycl.get_access<cl::sycl::access::mode::read>(cgh);
         auto b_acc = b_sycl.get_access<cl::sycl::access::mode::read>(cgh);
         auto c_acc = c_sycl.get_access<cl::sycl::access::mode::discard_write>(cgh);

         cgh.single_task<class vector_addition>([=] () {
            c_acc[0] = a_acc[0] + b_acc[0];
         });
      });
   }
   std::cout << "  A { " << a.x() << ", " << a.y() << ", " << a.z() << ", " << a.w() << " }\n"
        << "+ B { " << b.x() << ", " << b.y() << ", " << b.z() << ", " << b.w() << " }\n"
        << "------------------\n"
        << "= C { " << c.x() << ", " << c.y() << ", " << c.z() << ", " << c.w() << " }"
        << std::endl;
}
catch(std::exception& ex)
{
   std::cerr << "exception caught: " << ex.what() << std::endl;
   return 1;
}

   return 0;
}

and I compiled with:
/usr/local/computecpp/bin/compute++ hello_world.cpp -o HELLO_SYCL -I /usr/local/computecpp/include -lOpenCL -L /usr/local/computecpp/lib/ -lComputeCpp

When I ran it, I got:

Running on GeForce GTX 1060 with Max-Q Design
terminate called after throwing an instance of ‘cl::sycl::invalid_object_error’
Aborted (core dumped)

I noticed that the code recognized my graphics card. I read a bit about bitcode targets; could that be my issue?

I should add that I couldn’t run the samples from the GitHub page either.

Thanks!

To give some background, NVIDIA support with ComputeCpp has been “experimental” for a while. We’ve always focused on providing support for devices that support SPIR/SPIR-V instructions, which makes compilation simpler. NVIDIA does not support SPIR, so we implemented a PTX back-end to enable this. However, this means you need to tell the compiler to output the correct instruction set for the architecture.

For SYCL code with NVIDIA hardware, we recently made a contribution to the DPC++ compiler project, and I’d recommend you take a look at that.

To fix this, when compiling with ComputeCpp you need to tell compute++ to output a binary with PTX instructions for your NVIDIA device. By default it outputs SPIR, which, as I say, NVIDIA does not support. (There’s a specific NVIDIA guide for this; you can follow the Getting Started guide and use the flag specified in those instructions.)
Our CMake files in the SDK make it easier to pass in this flag, but if you want to invoke the compiler directly you need to run a command like this (note that I can’t test this, as I don’t currently have access to an NVIDIA machine):

/usr/local/computecpp/bin/compute++ hello_world.cpp -o HELLO_SYCL -sycl -sycl-target ptx64 -I /usr/local/computecpp/include -lOpenCL -L /usr/local/computecpp/lib/ -lComputeCpp 

Hi Rod!
I got the same error as he did when I tried to run the same sample code,
but my environment is a little different from his.
I tried to compile it with “-sycl -sycl-target ptx64”, but it didn’t finish.

Could you tell me how to compile and run this code?
My commands and their results are as follows:

tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -std=c++11 -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -L/home/tokyo/pocl-1.5/build/lib/CL -lpocl hello_world.cpp -o hello_world
tokyo@ryota-virtualbox:~/SYCL sample code$ ./hello_world
Running on pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
terminate called after throwing an instance of ‘cl::sycl::invalid_object_error’
Aborted (core dumped)

Hello Rod, I couldn’t compile even when I changed the device selector to a CPU one.

Hello both, thanks for the further info. I realise this is less about the target and more likely about the command you are using to invoke the compiler. I’ve been thinking for a while that we need to provide a guide for invoking compute++ directly. Our expectation/recommendation has been that people use the CMake setup that we make available with the computecpp-sdk samples on GitHub. This finds all the files and also sets the most appropriate flags. But I suppose not everyone wants to use that.

The command you are using does not tell the compiler to generate any device code.

Try using this command for running on an Intel CPU for example (modify depending on your paths):
compute++ -sycl-driver -sycl-target spirv64 test.cpp -I computecpp_ce_computecpp-ce-1.3.0-ubuntu.16.04-64bit/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64//include/ -lOpenCL -L computecpp_ce_computecpp-ce-1.3.0-ubuntu.16.04-64bit/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib/ -lComputeCpp -IOpenCL-Headers/

For a PTX target this should work (I don’t currently have a suitable machine to test on):
computecpp_ce_computecpp-ce-1.3.0-ubuntu.16.04-64bit/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target ptx64 test.cpp -I computecpp_ce_computecpp-ce-1.3.0-ubuntu.16.04-64bit/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64//include/ -lOpenCL -L computecpp_ce_computecpp-ce-1.3.0-ubuntu.16.04-64bit/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib/ -lComputeCpp -IOpenCL-Headers/
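
For anyone invoking compute++ by hand, the two commands above differ only in the -sycl-target value. The sketch below is a small shell helper that prints (but does not run) such a command line, so the flags can be checked before building. The flags are copied from the commands above; the COMPUTECPP_DIR variable and the sycl_compile_cmd name are made up for illustration, and the library to link (-lOpenCL here) may need changing for other OpenCL installations.

```shell
#!/bin/sh
# Placeholder install location; override it to match your own unpacked package.
COMPUTECPP_DIR="${COMPUTECPP_DIR:-/usr/local/computecpp}"

# Print a compute++ command line for the given target, source and output name.
# -sycl-driver asks compute++ to build host and device code in one step, and
# -sycl-target selects the device code format (spirv64 or ptx64 in this thread).
sycl_compile_cmd() {
    target="$1"
    src="$2"
    out="$3"
    echo "$COMPUTECPP_DIR/bin/compute++" \
        -sycl-driver -sycl-target "$target" "$src" \
        -I "$COMPUTECPP_DIR/include" \
        -L "$COMPUTECPP_DIR/lib" -lComputeCpp -lOpenCL \
        -o "$out"
}

# Dry run: show the command that would be used for an Intel CPU build.
sycl_compile_cmd spirv64 hello_world.cpp hello_sycl
```

Printing the command first makes it easy to spot a missing -sycl-driver or -sycl-target before the build is attempted.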


Rod! YOU ARE THE BEST!!! Thank you very much!!! Finally I can run my SYCL code! I did exactly what you mentioned above!!!

P.S. I’m going to start creating my own makefiles.


Thank you for your reply.
I tried some commands based on your advice.
I still can’t run the compiled file, but the reported error changed from ‘cl::sycl::invalid_object_error’ to ‘cl::sycl::compile_program_error’.

FYI, I need to set up a ComputeCpp SYCL environment on the POCL OpenCL platform.

//1st command(added -sycl-driver -sycl-target spirv64)
tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target spirv64 hello_world.cpp -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -L/home/tokyo/pocl-1.5/build/lib/CL -lpocl -o hello_sycl
remark: [Computecpp:CC0027]: Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a
workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using
-no-serial-memop. [-Rsycl-serial-memop]
tokyo@ryota-virtualbox:~/SYCL sample code$ ls
hello_sycl hello_world hello_world.cpp vectorops3 vectorops3.cpp whichever whichever.cpp
tokyo@ryota-virtualbox:~/SYCL sample code$ ./hello_sycl
Running on pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
terminate called after throwing an instance of ‘cl::sycl::compile_program_error’
Aborted (core dumped)

//2nd command(1st command + include path for OpenCL-Headers)
tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target spirv64 hello_world.cpp -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -L/home/tokyo/pocl-1.5/build/lib/CL -lpocl -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers -o hello_sycl
In file included from hello_world.cpp:2:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/CL/sycl.hpp:1:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/CL/…/SYCL/sycl.hpp:20:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/sycl_builtins.h:27:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/cpp_to_cl_cast.h:12:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/deduce.h:25:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/cl_types.h:23:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/common.h:21:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/include_opencl.h:34:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl.h:20:
/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl_version.h:22:9: warning: cl_version.h:
CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2) [-W#pragma-messages]
#pragma message(“cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)”)
^
remark: [Computecpp:CC0027]: Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a
workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using
-no-serial-memop. [-Rsycl-serial-memop]
1 warning generated.
In file included from :1:
In file included from /tmp/hello_world-6d3692.sycl:12:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/sycl_ih.hpp:20:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/accessor_host_args.h:18:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/base.h:19:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/common.h:21:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/include_opencl.h:34:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl.h:20:
/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl_version.h:22:9: warning: cl_version.h:
CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2) [-W#pragma-messages]
#pragma message(“cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)”)
^
1 warning generated.
tokyo@ryota-virtualbox:~/SYCL sample code$ ./hello_sycl
Running on pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
terminate called after throwing an instance of ‘cl::sycl::compile_program_error’
Aborted (core dumped)

//FYI result of clinfo
tokyo@ryota-virtualbox:~/SYCL sample code$ clinfo
Number of platforms 1
Platform Name Portable Computing Language
Platform Vendor The pocl project
Platform Version OpenCL 1.2 pocl 1.5, RelWithDebInfo, LLVM 6.0.0, RELOC, SPIR, SLEEF, POCL_DEBUG
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix POCL
Platform Name Portable Computing Language
Number of devices 1
Device Name pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
Device Vendor GenuineIntel
Device Vendor ID 0x6c636f70
Device Version OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-skylake
Driver Version 1.5
Device OpenCL C Version OpenCL C 1.2 pocl
Device Type CPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 1
Max clock frequency 1497MHz
Device Partition (core)
Max number of sub-devices 1
Supported partition types equally, by counts
Max work item dimensions 3
Max work item sizes 4096x4096x4096
Max work group size 4096
Preferred work group size multiple 8
Preferred / native vector sizes
char 16 / 16
short 16 / 16
int 8 / 8
long 4 / 4
half 0 / 0 (n/a)
float 8 / 8
double 4 / 4 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 1563248640 (1.456GiB)
Error Correction support No
Max memory allocation 536870912 (512MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type None
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 33554432 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 8192x8192 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 128
Local memory type Global
Local memory size 8388608 (8MiB)
Max number of constant args 8
Max constant buffer size 8388608 (8MiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
SPIR versions 1.2
printf() buffer size 16777216 (16MiB)
Built-in kernels
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_spir cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, …) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, …) No platform
clCreateContext(NULL, …) [default] No platform
clCreateContext(NULL, …) [other] Success [POCL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Portable Computing Language
Device Name pthread-Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz

Hello, just out of curiosity… what happens if you simply remove the -lpocl part?

I also tried one more command. However, the result was the same as with the 2nd one.

//3rd command(changed 2nd one’s “-L/home/tokyo/pocl-1.5/build/lib/CL -lpocl” to “-lOpenCL”)
Tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target spirv64 hello_world.cpp -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -lOpenCL -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers -o hello_sycl
In file included from hello_world.cpp:2:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/CL/sycl.hpp:1:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/CL/…/SYCL/sycl.hpp:20:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/sycl_builtins.h:27:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/cpp_to_cl_cast.h:12:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/deduce.h:25:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/cl_types.h:23:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/common.h:21:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/include_opencl.h:34:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl.h:20:
/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl_version.h:22:9: warning: cl_version.h:
CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2) [-W#pragma-messages]
#pragma message(“cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)”)
^
remark: [Computecpp:CC0027]: Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a
workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using
-no-serial-memop. [-Rsycl-serial-memop]
1 warning generated.
In file included from :1:
In file included from /tmp/hello_world-57738c.sycl:12:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/sycl_ih.hpp:20:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/accessor_host_args.h:18:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/base.h:19:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/common.h:21:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/SYCL/include_opencl.h:34:
In file included from /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl.h:20:
/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers/CL/cl_version.h:22:9: warning: cl_version.h:
CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2) [-W#pragma-messages]
#pragma message(“cl_version.h: CL_TARGET_OPENCL_VERSION is not defined. Defaulting to 220 (OpenCL 2.2)”)
^
1 warning generated.
tokyo@ryota-virtualbox:~/SYCL sample code$ ./hello_sycl
Running on Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
terminate called after throwing an instance of ‘cl::sycl::compile_program_error’
Aborted (core dumped)

//clinfo result when I tried the 3rd command. I changed the platform for the 3rd command.
tokyo@ryota-virtualbox:~/SYCL sample code$ clinfo
Number of platforms 1
Platform Name Intel(R) CPU Runtime for OpenCL™ Applications
Platform Vendor Intel(R) Corporation
Platform Version OpenCL 2.1 LINUX
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer cl_intel_vec_len_hint
Platform Host timer resolution 1ns
Platform Extensions function suffix INTEL
Platform Name Intel(R) CPU Runtime for OpenCL™ Applications
Number of devices 1
Device Name Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
Device Vendor Intel(R) Corporation
Device Vendor ID 0x8086
Device Version OpenCL 2.1 (Build 0)
Driver Version 18.1.0.0920
Device OpenCL C Version OpenCL C 2.0
Device Type CPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 1
Max clock frequency 1200MHz
Device Partition (core)
Max number of sub-devices 1
Supported partition types by counts, equally, by names (Intel)
Max work item dimensions 3
Max work item sizes 8192x8192x8192
Max work group size 8192
Preferred work group size multiple 128
Max sub-groups per work group 1
Preferred / native vector sizes
char 1 / 32
short 1 / 16
int 1 / 8
long 1 / 4
half 0 / 0 (n/a)
float 1 / 8
double 1 / 4 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 2084331520 (1.941GiB)
Error Correction support No
Max memory allocation 521082880 (496.9MiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing Yes
Atomics Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 64 bytes
Global 64 bytes
Local 0 bytes
Max size for global variable 65536 (64KiB)
Preferred total size of global vars 65536 (64KiB)
Global Memory cache type Read/Write
Global Memory cache size 262144 (256KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 480
Max size for 1D images from buffer 32567680 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 64 bytes
Pitch alignment for 2D image buffers 64 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 480
Max number of write image args 480
Max number of read/write image args 480
Max number of pipe args 16
Max active pipe reservations 262143
Max pipe packet size 1024
Local memory type Global
Local memory size 32768 (32KiB)
Max number of constant args 480
Max constant buffer size 131072 (128KiB)
Max size of kernel argument 3840 (3.75KiB)
Queue properties (on host)
Out-of-order execution Yes
Profiling Yes
Local thread execution (Intel) Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 4294967295 (4GiB)
Max size 4294967295 (4GiB)
Max queues on device 4294967295
Max events on device 4294967295
Prefer user sync for interop No
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
Sub-group independent forward progress No
IL version SPIR-V_1.0
SPIR versions 1.2
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Extensions cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_fp64 cl_khr_image2d_from_buffer cl_intel_vec_len_hint
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, …) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, …) No platform
clCreateContext(NULL, …) [default] No platform
clCreateContext(NULL, …) [other] Success [INTEL]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Intel(R) CPU Runtime for OpenCL™ Applications
Device Name Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) Success (1)
Platform Name Intel(R) CPU Runtime for OpenCL™ Applications
Device Name Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Intel(R) CPU Runtime for OpenCL™ Applications
Device Name Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
NOTE: your OpenCL library only supports OpenCL 2.0,
but some installed platforms support OpenCL 2.1.
Programs using 2.1 features may crash
or behave unexepectedly

Hi Juan. I just tried the following command as well.
However, the result was the same as with my 2nd and 3rd commands.

tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target spirv64 hello_world.cpp -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/OpenCL-Headers -o hello_sycl

Try this:

Edit hello_world.cpp and replace “default_selector” with “cpu_selector”, something like this:

cl::sycl::cpu_selector device_selector;

Then compile again using: -sycl-target spirv64
and leave out, for now: -lpocl
and let’s see what happens!

Hi. Thank you for your comment.

I changed to

cl::sycl::cpu_selector device_selector;

However, nothing changed.

My command and its result are:

tokyo@ryota-virtualbox:~/SYCL sample code$ /home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin/compute++ -sycl-driver -sycl-target spirv64 hello_world.cpp -I/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/include/ -lOpenCL -L/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/lib -lComputeCpp -I/home/tokyo/computecpp-sdk-master/include -I/home/tokyo/ComputeCpp-CE
remark: [Computecpp:CC0027]: Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a
workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using
-no-serial-memop. [-Rsycl-serial-memop]
tokyo@ryota-virtualbox:~/SYCL sample code$ ./hello_sycl
Running on Intel(R) Core™ i5-1035G7 CPU @ 1.20GHz
terminate called after throwing an instance of ‘cl::sycl::compile_program_error’
Aborted (core dumped)

Ummm…

I don’t know what is wrong.

Did you try running “computecpp_info” (I don’t remember the exact name)? You can find it inside the bin folder, right here:

/home/tokyo/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin

Run it and show us the output

Here is the result of ./computecpp_info.

tokyo@ryota-virtualbox:~/ComputeCpp-CE-1.3.0-Ubuntu-16.04-x86_64/bin$ ./computecpp_info


ComputeCpp Info (CE 1.3.0)

SYCL 1.2.1 revision 3


Toolchain information:

GLIBC version: 2.27
GLIBCXX: 20160609
This version of libstdc++ is supported.


Device Info:

Discovered 1 devices matching:
platform :
device type :


Device 0:

Device is supported : UNTESTED - Untested OS
Bitcode targets : spir64 spirv64
CL_DEVICE_NAME : Intel® Core™ i5-1035G7 CPU @ 1.20GHz
CL_DEVICE_VENDOR : Intel® Corporation
CL_DRIVER_VERSION : 18.1.0.0920
CL_DEVICE_TYPE : CL_DEVICE_TYPE_CPU

If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v1.3.0/platform-support-notes


I think it might be worth trying the SPIR 1.2 bitcode type rather than SPIR-V. That driver version looks quite old.

This is also unlikely to change much, but ComputeCpp 2.0.0 has been released. Additionally, it looks like your sample doesn’t have any error handling; the parallel-for sample from the SDK will generally catch and print any errors that happen during execution, along with a full build log from the OpenCL implementation.
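
Before rebuilding with a different -sycl-target, it can help to confirm which bitcode targets computecpp_info reports for the device. The shell sketch below assumes the output keeps the "Bitcode targets : ..." line format shown earlier in this thread. The function names bitcode_targets and supports_target are made up for illustration, and passing a listed name such as spir64 to -sycl-target is an assumption rather than something confirmed in this thread.

```shell
#!/bin/sh
# Print the bitcode targets found in computecpp_info output (read from stdin),
# one per line. Assumes the "Bitcode targets : a b" line format.
bitcode_targets() {
    sed -n 's/^[[:space:]]*Bitcode targets[[:space:]]*:[[:space:]]*//p' |
        tr ' ' '\n' |
        sed '/^$/d'
}

# Succeed (exit status 0) if the named target is listed on stdin.
supports_target() {
    bitcode_targets | grep -qx "$1"
}

# Example use (needs computecpp_info on PATH, so it is left commented out):
#   computecpp_info | supports_target spir64 && echo "spir64 is listed"
```

A listed target only means ComputeCpp considers it a candidate for that device; the driver still has to accept the generated code at run time.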

Hi. Thank you for your comment.
I just ran “parallel-for” in my Ubuntu terminal.
The result is as follows:

tokyo@ryota-virtualbox:~/computecpp-sdk-master/build/samples$ ./parallel-for
The results are correct.

Now I’m trying to switch from SPIR-V to SPIR 1.2 using information from these sites: https://github.com/KhronosGroup/SPIR/tree/spir_12 and https://www.khronos.org/spir/

Hi,
It’s the driver that implements the SPIR support, so if you are looking to use SPIR-V you need to install a driver that is able to understand SPIR-V instructions. The Intel driver (v18.1.0.0920) is old; you will need to install newer drivers that support SPIR-V. What are you ultimately trying to do? Knowing that would help us guide you.
Edit: It looks like you are using the POCL driver, but computecpp_info suggests you are using the Intel one. I’m not sure why that is the case; perhaps you have both installed. I am told that POCL does not currently support SPIR-V, so that would be why you need to use SPIR instructions.
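
One way to see what a driver claims to accept is to look at two lines of clinfo output: "IL version" (SPIR-V) and "SPIR versions" (SPIR). The shell sketch below assumes the line labels match the clinfo dumps posted earlier in this thread; advertised_il is a hypothetical helper name. Treat the result as a hint only: in this thread the Intel runtime listed SPIR-V_1.0 and the spirv64 build still failed with compile_program_error.

```shell
#!/bin/sh
# Read clinfo output on stdin and print "spirv" and/or "spir", depending on
# which intermediate formats the device advertises.
advertised_il() {
    awk '
        /IL version/ && /SPIR-V/      { spirv = 1 }  # e.g. "IL version SPIR-V_1.0"
        /SPIR versions +[0-9]/        { spir = 1 }   # e.g. "SPIR versions 1.2"
        END {
            if (spirv) print "spirv"
            if (spir)  print "spir"
        }'
}

# Example use (needs clinfo installed, so it is left commented out):
#   clinfo | advertised_il
```

With more than one device installed, the output is the union over all devices, so it is best run with a single active platform.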

Hi Rod. Thank you for your advice.

Yes, I installed both the Intel and POCL drivers.
(The last time I ran computecpp_info, the active ICD file was the Intel one.)
I control which ICD file is active by appending “.bak” to the others.

tokyo@ryota-virtualbox:/etc/OpenCL/vendors$ ls
intel64.icd.bak nvidia.icd.bak pocl.icd
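
The renaming shown in the listing above can be scripted. The shell sketch below assumes all ICD files live in one directory and that the ICD loader ignores files that do not end in .icd, which is what the “.bak” trick relies on. select_icd is a hypothetical helper name, and on /etc/OpenCL/vendors it would need root privileges.

```shell
#!/bin/sh
# Enable exactly one OpenCL ICD file in a vendors directory and disable the
# rest by renaming them to *.icd.bak.
# Usage: select_icd <vendors-dir> <name>   e.g. select_icd /etc/OpenCL/vendors pocl
select_icd() {
    dir="$1"
    want="$2"
    for f in "$dir"/*.icd "$dir"/*.icd.bak; do
        [ -e "$f" ] || continue            # skip glob patterns that matched nothing
        base="${f%.bak}"                   # same path without a trailing .bak
        name="$(basename "$base" .icd)"    # e.g. pocl, intel64, nvidia
        if [ "$name" = "$want" ]; then
            [ "$f" = "$base" ] || mv "$f" "$base"      # enable the wanted ICD
        else
            [ "$f" != "$base" ] || mv "$f" "$f.bak"    # disable every other ICD
        fi
    done
}
```

Running clinfo or computecpp_info afterwards confirms which platform is actually being picked up.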

My customer is using POCL and they want to set up a SYCL environment on it.
So I was trying to install SYCL 1.2 by following the instructions on this site, but the process was killed because it ran out of memory.

Anyway, my understanding is as follows:

  1. The latest version of the Intel OpenCL driver and SPIR-V are required to run the sample code.

  2. If I want to use POCL, SPIR 1.2 is required to run the sample code.

Could you tell me where to download the latest Intel OpenCL driver?
Is it this website, and if so, which package?
https://software.intel.com/content/www/us/en/develop/articles/opencl-drivers.html#cpu-section

You can use our Platform Support page to see the minimum driver versions required from Intel. These are currently 1.2.0.25 for CPU and 19.13.12717 for GPU. Also check the notes on that page.

For POCL, there is “experimental” support for SPIR and “even more experimental” support for SPIR-V. This page has some limited information. Unfortunately, I have never tried POCL with ComputeCpp, so I can’t offer much advice.