# llama.cpp/examples/sycl

This example provides tools for running llama.cpp with SYCL on Intel GPUs.

## Tool

|Tool Name|Function|Status|
|-|-|-|
|llama-ls-sycl-device|List all SYCL devices with ID, compute capability, max work group size, etc.|Supported|

### llama-ls-sycl-device

List all SYCL devices with ID, compute capability, max work group size, etc.

1. Build llama.cpp for SYCL, for all targets.

2. Enable the oneAPI runtime environment:

```
source /opt/intel/oneapi/setvars.sh
```

3. Run the tool:

```
./build/bin/llama-ls-sycl-device
```
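
Steps 2 and 3 can be combined in a small wrapper that reports a clear error when a prerequisite is missing. This is only a sketch: the function name `run_ls_sycl_device` and the override variables `ONEAPI_SETVARS` and `LS_SYCL_BIN` are hypothetical and not part of llama.cpp or oneAPI. The default paths are the ones used in the steps above.

```shell
# Sketch: enable the oneAPI environment and list SYCL devices in one call.
# ONEAPI_SETVARS and LS_SYCL_BIN are hypothetical overrides for the default paths.
run_ls_sycl_device() {
    local setvars="${ONEAPI_SETVARS:-/opt/intel/oneapi/setvars.sh}"
    local bin="${LS_SYCL_BIN:-./build/bin/llama-ls-sycl-device}"

    if [ ! -f "$setvars" ]; then
        echo "oneAPI environment script not found: $setvars" >&2
        return 1
    fi
    if [ ! -x "$bin" ]; then
        echo "tool not found or not executable: $bin (build llama.cpp for SYCL first)" >&2
        return 1
    fi

    # Step 2: load the oneAPI environment into the current shell.
    # Its output is discarded so that only the device list is printed.
    # Assumption: a non-zero status (for example when the environment is
    # already initialized) is not fatal, so it is ignored here.
    . "$setvars" >/dev/null || true

    # Step 3: run the tool.
    "$bin"
}
```

Call `run_ls_sycl_device` without arguments from the llama.cpp source directory. The function deliberately takes no positional parameters, because a script sourced inside a function sees that function's arguments.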

Check the device IDs in the startup log, for example:

```
found 4 SYCL devices:
  Device 0: Intel(R) Arc(TM) A770 Graphics,	compute capability 1.3,
    max compute_units 512,	max work group size 1024,	max sub group size 32,	global mem size 16225243136
  Device 1: Intel(R) FPGA Emulation Device,	compute capability 1.2,
    max compute_units 24,	max work group size 67108864,	max sub group size 64,	global mem size 67065057280
  Device 2: 13th Gen Intel(R) Core(TM) i7-13700K,	compute capability 3.0,
    max compute_units 24,	max work group size 8192,	max sub group size 64,	global mem size 67065057280
  Device 3: Intel(R) Arc(TM) A770 Graphics,	compute capability 3.0,
    max compute_units 512,	max work group size 1024,	max sub group size 32,	global mem size 16225243136
```

|Attribute|Note|
|-|-|
|compute capability 1.3|Level Zero runtime, recommended|
|compute capability 3.0|OpenCL runtime, slower than Level Zero in most cases|
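
To pick out the recommended devices from a long list, the output can be filtered for the `compute capability 1.3` attribute. The following is a minimal sketch, not part of llama.cpp: the function name `level_zero_devices` is hypothetical, and it assumes the output format shown in the sample log above and that `1.3` identifies the Level Zero runtime, as stated in the table.

```shell
# Sketch: print "ID<TAB>name" for every device reported with
# compute capability 1.3 (Level Zero, per the table above).
# Reads the llama-ls-sycl-device output on stdin.
level_zero_devices() {
    awk '
        /^[ \t]*Device [0-9]+:/ && /compute capability 1\.3([^0-9]|$)/ {
            line = $0
            sub(/^[ \t]*Device /, "", line)            # drop the "Device " prefix
            id = line
            sub(/:.*/, "", id)                         # text before the first colon is the ID
            name = line
            sub(/^[0-9]+:[ \t]*/, "", name)            # drop "ID: "
            sub(/,[ \t]*compute capability.*/, "", name)  # drop the trailing attributes
            print id "\t" name
        }
    '
}
```

A possible invocation is `./build/bin/llama-ls-sycl-device | level_zero_devices`. If the tool writes the device list to stderr rather than stdout, append `2>&1` before the pipe.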