SYCL

Purpose

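SYCL is a royalty-free, cross-platform abstraction layer that builds on the underlying concepts, portability and efficiency of OpenCL. It lets code for heterogeneous processors be written in a "single-source" style using completely standard C++. In single-source development, a C++ template function can contain both host code and device code, so complex OpenCL-accelerated algorithms can be built once and then reused throughout a code base on different types of data.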
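To make the idea of a reusable single-source template concrete, here is a minimal sketch in plain standard C++. It is not SYCL itself: `parallel_for_host` is a hypothetical stand-in for a device launch (it simply loops on the host), and `vector_add` is an illustrative name. The only point is that one template holds both the setup code and the kernel lambda, and can be instantiated for different element types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device launch: runs the kernel once per index
// on the host. A real SYCL runtime would dispatch the kernel to a device.
template <typename Kernel>
void parallel_for_host(std::size_t count, Kernel kernel) {
  for (std::size_t i = 0; i < count; ++i)
    kernel(i);
}

// One template holds both the "host" part (checking sizes, allocating the
// result, launching) and the "device" part (the lambda body), for any T.
template <typename T>
std::vector<T> vector_add(const std::vector<T>& a, const std::vector<T>& b) {
  // Host side: both inputs must have the same length.
  assert(a.size() == b.size());
  std::vector<T> c(a.size());
  // "Device" side: the kernel body, written inline as a lambda.
  parallel_for_host(a.size(), [&](std::size_t i) { c[i] = a[i] + b[i]; });
  return c;
}
```

The same template can then be called as `vector_add<int>(...)` or `vector_add<double>(...)` without duplicating the kernel body, which is the kind of reuse the paragraph above describes.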

Although it was originally developed for use with OpenCL and SPIR, SYCL is in fact a more general heterogeneous framework that can target other systems.

Versions

The latest version is SYCL 1.2.1 revision 2, published on June 21, 2018 (the first revision was published on December 6, 2017).

SYCL was introduced at GDC in March 2014 as provisional version 1.2, followed by the final SYCL 1.2 specification at IWOCL 2015 in May 2015.

Provisional SYCL 2.2 was introduced at IWOCL 2016 in May 2016.

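The public versions are:

a. SYCL 1.2.1, which targets OpenCL 1.2 hardware features and has an OpenCL 1.2 interoperability mode;

b. provisional SYCL 2.2, which targets OpenCL 2.2 hardware features and has an OpenCL 2.2 interoperability mode.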

Example

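The following example shows the single-source, pure C++ programming model. It defines an implicit task graph of 3 kernels running on the default accelerator.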

    #include <CL/sycl.hpp>
    #include <cstdlib>
    #include <iostream>

    using namespace cl::sycl;

    // Size of the matrices
    constexpr size_t N = 2000;
    constexpr size_t M = 3000;

    int main() {
      // Create a queue to work on the default device
      queue q;
      // Create some 2D buffers with N×M double values for our matrices
      buffer<double, 2> a{{ N, M }};
      buffer<double, 2> b{{ N, M }};
      buffer<double, 2> c{{ N, M }};

      // Launch a first asynchronous kernel to initialize buffer "a"
      q.submit([&](handler &cgh) {
          // The kernel writes "a", so get a write accessor on it
          auto A = a.get_access<access::mode::write>(cgh);

          // Enqueue a parallel kernel on an N×M 2D iteration space.
          // The range and index types are spelled out so that the
          // dimensionality does not have to be deduced.
          cgh.parallel_for<class init_a>(range<2>{ N, M },
                             [=] (id<2> index) {
                               A[index] = index[0]*2 + index[1];
                             });
        });

      // Launch an asynchronous kernel to initialize buffer "b"
      q.submit([&](handler &cgh) {
          // The kernel writes "b", so get a write accessor on it
          auto B = b.get_access<access::mode::write>(cgh);

          // Enqueue a parallel kernel on an N×M 2D iteration space
          cgh.parallel_for<class init_b>(range<2>{ N, M },
                             [=] (id<2> index) {
                               B[index] = index[0]*2014 + index[1]*42;
                             });
        });

      // Launch an asynchronous kernel to compute matrix addition c = a + b
      q.submit([&](handler &cgh) {
          // In the kernel "a" and "b" are read, but "c" is written.
          // Since the kernel reads "a" and "b", the runtime will implicitly
          // add a producer-consumer dependency on the previous kernels
          // producing them.
          auto A = a.get_access<access::mode::read>(cgh);
          auto B = b.get_access<access::mode::read>(cgh);
          auto C = c.get_access<access::mode::write>(cgh);

          // Enqueue a parallel kernel on an N×M 2D iteration space
          cgh.parallel_for<class matrix_add>(range<2>{ N, M },
                                         [=] (id<2> index) {
                                           C[index] = A[index] + B[index];
                                         });
        });

      /* Request an access to read "c" from the host side. The SYCL runtime
         will wait for "c" to be available on the host side before
         returning the accessor.
         This means that there is no communication happening in the loop
         nest below. */
      auto C = c.get_access<access::mode::read>();
      std::cout << std::endl << "Result:" << std::endl;
      for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < M; j++)
          // Compare the result to the analytic value
          if (C[i][j] != i*(2 + 2014) + j*(1 + 42)) {
            std::cout << "Wrong value " << C[i][j] << " on element "
                      << i << ' ' << j << std::endl;
            exit(-1);
          }

      std::cout << "Good computation!" << std::endl;
      return 0;
    }
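The implicit task graph in the example above comes from the access modes the kernels request: a kernel that reads a buffer must run after the kernel that last wrote it. The following is a toy model of that bookkeeping in plain standard C++, under the simplifying assumption that only read-after-write dependencies matter. It does not use or reproduce the SYCL API; `TaskGraph`, `Access`, `Mode` and `submit` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Mode { read, write };

// One requested access: which buffer, and whether it is read or written.
struct Access {
  std::string buffer;
  Mode mode;
};

// Toy task graph: records, for each submitted task, the earlier tasks it
// must wait for. Only read-after-write dependencies are tracked here.
class TaskGraph {
 public:
  // Registers a task and returns its id (0, 1, 2, ... in submission order).
  std::size_t submit(const std::vector<Access>& accesses) {
    const std::size_t id = deps_.size();
    std::set<std::size_t> deps;
    // First pass: a read depends on the last writer of that buffer, if any.
    for (const Access& a : accesses) {
      if (a.mode == Mode::read) {
        auto it = last_writer_.find(a.buffer);
        if (it != last_writer_.end()) deps.insert(it->second);
      }
    }
    // Second pass: this task becomes the last writer of what it writes.
    for (const Access& a : accesses)
      if (a.mode == Mode::write) last_writer_[a.buffer] = id;
    deps_.push_back(std::move(deps));
    return id;
  }

  // Ids of the tasks that task `id` has to wait for.
  const std::set<std::size_t>& dependencies(std::size_t id) const {
    assert(id < deps_.size());
    return deps_[id];
  }

 private:
  std::map<std::string, std::size_t> last_writer_;
  std::vector<std::set<std::size_t>> deps_;
};
```

Submitting the three kernels of the matrix example in order (write "a", write "b", then read "a" and "b" and write "c") gives the first two tasks no dependencies and makes the third depend on both. In this model the two initializations are therefore free to run concurrently, while the addition has to wait for them.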

Comparison with CUDA

The open standards SYCL and OpenCL are similar to CUDA, which is specific to the vendor Nvidia.

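Within the Khronos Group ecosystem, OpenCL is the low-level, non-single-source API, and SYCL is the high-level, single-source C++ domain-specific embedded language.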

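By comparison, the single-source C++ domain-specific embedded language version of CUDA, which is actually named the "CUDA Runtime API", is somewhat similar to SYCL. There is also a less well-known non-single-source version of CUDA, called the "CUDA Driver API", which is similar to OpenCL and is used by the implementation of the CUDA Runtime API itself.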
