The OpenCL C programming language implements these functions that provide asynchronous copies between global and local memory and a prefetch from global memory.
Perform an async copy of num_gentypes elements from src to dst. The async copy is performed by all work-items in a work-group, and this built-in function must therefore be encountered by all work-items in a work-group executing the kernel with the same argument values; otherwise the results are undefined.
Returns an event object that can be used by wait_group_events to wait for the async copy to finish. The event argument can also be used to associate the async_work_group_copy with a previous async copy, allowing an event to be shared by multiple async copies; otherwise event should be zero.
If the event argument is non-zero, the event object supplied in the event argument will be returned.
This function does not perform any implicit synchronization of source data such as using a barrier before performing the copy.
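The rules above can be sketched in a small kernel: a copy into local memory, a second copy that shares the first copy's event, and an explicit barrier before the result is copied back out. This is an illustrative sketch, not a reference implementation. The kernel name add_tiles and the tile size TILE are hypothetical, and the block guarded by __OPENCL_VERSION__ consists of host-only stand-ins (assumptions that let the same text build with a plain C compiler); an OpenCL C compiler skips that block and uses the real built-ins.

```c
/* Host-only stand-ins (assumption, illustration only): they let this sketch
   build with a plain C compiler. An OpenCL C compiler defines
   __OPENCL_VERSION__ and skips this section, using the real built-ins. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#include <stddef.h>
#include <string.h>
#define __kernel
#define __global
#define __local
#define CLK_LOCAL_MEM_FENCE 1
typedef int event_t;
static size_t get_group_id(unsigned dim)   { (void)dim; return 0; }
static size_t get_local_id(unsigned dim)   { (void)dim; return 0; }
static size_t get_local_size(unsigned dim) { (void)dim; return 1; }
static void barrier(int flags) { (void)flags; }
static event_t async_work_group_copy(float *dst, const float *src,
                                     size_t num_gentypes, event_t event)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, num_gentypes * sizeof *dst);
    return event != 0 ? event : 1;  /* a supplied event is handed back */
}
static void wait_group_events(int num_events, event_t *event_list)
{
    (void)num_events; (void)event_list;  /* stand-in copies finish at once */
}
#endif

#define TILE 64  /* hypothetical tile size: elements handled per work-group */

/* Adds two arrays tile by tile, staging the data through local memory. */
__kernel void add_tiles(__global const float *a, __global const float *b,
                        __global float *out)
{
    __local float a_tile[TILE];
    __local float b_tile[TILE];
    size_t base = get_group_id(0) * TILE;

    /* Every work-item reaches both calls with identical arguments.
       The first call passes 0 and receives a new event; the second passes
       that event so both copies share it. */
    event_t ev = async_work_group_copy(a_tile, a + base, TILE, 0);
    ev = async_work_group_copy(b_tile, b + base, TILE, ev);
    wait_group_events(1, &ev);

    /* Each work-item updates a strided subset of the tile. */
    for (size_t i = get_local_id(0); i < TILE; i += get_local_size(0))
        a_tile[i] += b_tile[i];

    /* No implicit synchronization of the source: make every work-item's
       writes to a_tile visible before the group copies it out. */
    barrier(CLK_LOCAL_MEM_FENCE);

    ev = async_work_group_copy(out + base, a_tile, TILE, 0);
    wait_group_events(1, &ev);
}
```

The barrier is the part that is easy to forget. The copy back to global memory reads the whole tile, including elements written by other work-items, so the writes have to be synchronized explicitly before the copy is issued.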
The generic type name gentype indicates the built-in data types char, char{2|3|4|8|16}, uchar, uchar{2|3|4|8|16}, short, short{2|3|4|8|16}, ushort, ushort{2|3|4|8|16}, int, int{2|3|4|8|16}, uint, uint{2|3|4|8|16}, long, long{2|3|4|8|16}, ulong, ulong{2|3|4|8|16} or float, float{2|3|4|8|16} as the type for the arguments unless otherwise stated.
Optionally, the generic type name gentype may indicate double and double{2|3|4|8|16} as arguments and return values. If extended with cl_khr_fp16, the generic type name gentype may also indicate half and half{2|3|4|8|16} as arguments and return values.
async_work_group_copy and async_work_group_strided_copy for 3-component vector types behave as async_work_group_copy and async_work_group_strided_copy, respectively, for 4-component vector types.
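A practical consequence of the 3-component rule is that each float3 element occupies the storage of a float4, so a copy of N float3 elements moves N 16-byte elements, padding lane included. The following is a minimal sketch under that reading. The kernel name sum_points and the constant POINTS are hypothetical, and the block guarded by __OPENCL_VERSION__ holds host-only stand-ins (assumptions, including a struct that imitates the float3 layout) so the text also builds as plain C.

```c
/* Host-only stand-ins (assumption, illustration only); an OpenCL C compiler
   defines __OPENCL_VERSION__ and skips this section. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#include <stddef.h>
#include <string.h>
#define __kernel
#define __global
#define __local
typedef int event_t;
/* Stand-in float3: three usable lanes plus one padding lane, so that the
   element size matches a 4-component vector. */
typedef struct { float x, y, z, pad_; } float3;
static size_t get_group_id(unsigned dim) { (void)dim; return 0; }
static size_t get_local_id(unsigned dim) { (void)dim; return 0; }
static event_t async_work_group_copy(float3 *dst, const float3 *src,
                                     size_t num_gentypes, event_t event)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, num_gentypes * sizeof *dst);  /* whole 16-byte elements */
    return event != 0 ? event : 1;
}
static void wait_group_events(int num_events, event_t *event_list)
{
    (void)num_events; (void)event_list;
}
#endif

#define POINTS 8  /* hypothetical number of float3 elements per work-group */

/* Sums the x, y and z lanes of each work-group's block of points. */
__kernel void sum_points(__global const float3 *points, __global float *sums)
{
    __local float3 cache[POINTS];
    size_t group = get_group_id(0);

    /* POINTS counts float3 elements, not floats; each element is copied as
       if it were a 4-component vector. */
    event_t ev = async_work_group_copy(cache, points + group * POINTS,
                                       POINTS, 0);
    wait_group_events(1, &ev);

    /* One work-item per group reduces the cached points. */
    if (get_local_id(0) == 0) {
        float total = 0.0f;
        for (int i = 0; i < POINTS; ++i)
            total += cache[i].x + cache[i].y + cache[i].z;
        sums[group] = total;
    }
}
```

On the host side this means a buffer passed as points should be laid out with four floats per element; packing three floats per point would misalign every element after the first.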