Subject: Developers list for StarPU
- From: Cédric Augonnet <cedric.augonnet@inria.fr>
- To: Sylvain HENRY <sylvain.henry@inria.fr>
- Cc: Emmanuel Jeannot <emmanuel.jeannot@labri.fr>, Denis Barthou <denis.barthou@inria.fr>, starpu-devel@lists.gforge.inria.fr, raymond.namyst@inria.fr, alexandre.denis@inria.fr
- Subject: Re: [Starpu-devel] SOCL
- Date: Sun, 14 Nov 2010 10:54:07 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hi Sylvain,
This is not a full answer, but just a few questions to help us understand your issues a little better...
We are currently looking at how to add an efficient starpu_data_cpy function. This function copies the content of one handle into another, and optionally executes a callback when the transfer is done. We still use a task internally, but we will look at how to do that better in the future.
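To make the idea of "copy one handle into another, then fire a callback" concrete, here is a minimal sketch in plain C. It does not use StarPU at all: `toy_handle`, `toy_data_cpy` and `count_done` are hypothetical names invented for this illustration, and the copy is performed immediately rather than being scheduled asynchronously as a real runtime would do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a data handle: a buffer and its size in bytes. */
struct toy_handle {
    void  *ptr;
    size_t size;
};

/* Copy the content of src into dst, then invoke the optional callback.
 * Returns 0 on success, -1 if the two buffers have different sizes.
 * Assumption: the copy is synchronous here; a real runtime would submit
 * it asynchronously and call the callback on completion. */
static int toy_data_cpy(struct toy_handle *dst, const struct toy_handle *src,
                        void (*callback)(void *), void *callback_arg)
{
    if (dst->size != src->size)
        return -1;
    memcpy(dst->ptr, src->ptr, src->size);
    if (callback)
        callback(callback_arg);
    return 0;
}

/* Example callback: counts completed transfers through an int pointer. */
static void count_done(void *arg)
{
    ++*(int *)arg;
}
```

A caller would wrap two equally sized buffers in `toy_handle` structures and pass `count_done` with a pointer to a counter; passing `NULL` as the callback simply skips the notification.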
Current implementation: SOCL uses a fake StarPU computation task to schedule data transfers. This fake task uses memcpy or blocking clEnqueueRead/WriteBuffer to copy data. This has several drawbacks:
- computing devices are considered busy while they are in fact waiting for DMA transfer to complete
- data transfer may not be optimal (DMA where memcpy could have been used, etc.)
- StarPU's execution/transfer traces are wrong
Report this issue then... do you have a specific example?
2) Buffer map/unmap
Isn't that what the starpu_data_acquire function does? (You have an asynchronous function that does similar things, starpu_data_acquire_cb.) If not, how does it differ from your needs?
OpenCL can schedule buffer mapping/unmapping anytime in the command graph.
StarPU can only map/unmap synchronously (i.e. the mapping command cannot have dependencies on other commands).
Current implementation: SOCL uses a fake StarPU CPU computation task to schedule the map/unmap commands. The fake task takes the buffer to map as an input, which forces the data transfer to host memory and makes starpu_data_acquire (the map command) non-blocking.
We are looking at how to change the codelet interface so that we can express more constraints than with the current "where" field. For instance, we would certainly add constraints on the available memory or on the availability of double precision, or specify that the task may only execute on a subset of workers.
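As a rough illustration of what such per-codelet constraints could look like, here is a small self-contained sketch in plain C. Nothing here is StarPU's interface: `toy_worker`, `toy_constraints` and `toy_can_execute` are hypothetical names, and the three checks simply mirror the examples given above (available memory, double precision, subset of workers).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what a worker offers. */
struct toy_worker {
    size_t free_memory;  /* bytes currently available on the device */
    bool   has_double;   /* true if double precision is supported */
};

/* Hypothetical constraints attached to a codelet. */
struct toy_constraints {
    size_t   min_memory;   /* minimum free memory required, in bytes */
    bool     needs_double; /* true if the kernel uses double precision */
    unsigned worker_mask;  /* bit i set: worker i is allowed */
};

/* Return true if the worker with index worker_id satisfies every
 * constraint. Assumption: at most 32 workers, since the allowed subset
 * is stored in a 32-bit mask. */
static bool toy_can_execute(const struct toy_constraints *c,
                            const struct toy_worker *w,
                            unsigned worker_id)
{
    if (worker_id >= 32 || !(c->worker_mask & (1u << worker_id)))
        return false;
    if (w->free_memory < c->min_memory)
        return false;
    if (c->needs_double && !w->has_double)
        return false;
    return true;
}
```

A scheduler would call such a predicate for each candidate worker before placing the task, instead of testing a single architecture flag.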
3) Kernel compilation & scheduling
OpenCL kernels may not be portable.
StarPU assumes that every OpenCL kernel can be executed on any OpenCL device.
Current implementation: we suppose that every OpenCL kernel can be executed on every OpenCL device, even if it's wrong.
[..]
What is left to do:
We could implement some starpu_data_cpy_{to,from}_interface functions which take a handle and copy its content into or from an interface provided by the application (that is not attached to a handle). Would something like that be useful to you?
1) Manage data transfers from/to host memory and between buffers
Scheduling of these commands is easily done with events/triggers. However performing data transfers correctly is still bogus.
This implies modifications in the way StarPU manages data requests and data transfers. (cf "DataWizard" code)
int foo[1024];
struct vector_interface vector = {.nx = 1024, .ptr = foo, .elemsize = sizeof(int)};
starpu_data_cpy_to_interface(handle, &vector);
2) Kernel compilation and execution
That would certainly be part of the per-codelet constraints that I mentioned earlier.
2.1) OpenCL compilers can be slow (NVIDIA...). We may choose not to compile every kernel for every device (what is currently done).
2.2) Some kernels may not be executed on some devices.
Short term solution: we may try to compile kernels on devices. If compilation fails, we need a way to exclude failing devices from the list of devices on which the kernel may be scheduled.
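The short-term solution can be sketched as a bitmask of devices on which compilation succeeded. This is only an illustration in plain C: `toy_compile_fn`, `toy_build_device_mask` and `toy_fails_on_device_1` are hypothetical names, and the compile hook stands in for whatever the OpenCL driver would really do (for example calling clBuildProgram and checking for CL_SUCCESS).

```c
#include <stdbool.h>

/* Hypothetical compile hook: returns true if the kernel built
 * successfully for the given device index. */
typedef bool (*toy_compile_fn)(unsigned device);

/* Try to compile on each device and return a mask with bit d set for
 * every device d where compilation succeeded. Devices whose bit is
 * clear are excluded from scheduling for this kernel.
 * Assumption: at most 32 devices, since the result is a 32-bit mask. */
static unsigned toy_build_device_mask(unsigned ndevices,
                                      toy_compile_fn compile)
{
    unsigned mask = 0;
    for (unsigned d = 0; d < ndevices && d < 32; d++) {
        if (compile(d))
            mask |= 1u << d;
    }
    return mask;
}

/* Example hook for illustration: pretend device 1 rejects the kernel. */
static bool toy_fails_on_device_1(unsigned device)
{
    return device != 1;
}
```

With three devices and the example hook, devices 0 and 2 remain eligible. To address the slow-compiler concern in 2.1, the same mask could be filled lazily, compiling for a device only the first time the scheduler considers it.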
3.4) Thread-safety: check StarPU thread-safety.
You mean: check OpenCL thread-safety :) We should detect what is not thread-safe in OpenCL and add locks around these methods in the OpenCL driver (we did that for the init phase already).
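The "add locks around these methods" approach amounts to serializing each suspect call behind a mutex. A minimal sketch with POSIX threads follows; `toy_unsafe_call` is a hypothetical placeholder for any driver call that is not known to be thread-safe, and the other names are likewise invented for this example.

```c
#include <pthread.h>

/* One lock shared by all wrapped driver calls. */
static pthread_mutex_t toy_driver_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared state that concurrent, unprotected calls could corrupt. */
static int toy_calls = 0;

/* Hypothetical stand-in for a driver call that is not thread-safe. */
static int toy_unsafe_call(int x)
{
    toy_calls++;
    return x * 2;
}

/* Thread-safe wrapper: only one thread at a time runs the unsafe call. */
static int toy_locked_call(int x)
{
    pthread_mutex_lock(&toy_driver_lock);
    int result = toy_unsafe_call(x);
    pthread_mutex_unlock(&toy_driver_lock);
    return result;
}
```

A single global lock is the simplest choice and matches a cautious first pass; once the unsafe calls are identified precisely, finer-grained locks would reduce contention.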
Best,
Cédric
- [Starpu-devel] SOCL, Sylvain HENRY, 09/11/2010
- <Possible follow-up(s)>
- Re: [Starpu-devel] SOCL, Cédric Augonnet, 14/11/2010
- Re: [Starpu-devel] SOCL, Sylvain HENRY, 15/11/2010
- Re: [Starpu-devel] SOCL, Denis Barthou, 17/11/2010
- Re: [Starpu-devel] SOCL, Sylvain HENRY, 17/11/2010
- Re: [Starpu-devel] SOCL, Denis Barthou, 18/11/2010
- Re: [Starpu-devel] SOCL, Sylvain HENRY, 18/11/2010
- Re: [Starpu-devel] SOCL, Denis Barthou, 18/11/2010
- Re: [Starpu-devel] SOCL, Sylvain HENRY, 18/11/2010