
starpu-devel - Re: [Starpu-devel] SOCL

Subject: Developers list for StarPU

  • From: Sylvain HENRY <sylvain.henry@inria.fr>
  • To: Denis Barthou <denis.barthou@inria.fr>
  • Cc: Emmanuel Jeannot <emmanuel.jeannot@labri.fr>, raymond.namyst@inria.fr, starpu-devel@lists.gforge.inria.fr, alexandre.denis@inria.fr
  • Subject: Re: [Starpu-devel] SOCL
  • Date: Thu, 18 Nov 2010 20:39:38 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

On 18/11/2010 19:32, Denis Barthou wrote:
But what I meant by my previous question was this: instead of choosing a target device for your copy a priori (before the StarPU decision), can you make a copy that does not depend on where the data is copied to/from? Once StarPU has decided where to map this copy task, it will perform the copy to/from the device it is mapped to (after the StarPU decision). It means that the copy task will first look up the device where it is executed and then do the copy. This would leave StarPU responsible for where to map the copy task (and therefore where to perform the copy).
This is what is currently done. The copy task performs a memcpy or a clEnqueueWriteBuffer depending on where StarPU schedules it.
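To make the point concrete, here is a minimal Python sketch of a copy task that picks its transfer primitive only after the scheduler has chosen a worker. It is not SOCL or StarPU code: `WorkerKind`, `host_memcpy`, `device_write` and `copy_task` are hypothetical names, and the "device" is modelled as a plain `bytearray` so that the example runs without OpenCL.

```python
from enum import Enum


class WorkerKind(Enum):
    """Hypothetical tag for the kind of worker the scheduler picked."""
    CPU = "cpu"
    OPENCL = "opencl"


def host_memcpy(dst, src):
    """Stand-in for memcpy: plain host-to-host byte copy."""
    dst[:len(src)] = src
    return "memcpy"


def device_write(dst, src):
    """Stand-in for clEnqueueWriteBuffer; the 'device' is just a bytearray here."""
    dst[:len(src)] = src
    return "clEnqueueWriteBuffer"


def copy_task(worker_kind, dst, src):
    """Copy task body: the primitive depends on where the task was scheduled."""
    if worker_kind is WorkerKind.CPU:
        return host_memcpy(dst, src)
    if worker_kind is WorkerKind.OPENCL:
        return device_write(dst, src)
    raise ValueError(f"unsupported worker kind: {worker_kind}")
```

The caller never names a target device; it only submits the task, and the branch taken inside `copy_task` reflects the scheduler's choice.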
As the copy depends on the architecture where it is mapped, it could also check whether the data is already available on the device, potentially performing a no-op. Basically, copies are just explicit "prefetches".
Data *cannot* already be on the device (i.e. the StarPU virtual device): what we are given is just a pointer to host memory (not managed by StarPU). We have to perform a copy at some point to put it in StarPU-managed virtual memory.

The only case (that I can think of) where we might avoid a data transfer is an asynchronous "write" of the whole buffer. A temporary StarPU buffer could be registered in place. This requires that no blocking "wait" is performed on the event returned by the copy (or on any event of a command depending on it) before the buffer is released or before another copy of this kind is performed. If we detect a "wait" or a clGetEventInfo(EXECUTION_STATUS), we would have to wait for the tasks that use the temporary buffer and then perform the copy (to avoid deadlocks).


Now, if this is possible, due to the affinity between this copy task and the (possible) computational tasks that depend on it, they will be mapped to the same device. Moreover, the choice of the target device (depending on the scheduling algorithm in StarPU) may depend on the latency of the computational task plus the latency of the copy.

 
To select where to copy, maybe we could give a mark to each device depending on the number of tasks already requiring this buffer (if any) and on the buffers already available for this task on the device.

The following could be an algorithm to select the target device:


I don't think it's a good idea, since it does not depend on performance modeling of the tasks. At some point, even if the data is not available on some device, it may still be more beneficial to map the tasks that depend on it there for performance reasons (not to mention load balancing).

Right. We could get the best device for each task using the enabled scheduling policy. The algorithm is then: which device appears most often in { for all tasks requiring the buffer, get the device on which they would be scheduled using the current policy }
Anyway, it was just an example. ;-)
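
The "most present device" rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: `pick_copy_target` and `predict_device` are hypothetical names, and `predict_device` stands for whatever query the enabled scheduling policy would offer to say where a task would run; no such StarPU call is implied.

```python
from collections import Counter


def pick_copy_target(tasks, predict_device):
    """Return the device most often predicted for the tasks needing the buffer.

    `predict_device` is a hypothetical callback mapping a task to the device
    the current scheduling policy would choose for it.
    """
    votes = Counter(predict_device(task) for task in tasks)
    if not votes:
        return None  # no task needs the buffer, so there is nothing to decide
    # On a tie, Counter.most_common keeps the device that was counted first.
    device, _count = votes.most_common(1)[0]
    return device
```

For example, with the made-up predictions `{"gemm": "gpu0", "scale": "gpu0", "reduce": "cpu"}`, calling `pick_copy_target(predictions, predictions.get)` selects `"gpu0"`, since two of the three tasks would run there.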

Cheers
Sylvain


