Subject: Developers list for StarPU
List archives
- From: Mawussi Zounon <mawussi.zounon@manchester.ac.uk>
- To: Samuel Thibault <samuel.thibault@inria.fr>
- Cc: "starpu-devel@lists.gforge.inria.fr" <starpu-devel@lists.gforge.inria.fr>
- Subject: Re: [Starpu-devel] Task re-execution with StarPU
- Date: Wed, 2 Aug 2017 18:59:39 +0000
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hi Samuel,
Thanks for the quick reply.
My question was not precise, but you managed to understand my problem
and answered with a good example. I will read more about tags and see how
to exploit them. Once again, thanks; the response is very helpful.
Best regards,
--Mawussi
________________________________________
From: Samuel Thibault [samuel.thibault@inria.fr]
Sent: Wednesday, August 02, 2017 5:59 PM
To: Mawussi Zounon
Cc: starpu-devel@lists.gforge.inria.fr; Olivier Aumage
Subject: Re: Task re-execution with StarPU
Hello,
Mawussi Zounon, on Wed. 02 Aug 2017 13:04:31 +0000, wrote:
> During the execution, halfway through the DAG, a task may be corrupted,
> i.e., produce an invalid result for some reason.
> 1. With StarPU, how can a node traverse part of the DAG backward?
It cannot: StarPU does not keep a record of completed tasks, since that
wouldn't scale. It's up to the application to go back in its submission
loops.
Okay.
> 2. Is it possible for a node to insert subtasks during the computation?
A node can submit whatever tasks it wants during the execution :)
One thing which is not simple, however, is inserting a task *within* an
already-submitted DAG. Appending tasks is not a problem, but stuffing
tasks in the middle of a string of tasks modifying a piece of data, for
instance, cannot be done trivially. For this, one has to use StarPU
tags, which act as rendezvous points. This is also needed because
whatever tasks you submit will depend on the last submitted tasks which
have touched the data, and those are probably much later in your DAG. That
is not what you want: you want the task to start immediately, to
replace the bogus task. So you need to use tags to detach that part of
the DAG, and be able to plug your tasks in there, independently of the
rest of the DAG.
I guess (I don't have enough details on your exact use case to be sure)
a way would be something like this:
A->Bperhapsbogus->C
where C uses a tagC which depends on a tagCorrect. If Bperhapsbogus
computed correctly, you release tagCorrect. Otherwise, you submit another
Bperhapsbogus task with sequential_consistency set to 0 (to avoid
it getting appended after the DAG); perhaps that one will release
tagCorrect, and otherwise you submit yet another, etc.
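The control flow described above can be sketched without StarPU at all. The
block below is only a plain-C model, not StarPU code: gate_t stands in for
tagCorrect, run_b for Bperhapsbogus, and run_c for C, and all of these names
are hypothetical. In StarPU itself the gate would presumably be a tag (see
starpu_tag_declare_deps and starpu_tag_notify_from_apps in the manual), and
each retry would be a newly submitted task with sequential_consistency set
to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for tagCorrect: C may start only once released. */
typedef struct {
    bool released;
} gate_t;

/* Hypothetical state for task B. The first `failures` attempts are
 * deliberately bogus so that the retry path gets exercised. */
typedef struct {
    int failures;   /* number of attempts that should fail */
    int attempts;   /* number of attempts made so far */
    double result;  /* output of the last attempt */
} b_state_t;

/* One execution of B; returns true when the result is valid. */
bool run_b(b_state_t *s, double input)
{
    assert(s != NULL);
    s->attempts++;
    if (s->attempts <= s->failures) {
        s->result = -1.0;          /* simulated corrupted output */
        return false;
    }
    s->result = input * 2.0;
    return true;
}

/* Resubmit B until it succeeds or max_attempts is reached.
 * The gate is released only after a valid result. */
bool run_b_until_correct(b_state_t *s, double input, int max_attempts,
                         gate_t *tag_correct)
{
    assert(tag_correct != NULL);
    for (int i = 0; i < max_attempts; i++) {
        if (run_b(s, input)) {
            tag_correct->released = true;
            return true;
        }
    }
    return false;
}

/* C refuses to run while the gate is closed. */
bool run_c(const gate_t *tag_correct, const b_state_t *s, double *out)
{
    assert(tag_correct != NULL && s != NULL && out != NULL);
    if (!tag_correct->released)
        return false;
    *out = s->result + 1.0;
    return true;
}
```

The point of the model is that C never observes a bogus result from B: the
only path that opens the gate is a successful attempt.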
Concerning having to go back in the DAG, as I said you have to do it in
the application, basically up to the first read-only dependency. You
might want to introduce a task whose only purpose is to copy
the input into a piece of data which will be written to by B, so you have a copy
from which to restart if B turns out bogus.
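As a rough illustration of the copy idea, here is a plain-C model, again not
StarPU code. kernel_b and run_b_on_copy are hypothetical names, and the fault
is simulated by leaving the buffer half-updated. In a real StarPU application
the memcpy would be a task of its own (starpu_data_cpy may be one way to get
it), and the retry loop would live in the application's submission code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-place kernel for B: doubles every entry. When `corrupt`
 * is set it simulates a fault by updating only the first half of the
 * buffer. Returns true when the result is valid. */
bool kernel_b(double *buf, size_t n, bool corrupt)
{
    assert(buf != NULL);
    size_t limit = corrupt ? n / 2 : n;
    for (size_t i = 0; i < limit; i++)
        buf[i] *= 2.0;
    return !corrupt;
}

/* B never writes to `input`: each attempt first copies the input into
 * `work` (the "copy task") and then runs B on that copy, so a bogus
 * attempt can simply be thrown away and restarted from the input.
 * `faults` is the number of initial attempts that are made to fail.
 * Returns the attempt number that succeeded, or -1 if none did. */
int run_b_on_copy(const double *input, double *work, size_t n,
                  int faults, int max_attempts)
{
    assert(input != NULL && work != NULL);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        memcpy(work, input, n * sizeof *work);
        if (kernel_b(work, n, attempt <= faults))
            return attempt;
    }
    return -1;
}
```

Keeping the input read-only is what bounds how far back the application has
to go: the restart point is the copy, not anything earlier in the DAG.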
> 3. How to inform the neighbouring nodes on the new subtasks?
I don't think you want to go that way. The principle of StarPU
distributed on MPI is that there is no exchange of information on the
DAG, to make things simple and scalable.
And I don't think you actually need to inform the neighbouring nodes
about the new subtasks: you can just introduce an intermediate cl==NULL
task, which does nothing but provide the well-known synchronization
point between the MPI nodes.
> Please feel free to send me any useful references or links.
Well, to my knowledge you are the first one trying to do things like
this :)
(about tags, there is of course the StarPU manual, and examples in the
tree).
Samuel