Subject: Developers list for StarPU
List archives
- From: Xavier Lacoste <xl64100@gmail.com>
- To: Samuel Thibault <samuel.thibault@ens-lyon.org>
- Cc: starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] Assert : Number of copy requests left is not zero
- Date: Tue, 4 Nov 2014 11:18:01 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
On 4 Nov 2014, at 11:11, Samuel Thibault <samuel.thibault@ens-lyon.org>
wrote:
> Xavier Lacoste, on Tue 04 Nov 2014 10:49:52 +0100, wrote:
>>>> [starpu][_starpu_mpi_early_data_check_termination][assert failure]
>>>> Number of copy requests left is not zero
>>>>
>>>> Do you have any idea what could cause this assert?
>>>
>>> I have added a note in the message: did you forget to post a receive
>>> corresponding to a send?
>>
>> Hmm I'll have a look at it.
>> Could it be that I flush a piece of data earlier than I should?
>
> Ah, I'm realizing: you are not using starpu_mpi_send, starpu_mpi_recv
> and the like explicitly, and always rely on the communications implicitly
> generated by starpu_mpi_task_insert?
Yes, indeed.
>
> Normally, if all MPI nodes are doing exactly the same submission loop,
> flushing data earlier is not a problem: the node needing it later will
> receive it again, and the node owning it will know that the other node
> will need it later, and will thus send it again.
I'm not doing the same submission loop on all MPI nodes.
I'm inserting only tasks that use local data (as I tried to explain in the
algorithms in my previous mail).
So a mistake on my side is possible here.
> Perhaps there is a
> bug in StarPU-MPI, but we already test this scenario, so I'd rather
> first make sure that the application is really running the same
> submission loop (perhaps you made mistakes while pruning the
> submission, and thus the node owning the data doesn't know it has to
> send it again, or conversely).
Yes, I'll check my submission loops.
>
> Samuel
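
For illustration, the pruning mistake discussed above can be modelled without StarPU at all. The sketch below is a toy model, not StarPU-MPI code: the names `toy_task`, `toy_unmatched_transfers` and the three `submit_*` rules are hypothetical, and it assumes that a rank posts a send or a receive for a task only if it actually submits that task. It ignores caching and flushing entirely and only counts sends and receives that have no counterpart on the other rank.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_RANKS 8

/* Toy task: runs on exec_rank and reads one piece of data owned by data_owner. */
struct toy_task {
    int exec_rank;
    int data_owner;
};

/* A submission rule decides whether `rank` submits task `t` in its own loop. */
typedef int (*toy_submit_rule)(int rank, const struct toy_task *t);

/* Every rank runs the full submission loop. */
int submit_all(int rank, const struct toy_task *t)
{
    (void)rank;
    (void)t;
    return 1;
}

/* Pruned loop: a rank submits only tasks whose input data it owns. */
int submit_if_input_local(int rank, const struct toy_task *t)
{
    return t->data_owner == rank;
}

/* Pruned loop: a rank submits every task it executes or whose input it owns. */
int submit_if_involved(int rank, const struct toy_task *t)
{
    return t->data_owner == rank || t->exec_rank == rank;
}

/* Count sends without a matching receive, plus receives without a matching send. */
int toy_unmatched_transfers(const struct toy_task *tasks, int ntasks,
                            int nranks, toy_submit_rule rule)
{
    int sends[TOY_MAX_RANKS][TOY_MAX_RANKS] = {{0}};
    int recvs[TOY_MAX_RANKS][TOY_MAX_RANKS] = {{0}};
    int unmatched = 0;

    assert(nranks > 0 && nranks <= TOY_MAX_RANKS);

    for (int rank = 0; rank < nranks; rank++) {
        for (int i = 0; i < ntasks; i++) {
            const struct toy_task *t = &tasks[i];

            /* Local data needs no transfer; unsubmitted tasks post nothing. */
            if (t->exec_rank == t->data_owner || !rule(rank, t))
                continue;
            if (rank == t->data_owner)
                sends[t->data_owner][t->exec_rank]++;   /* owner posts a send */
            if (rank == t->exec_rank)
                recvs[t->data_owner][t->exec_rank]++;   /* executor posts a receive */
        }
    }

    for (int src = 0; src < nranks; src++)
        for (int dst = 0; dst < nranks; dst++)
            unmatched += abs(sends[src][dst] - recvs[src][dst]);

    return unmatched;
}
```

In this model, `submit_all` and `submit_if_involved` leave nothing unmatched, whereas `submit_if_input_local` makes the owner send data that the executing rank never asks for. Whether the real application prunes in this way is exactly what remains to be checked.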