
starpu-devel - Re: [Starpu-devel] StarPU on multiple communicator

Subject: Developers list for StarPU

  • From: Xavier Lacoste <xavier.lacoste@inria.fr>
  • To: Nathalie Furmento <nathalie.furmento@labri.fr>
  • Cc: Mathieu Faverge <Mathieu.Faverge@inria.fr>, starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] StarPU on multiple communicator
  • Date: Tue, 27 Jan 2015 17:07:28 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Sorry, I didn't see your answer.

Yes, this is what I want to do.

Could I see your code, so I can try to understand what is wrong in mine?

Cheers,

XL


On 26 Jan 2015, at 11:59, Nathalie Furmento <nathalie.furmento@labri.fr>
wrote:

> I also tested calling starpu_mpi_insert_task. Again, it works fine with
> StarPU 1.1 but fails with the trunk.
>
> The data is owned by node 0 of each sub-communicator. The call
>
> starpu_mpi_insert_task(newcomm, &mycodelet,
>                        STARPU_RW, data,
>                        STARPU_EXECUTE_ON_NODE, 1,
>                        0);
>
> results in the execution of the codelet on the processes having rank 1 in
> each sub-communicator, which is, if I understood correctly, what you would
> like to have in your application.
>
> Cheers,
>
> Nathalie
>
> On 26/01/2015 11:35, Nathalie Furmento wrote:
>> Hello,
>>
>> I tried a simple application which splits a group of 4 MPI processes into
>> 2 communicators, with a communication between rank 0 and rank 1 of each
>> new sub-communicator.
>>
>> This works fine with the 1.1 branch of StarPU. However, it fails with the
>> trunk version of StarPU: its new communication engine assumes
>> MPI_COMM_WORLD to be the communicator for all communications. I am going
>> to have a look at how it can be fixed. I will keep you informed.
>>
>> Cheers,
>>
>> Nathalie
>>
>> On Jan 23, 16:40, Xavier Lacoste wrote:
>>> Hello,
>>>
>>> I am using PaStiX in Jorek (a tokamak simulation code).
>>>
>>> Jorek uses sub-communicators to factorize different matrices on different
>>> communicators (i.e. P0 and P1 factorize mat1 while P2 and P3 factorize
>>> mat2). The two matrices have exactly the same shape, hence identically
>>> shaped block columns (my data unit in StarPU), identified with the same
>>> tags, but these data are in fact different as they belong to different
>>> matrices (mat1 and mat2).
>>>
>>> As far as I understand, this cannot work in StarPU today (I saw Nathalie,
>>> who agrees with me on that ;) ).
>>>
>>> P0 and P2 (resp. P1 and P3) will register their data with exactly the
>>> same tag. Thus we have two different pieces of data with the same tag,
>>> which could be a problem.
>>>
>>> Another issue is that the rank of the data is set using the
>>> sub-communicator rank, which may not be what StarPU expects.
>>>
>>> All this results in StarPU hanging in a deadlock, certainly waiting for
>>> communications...
>>>
>>> I hope I am not too confusing in my explanations.
>>>
>>> I'll try a hack in my code to fix that: use the global rank when
>>> registering data, retrieve the number of communicators (fortunately, my
>>> communicators all have the same size here) and add an offset to the tags.
>>>
>>> I think supporting communicators in StarPU would be a great feature.
>>>
>>> Regards,
>>>
>>> XL.
>>>
>>>
>>> ----------------------------------------
>>> Xavier Lacoste
>>> xavier.lacoste@inria.fr
>>> INRIA Bordeaux Sud-Ouest
>>> 200, avenue de la Vieille Tour
>>> 33405 Talence Cedex
>>> Tel: +33 (0)5 24 57 40 69
>>>
>>
>




