- From: Xavier Lacoste <xavier.lacoste@inria.fr>
- To: Nathalie Furmento <nathalie.furmento@labri.fr>
- Cc: Mathieu Faverge <Mathieu.Faverge@inria.fr>, starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] StarPU on multiple communicator
- Date: Mon, 2 Feb 2015 15:13:07 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hello,
Thanks, it worked with my simple example in PaStiX, factorizing two simple
matrices (Laplacian with n=1000) on two communicators.
But inside Jorek, I still trigger the watchdog after waiting more than 10.0
seconds (my tasks should be shorter than that).
I can get a stack trace with DDT, but I don't know which information in it is
relevant.
Maybe I shouldn't use the watchdog? I'll try the simple example with more
complex matrices to try to reproduce it.
Cheers,
XL.
On 30 Jan 2015, at 13:51, Nathalie Furmento <nathalie.furmento@labri.fr>
wrote:
> The functionality should now be fully supported in the branch 1.1.
>
> There is a small test in mpi/tests/comm.c
>
> Please let me know if it works for you.
>
> Cheers,
>
> Nathalie
>
> On 27/01/2015 17:07, Xavier Lacoste wrote:
>> Sorry,
>>
>> I didn't see your answer,
>>
>> Yes this is what I want to do.
>>
>> Can I see your code to try to understand what is wrong in mine?
>>
>> Cheers,
>>
>> XL
>>
>>
>> On 26 Jan 2015, at 11:59, Nathalie Furmento <nathalie.furmento@labri.fr>
>> wrote:
>>
>>> I also tested with calling starpu_mpi_insert_task. Again it works fine
>>> with StarPU 1.1 but fails with the trunk.
>>>
>>> The data is owned by the node 0 of each sub communicator. The call
>>>
>>>     starpu_mpi_insert_task(newcomm, &mycodelet,
>>>                            STARPU_RW, data,
>>>                            STARPU_EXECUTE_ON_NODE, 1,
>>>                            0);
>>>
>>> results in the execution of the codelet on the processes having rank 1
>>> in each sub-communicator, which is, if I understood correctly, what you
>>> would like to have in your application.
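To make "rank 1 in each sub-communicator" concrete, here is a minimal sketch in plain C, with no MPI or StarPU calls. It assumes that consecutive world ranks are grouped into sub-communicators of equal size, as in the P0/P1 and P2/P3 grouping of the original message; the actual split used in the test may differ. The names `sub_position`, `locate` and `runs_task` are hypothetical and are not part of StarPU.

```c
#include <assert.h>

/* Hypothetical layout: consecutive world ranks are grouped into
 * sub-communicators of comm_size processes each. */
struct sub_position {
    int color;    /* index of the sub-communicator */
    int sub_rank; /* rank inside that sub-communicator */
};

/* Map a world rank to its sub-communicator and its rank inside it. */
static struct sub_position locate(int world_rank, int comm_size)
{
    struct sub_position p;
    assert(world_rank >= 0 && comm_size > 0);
    p.color = world_rank / comm_size;
    p.sub_rank = world_rank % comm_size;
    return p;
}

/* Returns 1 if, under the assumed layout, this process is the one
 * designated by STARPU_EXECUTE_ON_NODE = node in its sub-communicator. */
static int runs_task(int world_rank, int comm_size, int node)
{
    return locate(world_rank, comm_size).sub_rank == node;
}
```

Under this assumed layout, with 4 processes and sub-communicators of size 2, `runs_task(w, 2, 1)` is true for world ranks 1 and 3 only, one process per sub-communicator.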
>>>
>>> Cheers,
>>>
>>> Nathalie
>>>
>>> On 26/01/2015 11:35, Nathalie Furmento wrote:
>>>> Hello,
>>>>
>>>> I tried a simple application which splits a group of 4 MPI processes
>>>> into 2 communicators,
>>>> with a communication made between rank 0 and rank 1 of each
>>>> new sub-communicator.
>>>>
>>>> This works fine with the branch 1.1 of StarPU. However it fails with the
>>>> trunk version of
>>>> StarPU. Its new communication engine assumes MPI_COMM_WORLD to be the
>>>> communicator for all
>>>> communications. I am going to have a look at how it can be fixed. I will
>>>> keep you informed.
>>>>
>>>> Cheers,
>>>>
>>>> Nathalie
>>>>
>>>> On Jan 23, 16:40, Xavier Lacoste wrote:
>>>>> Hello,
>>>>>
>>>>> I am using PaStiX in Jorek (A Tokamak simulation code).
>>>>>
>>>>> Jorek uses sub-communicators to factorize different matrices on
>>>>> different communicators (i.e. P0 and P1 factorize mat1 while P2 and P3
>>>>> factorize mat2). The two matrices have exactly the same shape, so the
>>>>> same shape block columns (my data unit in starpu), identified with the
>>>>> same tags, but these data are in fact different as they belong to
>>>>> different matrices (mat1 and mat2).
>>>>>
>>>>> As far as I understand, this cannot work in StarPU today (I saw
>>>>> Nathalie, who agrees with me on that ;) ).
>>>>>
>>>>> P0 and P2 (resp. P1 and P3) will register their data with exactly the
>>>>> same TAG. Thus we have two different data with the same TAG which could
>>>>> be a problem.
>>>>>
>>>>> Another issue is that the rank of the data is set using the
>>>>> sub-communicator rank, which may not be what StarPU expects.
>>>>>
>>>>> All this results in StarPU hanging in a deadlock, certainly waiting for
>>>>> communications...
>>>>>
>>>>> I hope I am not too confusing in my explanations.
>>>>>
>>>>> I'll try a hack in my code to fix that: use the global rank when
>>>>> registering data, retrieve the number of communicators (fortunately,
>>>>> my communicators all have the same size here), and add an offset to
>>>>> the tags.
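The tag-offset part of this workaround could look roughly like the following sketch, in plain C with no StarPU or MPI calls. It assumes that every communicator uses local tags in the range [0, tags_per_comm) and that each communicator has an index (`color`); the names `unique_tag`, `color` and `tags_per_comm` are hypothetical and do not come from PaStiX or StarPU.

```c
#include <assert.h>

/* Hypothetical helper: make a data tag unique across sub-communicators
 * by shifting it by the index ("color") of the communicator.
 * Assumes every communicator uses local tags in [0, tags_per_comm). */
static long unique_tag(int color, long local_tag, long tags_per_comm)
{
    assert(color >= 0);
    assert(local_tag >= 0 && local_tag < tags_per_comm);
    return (long)color * tags_per_comm + local_tag;
}
```

With an assumed 1000 tags per communicator, block column 42 of mat1 (communicator 0) keeps tag 42 while the same block column of mat2 (communicator 1) gets tag 1042, so the two data no longer share a tag.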
>>>>>
>>>>> I think supporting communicators in StarPU would be a great feature.
>>>>>
>>>>> Regards,
>>>>>
>>>>> XL.
>>>>>
>>>>>
>>>>> ----------------------------------------
>>>>> Xavier Lacoste
>>>>> xavier.lacoste@inria.fr
>>>>> INRIA Bordeaux Sud-Ouest
>>>>> 200, avenue de la Vieille Tour
>>>>> 33405 Talence Cedex
>>>>> Tel: +33 (0)5 24 57 40 69
>>>>>
>