
starpu-devel - Re: [Starpu-devel] Strange behaviour using GPUs


Re: [Starpu-devel] Strange behaviour using GPUs


  • From: Xavier Lacoste <xavier.lacoste@inria.fr>
  • To: Nathalie Furmento <nathalie.furmento@labri.fr>
  • Cc: Mathieu Faverge <Mathieu.Faverge@inria.fr>, starpu-devel@lists.gforge.inria.fr, Pierre Ramet <ramet@labri.fr>
  • Subject: Re: [Starpu-devel] Strange behaviour using GPUs
  • Date: Wed, 10 Jul 2013 09:43:45 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

Indeed, the behaviour is much better with r10552. I still lose some time when
I use 9 CPUs + 3 GPUs instead of 10 + 2, but not too much (84.2 s vs 62.4 s,
whereas I had 750 s before)...

Thanks,

XL.


On 9 July 2013 at 15:54, Xavier Lacoste wrote:

> Thanks Nathalie,
>
> I'll try this when I happen to be scheduled on a GPU node... (the plafrim
> scheduler is being quite strange today...)
>
> XL.
>
> On 9 July 2013 at 14:38, Nathalie Furmento wrote:
>
>> Xavier,
>>
>> There was indeed a bug in 1.1.0rc1 which turned the dmda queues into LIFOs.
>> This has been fixed in the branch. We actually found out about it when we saw
>> MAGMA performance drop.
>>
>> Could you please try it and let us know if that also fixes your problem? I am
>> planning to release 1.1.0rc2 tomorrow, but you can also try right away by
>> checking out the svn repository.
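
For reference, dmda is selected when StarPU is initialised; below is a minimal
sketch under the assumption of a standalone test program (nothing here is taken
from PaStiX itself), equivalent to setting STARPU_SCHED=dmda in the environment:

    #include <starpu.h>

    int main(void)
    {
        struct starpu_conf conf;
        starpu_conf_init(&conf);
        conf.sched_policy_name = "dmda";   /* the policy affected by the LIFO bug */

        if (starpu_init(&conf) != 0)
            return 1;                      /* no worker available */

        /* ... register data and submit tasks here ... */

        starpu_shutdown();
        return 0;
    }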
>>
>> Thanks,
>>
>> Nathalie
>>
>> On 09/07/2013 14:19, Xavier Lacoste wrote:
>>> Hello,
>>>
>>> I tried to compare ParSEC and StarPU on PaStiX using full machines (mirage).
>>> We manage to get good scaling with ParSEC by statically scheduling tasks on
>>> the GPUs.
>>> So I tried to reproduce the same scheduling with StarPU 1.1.0rc1 and observed
>>> strange behaviour: my GPUs spend all their time in FetchingInput and
>>> PushingOutput.
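
For reference, a StarPU task can be pinned to a given worker to approximate this
kind of static GPU placement; a minimal sketch, where the codelet and data handle
are placeholders rather than the actual PaStiX ones:

    #include <starpu.h>

    /* Sketch: submit one task pinned to the first CUDA worker, mimicking a
     * static GPU placement.  "cl" and "handle" stand for an existing codelet
     * and registered data handle of the application. */
    void submit_pinned_to_gpu(struct starpu_codelet *cl, starpu_data_handle_t handle)
    {
        int workerids[STARPU_NMAXWORKERS];
        unsigned n = starpu_worker_get_ids_by_type(STARPU_CUDA_WORKER,
                                                   workerids, STARPU_NMAXWORKERS);
        if (n == 0)
            return;                              /* no CUDA worker available */

        struct starpu_task *task = starpu_task_create();
        task->cl = cl;
        task->handles[0] = handle;
        task->execute_on_a_specific_worker = 1;  /* bypass the scheduler */
        task->workerid = workerids[0];           /* force this CUDA worker */
        starpu_task_submit(task);
    }

With execute_on_a_specific_worker set, the scheduler no longer chooses the worker
for that task, so the transfer-time estimates dmda would normally use for placement
do not apply to it.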
>>>
>>> The output trace can be seen here:
>>> http://img401.imageshack.us/img401/5213/1pb.png
>>> Or, on the plafrim cluster:
>>> 9 CPUs + 3 GPUs:
>>> /lustre/lacoste/rsync/log_mirage_AUDI_dmda+LLT+flop+cmin20+frat8+distFlop+selfCopy+fxt_9_3_4231_20130709_120332_crit2_0.5GPUmem.trace
>>> 10 + 2:
>>> /lustre/lacoste/rsync/log_mirage_AUDI_dmda+LLT+flop+cmin20+frat8+distFlop+selfCopy+fxt_10_2_4231_20130709_120332_crit2_0.5GPUmem.trace
>>> 11 + 1:
>>> /lustre/lacoste/rsync/log_mirage_AUDI_dmda+LLT+flop+cmin20+frat8+distFlop+selfCopy+fxt_11_1_4231_20130709_120332_crit2_0.5GPUmem.trace
>>> 12 + 0:
>>> /lustre/lacoste/rsync/log_mirage_AUDI_dmda+LLT+flop+cmin20+frat8+distFlop+selfCopy+fxt_12_0_4231_20130709_120332_crit2_0.5GPUmem.trace
>>>
>>> The results are good with 12 CPUs + 0 GPUs; adding 1 or 2 GPUs brings no
>>> loss, but no gain either, and adding three slows down the run time by a
>>> factor of 10.
>>>
>>> Do you have any idea what could explain this behaviour?
>>>
>>> With dynamic scheduling, the GPUs are useful with a small number of CPUs but
>>> not with the whole machine.
>>> (My cost model may not be accurate enough for the moment...
>>> http://img515.imageshack.us/img515/2348/njz.png)
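
For reference, the cost model dmda relies on is a per-codelet performance model;
a minimal sketch of a history-based model, with placeholder kernel and symbol
names rather than the real PaStiX ones:

    #include <starpu.h>

    static void gemm_cpu(void *buffers[], void *cl_arg)  { (void)buffers; (void)cl_arg; /* CPU kernel */ }
    static void gemm_cuda(void *buffers[], void *cl_arg) { (void)buffers; (void)cl_arg; /* CUDA kernel */ }

    /* History-based model: calibrated from the durations of previous runs,
     * and used by dm/dmda to estimate task execution times per worker. */
    static struct starpu_perfmodel gemm_model =
    {
        .type = STARPU_HISTORY_BASED,
        .symbol = "pastix_gemm",        /* placeholder symbol name */
    };

    static struct starpu_codelet gemm_cl =
    {
        .cpu_funcs  = { gemm_cpu },
        .cuda_funcs = { gemm_cuda },
        .nbuffers   = 3,
        .modes      = { STARPU_R, STARPU_R, STARPU_RW },
        .model      = &gemm_model,      /* cost model consulted by dmda */
    };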
>>>
>>> Thanks,
>>>
>>> XL.
>>> _______________________________________________
>>> Starpu-devel mailing list
>>> Starpu-devel@lists.gforge.inria.fr
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/starpu-devel
>>
>
>






