
starpu-devel - Re: [Starpu-devel] automatic RAM allocation and CUDA worker issue

Subject: Developers list for StarPU

  • From: Olivier Aumage <olivier.aumage@inria.fr>
  • To: Kevin Juilly <kevin.juilly@eolen.com>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] automatic RAM allocation and CUDA worker issue
  • Date: Thu, 30 Nov 2017 11:33:17 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

Thanks for the bug report. Revisions r22589 for the "starpu-1.2/" branch and
r22590 for the "trunk/" branch should fix the problem.

Best regards,
--
Olivier

> On 13 Nov 2017, at 14:28, Kevin Juilly <kevin.juilly@eolen.com> wrote:
>
> Hello,
>
> On a node with a GPU, if a program asks for more memory than the GPU has,
> the data are registered with -1 as home_node, and all workers but the GPU
> worker are disabled, StarPU 1.2 aborts on an assertion (see assert.log),
> even if the memory needed by any single task is well under the size of the
> GPU memory.
>
> The problem doesn't seem to occur when STARPU_PREFETCH=0.
>
> A reproducer is attached. This code allocates a lot of square matrices and
> starts tasks on them. The tasks themselves do not do any work and only
> provide a CUDA implementation (cuda_func); their only purpose is to force
> memory management to occur on the GPU.
>
> The program takes two optional parameters:
> - the size of the matrices
> - the number of matrices
> When no arguments are given, it tries to allocate enough matrices to use
> 3 times the size of the GPU memory. The assertion does not trigger when too
> few matrices are allocated, even if their total size exceeds the size of the
> GPU memory.
>
> This reproduces the memory behaviour of the test case that triggered the
> bug.
>
> Also attached are an extract of config.log and the list of StarPU
> environment variables used.
>
>
> As a note, the same case produced incorrect behaviour with StarPU 1.1. In
> that case the worker seemed to be stuck in an infinite loop inside
> _starpu_fetch_task_input (calling a function to try to free memory). I
> haven't been able to reproduce it recently and have no idea why.
>
> Regards,
> Kevin Juilly
> AS+ groupe Eolen
> <assert.log> <alloc_assert.c> <extract_config.log> <env_repro.txt>
> _______________________________________________
> Starpu-devel mailing list
> Starpu-devel@lists.gforge.inria.fr
> https://lists.gforge.inria.fr/mailman/listinfo/starpu-devel
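
The default sizing described in the report (enough matrices to use 3 times
the size of the GPU memory) comes down to a ceiling division. Below is a
minimal sketch in C, assuming square matrices of fixed-size elements. The
helper names matrix_bytes and matrices_needed, and the element size passed
to them, are hypothetical and are not taken from the attached alloc_assert.c.

```c
#include <stdint.h>

/* Bytes occupied by one n-by-n matrix whose elements are elem_size bytes.
 * Hypothetical helper: the element type used by the real reproducer is
 * not stated in the report. */
static uint64_t matrix_bytes(uint64_t n, uint64_t elem_size)
{
    return n * n * elem_size;
}

/* Smallest number of matrices whose total size is at least
 * factor * gpu_bytes (ceiling division).
 * Returns 0 when a matrix would be empty, to avoid dividing by zero. */
static uint64_t matrices_needed(uint64_t gpu_bytes, uint64_t factor,
                                uint64_t n, uint64_t elem_size)
{
    uint64_t per_matrix = matrix_bytes(n, elem_size);
    if (per_matrix == 0)
        return 0;
    uint64_t target = gpu_bytes * factor;
    return (target + per_matrix - 1) / per_matrix;
}
```

For example, with a 4 GiB GPU, 1024 x 1024 matrices and 4-byte elements,
matrices_needed(4ULL << 30, 3, 1024, 4) yields 3072 matrices of 4 MiB each.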




