
starpu-devel - Re: [Starpu-devel] [trace interpretation]

Subject: Developers list for StarPU

  • From: Samuel Thibault <samuel.thibault@inria.fr>
  • To: Maxim Abalenkov <maxim.abalenkov@gmail.com>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] [trace interpretation]
  • Date: Mon, 6 Aug 2018 17:31:05 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
  • Organization: I am not organized

Hello,

Maxim Abalenkov, on Mon 06 Aug 2018 13:29:32 +0100, wrote:
> I have also run the code through Valgrind. Please see Valgrind's output
> attached as well.

This is already an issue:

==22260== Invalid read of size 1
==22260== at 0x562A160: starpu_variable_data_register (variable_interface.c:120)
==22260== by 0x4EDCCDC: plasma_zgetrf (zgetrf.c:186)

...

==22260== Address 0x1a74a0e0 is 0 bytes after a block of size 96 alloc'd
==22260== at 0x4C2CEDF: malloc (vg_replace_malloc.c:299)
==22260== by 0x4EDCB75: plasma_zgetrf (zgetrf.c:171)

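You are apparently not properly declaring the size of the variable: the
read triggered from zgetrf.c:186 lands just past the end of the 96-byte
block allocated at zgetrf.c:171, so the size passed to
starpu_variable_data_register() seems to be larger than the buffer.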

> I believe the problem might be with the reduction routine
> "core_starpu_dcabs1".
>
>     // set methods to define neutral elements, perform reduction operation
>     starpu_data_set_reduction_methods(
>             hp, &core_starpu_codelet_dcabs1_redux,
>             &core_starpu_codelet_dcabs1_init);
>
>     // @test reset pivot values
>     core_starpu_dcabs1_init(hp, sched_ctx, prio);
>
> Since I call the reduction function multiple times throughout the program I
> would also like to "reset/reinitialise" the pivot's value "hp". Therefore, I
> explicitly call the "_init" routine that relies on the same codelet
> "core_starpu_codelet_dcabs1_init". Is this a legitimate thing to do? What
> would be a recommended approach?

I have to say I don't think I properly understand what you mean here,
and I'm afraid I can only offer general statements which may be
completely off topic. You are not supposed to call functions directly
on data that has been registered with StarPU, since the data may already
have been moved to a GPU, modified there, etc. If you really want to
make such a call, you need to call starpu_data_acquire(hp, STARPU_W)
before and starpu_data_release(hp) after. That will, however, wait for
the tasks working on that data, which you should probably not do, so
it'd be better to just submit a task to do such reinitializations. You
can however reuse the same codelet structure as the reduction method.

> Another question I have is related to the application of contexts. In my
> programming scenario I dedicate a subset of threads called "mtpf" to perform
> tasks for the LU panel factorisation. The other larger group of threads is
> occupied with the other parts of the algorithm. Could you please take a
> look at my "skeleton" code to check that I'm using the contexts correctly?

As said in my other mail, I don't know contexts very well, so I can't
say for sure.

Samuel



