starpu-devel - [Starpu-devel] Performance decreasing by adding empty tasks

  • From: Xavier Lacoste <xavier.lacoste@inria.fr>
  • To: starpu-devel@lists.gforge.inria.fr
  • Subject: [Starpu-devel] Performance decreasing by adding empty tasks
  • Date: Wed, 08 Feb 2012 15:26:27 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>


Hello,

I'm working on a StarPU-based sparse factorization.

I have 2 types of tasks:
- (1) Factorize the diagonal block and update the off-diagonal blocks
of the column block.
- (2) Multiply the transpose of a block by that block and all the
blocks below it in the column block, and update the column block facing
the current block.

In order to use GPUs, I added a new SPARSE_GEMM operation that performs
all these products at once and updates the facing column block with the
correct offsets.

To use this kernel, I need to compute the offsets.
So I added a CPU task which computes these offsets (this can't be done on
GPUs).
There is one handle for each offset array, and each task (2) depends on
one of these offset-building tasks.
The offset-building tasks don't depend on any other task.
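
To make the role of the offset-building task more concrete, here is a minimal sketch in plain C of what such a computation could look like. It is not the actual solver code: the block layout (`block_t`, with a `panel_row` field giving where a block starts inside the dense panel of its column block) and the function name `build_offsets` are assumptions made for illustration only, and both block arrays are assumed to be sorted by row.

```c
/* Hypothetical layout (not taken from the original code): a block covers
 * rows [first_row, last_row], and its coefficients start at row
 * `panel_row` inside the dense panel of its column block. */
typedef struct {
    int first_row;
    int last_row;
    int panel_row;
} block_t;

/* For each source block, find the facing block that contains its rows and
 * store the row at which the update lands inside the facing panel.
 * Both arrays are assumed to be sorted by first_row.
 * Returns 0 on success, -1 if a source block is not fully contained in
 * one facing block. */
int build_offsets(const block_t *src, int nsrc,
                  const block_t *facing, int nfacing,
                  int *offsets)
{
    int j = 0;
    for (int i = 0; i < nsrc; i++) {
        /* Skip facing blocks that end before this source block starts. */
        while (j < nfacing && facing[j].last_row < src[i].first_row)
            j++;
        if (j == nfacing ||
            facing[j].first_row > src[i].first_row ||
            facing[j].last_row < src[i].last_row)
            return -1;
        offsets[i] = facing[j].panel_row
                   + (src[i].first_row - facing[j].first_row);
    }
    return 0;
}
```

For instance, if the facing column block holds blocks covering rows 10-15, 18-22 and 30-35 stored contiguously (so `panel_row` is 0, 6 and 11), a source block covering rows 20-21 gets offset 6 + (20 - 18) = 8. This kind of index search is cheap on a CPU, which is why it sits in its own CPU task ahead of each task (2).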

Simply adding these tasks makes the factorization roughly 3 to 4 times
slower (I perform 2 consecutive factorizations in my test):
shipsec_0.9.2_heft_8CPU_0CUDA:----- sopalin time 11.219737
shipsec_0.9.2_heft_8CPU_0CUDA:----- sopalin time 6.038163

shipsec_0.9.2_heft_8CPU_0CUDA_SPARSE:----- sopalin time 34.779351
shipsec_0.9.2_heft_8CPU_0CUDA_SPARSE:----- sopalin time 22.334903

The slowdown remains even if the added task is an empty function doing nothing:
void
starpu_init_strides(void * buffers[], void * _args)
{
}
shipsec_0.9.2_heft_8CPU_0CUDA_SPARSE_EMPTY:----- sopalin time 29.753910
shipsec_0.9.2_heft_8CPU_0CUDA_SPARSE_EMPTY:----- sopalin time 20.411781

I don't understand how I can lose so much time by adding these tasks.

I'm using version 0.9.2 of StarPU because I noticed a regression with
the trunk version:
shipsec_r5550_heft_8CPU_0CUDA:----- sopalin time 33.981822
shipsec_r5550_heft_8CPU_0CUDA:----- sopalin time 33.579352

Do you have any clue?

I also noticed on all my Gantt diagrams (even the best ones) that
I have one isolated task, then a large amount of "blocked" time, and then
my tasks. (I can send the Gantt diagrams.)

I hope my explanations are understandable,

Have a nice day,

XL.
