starpu-devel - Re: [Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU

Subject: Developers list for StarPU


From: Samuel Thibault <samuel.thibault@ens-lyon.org>
To: David Pereira <david_sape@hotmail.com>
Cc: "starpu-devel@lists.gforge.inria.fr" <starpu-devel@lists.gforge.inria.fr>
Subject: Re: [Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU
Date: Mon, 22 Sep 2014 12:10:52 +0200
List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

David Pereira, on Sat 20 Sep 2014 23:30:27 +0100, wrote:
> I found the reason of the unspecified launch failure. It turns out that it
> was from destroying a cusparse handle (twice).
> Is there a way to initialize cusparse across all GPU devices the same way
> we do with cublas (starpu_cublas_init)?

Yes, this is described in detail in the documentation, under “How To
Initialize A Computation Library Once For Each Worker?”

Basically you want to call starpu_execute_on_each_worker, which is
actually exactly what starpu_cublas_init does.
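
As a rough illustration of that approach, here is a minimal sketch of
per-worker cusparse handles created and destroyed through
starpu_execute_on_each_worker. The names cusparse_handles,
init_cusparse_func, shutdown_cusparse_func, cusparse_init_all and
cusparse_shutdown_all are made up for this example, and so is the
WITH_STARPU switch: without it, the block uses small stand-ins (not the
real StarPU or cusparse code) so that it compiles on a machine without a
GPU. The NULL check in the shutdown function is one possible way to avoid
destroying a handle twice.

```c
/* Sketch: one cuSPARSE handle per CUDA worker, created and destroyed once.
 * Build with -DWITH_STARPU (plus StarPU/CUDA flags) to use the real libraries. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef WITH_STARPU
#include <starpu.h>
#include <cusparse.h>
#else
/* ---- Stand-ins for illustration only; NOT the real implementations. ---- */
#define STARPU_NMAXWORKERS 4      /* arbitrary value for the stand-in */
#define STARPU_CUDA 0x1u          /* arbitrary value for the stand-in */
#define CUSPARSE_STATUS_SUCCESS 0
typedef int cusparseStatus_t;
struct fake_ctx { int unused; };
typedef struct fake_ctx *cusparseHandle_t;

static struct fake_ctx fake_ctxs[STARPU_NMAXWORKERS];
static int fake_current_worker;   /* worker the stand-in is "running on" */
static int fake_live_handles;     /* created minus destroyed handles */

static int starpu_worker_get_id(void) { return fake_current_worker; }

static cusparseStatus_t cusparseCreate(cusparseHandle_t *handle)
{
    *handle = &fake_ctxs[fake_current_worker];
    fake_live_handles++;
    return CUSPARSE_STATUS_SUCCESS;
}

static cusparseStatus_t cusparseDestroy(cusparseHandle_t handle)
{
    (void)handle;
    fake_live_handles--;
    return CUSPARSE_STATUS_SUCCESS;
}

/* The stand-in simply calls func once per pretend worker, in sequence. */
static void starpu_execute_on_each_worker(void (*func)(void *), void *arg,
                                          uint32_t where)
{
    (void)where;
    for (fake_current_worker = 0; fake_current_worker < STARPU_NMAXWORKERS;
         fake_current_worker++)
        func(arg);
    fake_current_worker = 0;
}
#endif

/* One handle slot per worker, indexed by worker id. */
static cusparseHandle_t cusparse_handles[STARPU_NMAXWORKERS];

/* Runs on each selected worker: create that worker's handle. */
static void init_cusparse_func(void *arg)
{
    (void)arg;
    int id = starpu_worker_get_id();
    cusparseStatus_t status = cusparseCreate(&cusparse_handles[id]);
    assert(status == CUSPARSE_STATUS_SUCCESS);
    (void)status;
}

/* Runs on each selected worker: destroy the handle at most once. */
static void shutdown_cusparse_func(void *arg)
{
    (void)arg;
    int id = starpu_worker_get_id();
    if (cusparse_handles[id] != NULL) {
        cusparseDestroy(cusparse_handles[id]);
        cusparse_handles[id] = NULL;  /* a second shutdown becomes a no-op */
    }
}

void cusparse_init_all(void)
{
    starpu_execute_on_each_worker(init_cusparse_func, NULL, STARPU_CUDA);
}

void cusparse_shutdown_all(void)
{
    starpu_execute_on_each_worker(shutdown_cusparse_func, NULL, STARPU_CUDA);
}
```

In an application, cusparse_init_all() would presumably be called once
after starpu_init(), and cusparse_shutdown_all() once before
starpu_shutdown(); a CUDA codelet would then pick up its own handle with
cusparse_handles[starpu_worker_get_id()].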

> > > Also, is it possible that a CUDA program runs faster when using StarPU
> > > (using one GPU) than running the program without the framework? The
> > > StarPU version has parallel tasks unlike the version without the
> > > framework (which does not use streams).
> >
> > StarPU implements a lot of optimisations such as overlapping data
> > transfers with computations, which are often not easy to code directly
> > in the application. The parallel implementation of tasks can also of
> > course considerably reduce the critical path in the task graph.
>
> How can overlapping data transfers with computations improve execution time
> of the StarPU version (using only 1 GPU) over running only in the GPU,
> since my algorithm runs completely in the GPU without transfers to the CPU?

Ah, then I don't think StarPU could gain much. I guess you could check
on the trace the details of what is happening, and compare with the
execution without using StarPU.

> Also, even with parallel implementation of tasks, if I only execute on a
> single GPU (without concurrent kernels), I think I'm not supposed to have
> better results with StarPU (which I have).

Yes.

Samuel

[Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU, David Pereira, 13/09/2014
- Re: [Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU, Samuel Thibault, 15/09/2014
  - Re: [Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU, David Pereira, 21/09/2014
    - Re: [Starpu-devel] StarPU fails when using dmda/pheft with more than 1 GPU, Samuel Thibault, 22/09/2014
