starpu-devel - Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work

Subject: Developers list for StarPU

  • From: Samuel Thibault <samuel.thibault@ens-lyon.org>
  • To: Hush Zhou <hush.zhou@gmail.com>, starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work
  • Date: Thu, 17 Apr 2014 10:34:49 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Samuel Thibault, on Tue 15 Apr 2014 15:31:36 +0200, wrote:
> Hush Zhou, on Tue 15 Apr 2014 08:24:19 -0500, wrote:
> > STARPU_NCPU=2 STARPU_NCUDA=2 STARPU_ENABLE_CUDA_GPU_GPU_DIRECT=1
> > ./cholesky_implicit
>
> > It seems that data didn't transfer directly from GPU to GPU. Is it
> > normal? How could I solve this problem?
>
> We never managed to get cudaMemcpy3DPeer working, even in very basic
> cases. Perhaps CUDA got its bugs fixed in the meantime, and we should
> try to make it work.

I got confirmation that the implementation of cudaMemcpy3DPeer in CUDA
is bogus and that there is no cudaMemcpy2DPeer, and there's very little
hope that Nvidia will take the engineering time to fix that. We could try
to use a series of cudaMemcpyPeerAsync calls to transfer the matrix piece
line by line, but that would have a very big overhead.

So in the end, to get GPU-GPU transfers with CUDA, the only way is to
allocate contiguous tiles, rather than dividing a big matrix into
strided tiles.
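
As a rough illustration of what "contiguous tiles" means, the following C
sketch repacks a column-major matrix into tile-major storage, so that each
tile occupies one unbroken block of nb*nb elements and could be moved with
a single one-dimensional copy. The function names, the column-major
assumption, and the requirement that n be a multiple of nb are all
assumptions for this example; an application would more likely allocate
each tile separately from the start.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset (in elements) of tile (ti, tj) in tile-major storage, where
 * tiles are stored one after another, column of tiles by column of tiles. */
size_t tile_offset(size_t ti, size_t tj, size_t ntiles_per_col, size_t nb)
{
    return (tj * ntiles_per_col + ti) * nb * nb;
}

/* Repack a column-major n x n matrix (leading dimension n) into
 * contiguous nb x nb tiles. Assumes n is a multiple of nb. */
void pack_tiles(float *tiles, const float *a, size_t n, size_t nb)
{
    assert(nb > 0 && n % nb == 0);

    size_t nt = n / nb;
    for (size_t tj = 0; tj < nt; tj++) {
        for (size_t ti = 0; ti < nt; ti++) {
            float *tile = tiles + tile_offset(ti, tj, nt, nb);
            for (size_t j = 0; j < nb; j++) {
                /* Column j of this tile inside the big matrix. */
                const float *src = a + (tj * nb + j) * n + ti * nb;
                memcpy(tile + j * nb, src, nb * sizeof(float));
            }
        }
    }
}
```

After packing, the tile at tile_offset(ti, tj, ...) has a leading
dimension equal to nb, so there is no stride left between its columns.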

Samuel




