starpu-devel - Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work

Subject: Developers list for StarPU

From: Samuel Thibault <samuel.thibault@ens-lyon.org>
To: Hush Zhou <hush.zhou@gmail.com>, starpu-devel@lists.gforge.inria.fr
Subject: Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work
Date: Thu, 17 Apr 2014 10:34:49 +0200
List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Samuel Thibault, on Tue 15 Apr 2014 15:31:36 +0200, wrote:
> Hush Zhou, on Tue 15 Apr 2014 08:24:19 -0500, wrote:
> > STARPU_NCPU=2 STARPU_NCUDA=2 STARPU_ENABLE_CUDA_GPU_GPU_DIRECT=1
> > ./cholesky_implicit
>
> > It seems that data didn't transfer directly from GPU to GPU. Is it
> > normal? How could I solve this problem?
>
> We never managed to get cudaMemcpy3DPeer working, even in very basic
> cases. Perhaps CUDA got its bugs fixed in the meanwhile, and we should
> try to make it work.

I got confirmation that the implementation of cudaMemcpy3DPeer in CUDA
is bogus and that there is no cudaMemcpy2DPeer, and there is very little
hope that Nvidia will spend the engineering time to fix that. We could
try to use a series of cudaMemcpyPeerAsync calls to transfer the matrix
piece line by line, but that would have a very big overhead.

So in the end, the only way to get GPU-GPU transfers with CUDA is to
allocate contiguous tiles, rather than dividing a big matrix into
strided tiles.
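
As a rough sketch of what "contiguous tiles" means, the C function below
repacks a row-major n x n matrix into separately contiguous nb x nb
tiles. The name pack_tiles and the tile ordering are assumptions made
for illustration, not something StarPU provides. Each resulting tile
could then be registered on its own (for instance with
starpu_matrix_data_register, using a leading dimension equal to the
tile width), so that a whole tile moves in a single plain copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Repack an n x n row-major matrix into (n/nb) x (n/nb) tiles of
 * nb x nb floats, each tile stored contiguously. Tile (ti, tj) starts
 * at offset (ti * (n/nb) + tj) * nb * nb in the returned buffer.
 * The caller frees the result; returns NULL on allocation failure. */
float *pack_tiles(const float *a, size_t n, size_t nb)
{
    assert(nb > 0 && n % nb == 0);

    size_t nt = n / nb;                      /* tiles per dimension */
    float *tiles = malloc(n * n * sizeof(float));
    if (tiles == NULL)
        return NULL;

    for (size_t ti = 0; ti < nt; ti++)
        for (size_t tj = 0; tj < nt; tj++)
            for (size_t r = 0; r < nb; r++)
                /* One line of the tile: strided in a, packed in tiles. */
                memcpy(tiles + ((ti * nt + tj) * nb + r) * nb,
                       a + (ti * nb + r) * n + tj * nb,
                       nb * sizeof(float));
    return tiles;
}
```

The repacking itself costs one pass over the matrix, but it is done once
on the host, after which every tile is a single contiguous block.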

Samuel

[Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work, Hush Zhou, 15/04/2014
- Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work, Samuel Thibault, 15/04/2014
  - Re: [Starpu-devel] STARPU_ENABLE_CUDA_GPU_GPU_DIRECT doesn't work, Samuel Thibault, 17/04/2014
