List subject: Developers list for StarPU
- From: Mathieu Faverge <mathieu.faverge@inria.fr>
- To: Ian Masliah <ian.masliah@inria.fr>, Samuel Thibault <samuel.thibault@inria.fr>, starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] Deadlock with starpu_calibrate
- Date: Sun, 4 Dec 2016 20:29:41 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Is there a way to ask StarPU to calibrate only two GPUs, or just one?

Thanks,
Mathieu

On 04/12/2016 at 20:07, Ian Masliah wrote:

Unfortunately, I didn't find a solution to the problem, and it happened on
multiple machines with K40s and K20s. I tried to recompile with a different
CUDA toolkit, but it didn't change anything for me, so I'm not sure.

I did try to disable GPU direct with
export STARPU_ENABLE_CUDA_GPU_GPU_DIRECT=0
but it didn't work either at that time.

2016-12-04 20:02 GMT+01:00 Samuel Thibault <samuel.thibault@inria.fr>:

Mathieu Faverge, on Sun 04 Dec 2016 19:53:01 +0100, wrote:
> Here it is:
> #0 0x00007ffff7ff9b24 in clock_gettime ()
> #1 0x00007ffff7858ea6 in clock_gettime () from /lib64/librt.so.1
> #2 0x00007ffff3be2f1e in ?? () from /usr/lib64/nvidia/libcuda.so.1
> #3 0x00007ffff3c70325 in ?? () from /usr/lib64/nvidia/libcuda.so.1
> #4 0x00007ffff3ba890e in ?? () from /usr/lib64/nvidia/libcuda.so.1
> #5 0x00007ffff3ba8b01 in ?? () from /usr/lib64/nvidia/libcuda.so.1
> #6 0x00007ffff3ae192a in ?? () from /usr/lib64/nvidia/libcuda.so.1
> #7 0x00007ffff3c0b648 in cuCtxSynchronize () from /usr/lib64/nvidia/libcuda.so.1
> #8 0x00007ffff6dca179 in ?? () from /opt/cuda/8.0/lib64/libcudart.so.8.0
> #9 0x00007ffff6deb5f9 in cudaThreadSynchronize () from /opt/cuda/8.0/lib64/libcudart.so.8.0
...
> And if you do a strace, the number of calls to clock_gettime is just insane.

So it really looks like CUDA getting stuck in there. Not StarPU's fault then.

You could try to set
export STARPU_ENABLE_CUDA_GPU_GPU_DIRECT=0
in case CUDA has troubles with GPU direct.

Mathieu Faverge, on Sun 04 Dec 2016 19:59:55 +0100, wrote:
> Ok I'll check but it might just be that I didn't use the same cuda on the
> compilation and compute node.

That could be a problem indeed, I'm not really surprised by CUDA bugs any more :)

Samuel

--
Mathieu Faverge
Maitre de conférence / Associate Professor
Institut Polytechnique de Bordeaux - ENSEIRB-Matmeca
INRIA Bordeaux - Sud-Ouest, HiePACS Team
200 avenue de la vielle tour
33405 Talence Cedex
Phone: (+33) 5 24 57 40 73
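
For reference, the environment settings discussed in this thread can be collected in a short shell sketch. Only STARPU_ENABLE_CUDA_GPU_GPU_DIRECT comes from the messages above. STARPU_NCUDA (a StarPU variable limiting the number of CUDA devices it uses) and CUDA_VISIBLE_DEVICES (a CUDA runtime variable) are not mentioned in the thread and are included here as assumptions about one possible way to restrict a run to fewer GPUs. Whether they also restrict what the calibration step touches has not been confirmed in this thread.

```shell
# From the thread: disable direct GPU-to-GPU transfers in case CUDA
# has trouble with GPU direct.
export STARPU_ENABLE_CUDA_GPU_GPU_DIRECT=0

# Assumption (not from the thread): ask StarPU to use a single CUDA device.
export STARPU_NCUDA=1

# Assumption (not from the thread): hide every GPU except device 0
# from the CUDA runtime, so the process cannot see the others at all.
export CUDA_VISIBLE_DEVICES=0

# Show the settings that the next StarPU run in this shell will inherit.
echo "GPU direct: $STARPU_ENABLE_CUDA_GPU_GPU_DIRECT"
echo "CUDA devices for StarPU: $STARPU_NCUDA"
echo "Visible GPUs: $CUDA_VISIBLE_DEVICES"
```

To try two GPUs instead of one, the same sketch would use STARPU_NCUDA=2 and CUDA_VISIBLE_DEVICES=0,1.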