starpu-devel - Re: [Starpu-devel] GPU issue with r12137

  • From: Xavier Lacoste <xavier.lacoste@inria.fr>
  • To: Samuel Thibault <samuel.thibault@ens-lyon.org>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] GPU issue with r12137
  • Date: Thu, 3 Apr 2014 13:44:00 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

I tried the new revision (12533) and still got wrong results.
That is strange, because this revision completely includes the fixing patch.

I did a bisection from 12511 + patch to 12533 + patch, and the first bad commit is 12533.

The problem may be in the part of the commit that differs from the patch:

--- a/src/datawizard/coherency.c
+++ b/src/datawizard/coherency.c
@@ -172,7 +172,13 @@ static int worker_supports_direct_access(unsigned node, unsigned handling_node)
                        enum starpu_node_kind kind = starpu_node_get_kind(handling_node);
                        /* GPUs not always allow direct remote access: if CUDA4
                         * is enabled, we allow two CUDA devices to communicate. */
-                       return kind == STARPU_CPU_RAM || kind == STARPU_CUDA_RAM;
+                       return
+#if 0
+                               /* CUDA does not seem very safe with concurrent
+                                * transfer queueing, avoid queueing from CPUs */
+                               kind == STARPU_CPU_RAM ||
+#endif
+                               kind == STARPU_CUDA_RAM;
                }
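
To make the effect of this hunk concrete, here is a minimal sketch of the predicate before and after the change. The enum and function names are hypothetical stand-ins, not the StarPU definitions; only the shape of the return expression is taken from the hunk above.

```c
/* Hypothetical stand-in for enum starpu_node_kind; the real definition
 * lives in the StarPU headers and its values may differ. */
enum node_kind { NODE_CPU_RAM = 1, NODE_CUDA_RAM = 2 };

/* Before the change: both CPU and CUDA handling nodes are accepted. */
int direct_access_before(enum node_kind kind)
{
    return kind == NODE_CPU_RAM || kind == NODE_CUDA_RAM;
}

/* After the change: the CPU clause is removed by the preprocessor,
 * so only a CUDA handling node is accepted. */
int direct_access_after(enum node_kind kind)
{
    return
#if 0
        kind == NODE_CPU_RAM ||
#endif
        kind == NODE_CUDA_RAM;
}
```

With these stand-ins, a CPU handling node is accepted by the first function and rejected by the second, which is the only behavioural difference the hunk introduces.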

I ran the test several times because I did not believe this could be the cause.

Indeed, applying:

diff --git a/src/datawizard/coherency.c b/src/datawizard/coherency.c
index 6e67156..8e82163 100644
--- a/src/datawizard/coherency.c
+++ b/src/datawizard/coherency.c
@@ -172,13 +172,13 @@ static int worker_supports_direct_access(unsigned node, unsigned handling_node)
                        enum starpu_node_kind kind = starpu_node_get_kind(handling_node);
                        /* GPUs not always allow direct remote access: if CUDA4
                         * is enabled, we allow two CUDA devices to communicate. */
-                       return
-#if 0
+                       return kind ==
+#if 1
                                /* CUDA does not seem very safe with concurrent
                                 * transfer queueing, avoid queueing from CPUs */
-                               kind == STARPU_CPU_RAM ||
+                               STARPU_CPU_RAM ||
 #endif
-                               kind == STARPU_CUDA_RAM;
+                               STARPU_CUDA_RAM;
                }
 #else
                        /* Direct GPU-GPU transfers are not allowed in general */

solves the issue.
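
One thing worth checking about the diff above: in C, `==` binds tighter than `||`, so `kind == STARPU_CPU_RAM || STARPU_CUDA_RAM` parses as `(kind == STARPU_CPU_RAM) || STARPU_CUDA_RAM`. If `STARPU_CUDA_RAM` is a non-zero enumerator (an assumption here), the expression is true for every `kind`, which is broader than simply re-enabling the CPU clause. The sketch below uses hypothetical stand-in names and values to show the difference.

```c
/* Hypothetical stand-in for enum starpu_node_kind; the values are
 * assumptions chosen only so that NODE_CUDA_RAM is non-zero. */
enum node_kind { NODE_CPU_RAM = 1, NODE_CUDA_RAM = 2, NODE_OTHER = 3 };

/* Shape of the expression in the diff: this is parsed as
 * (kind == NODE_CPU_RAM) || NODE_CUDA_RAM, and the right operand
 * is a non-zero constant. */
int as_written(enum node_kind kind)
{
    return kind == NODE_CPU_RAM || NODE_CUDA_RAM;
}

/* Explicit form that compares kind against each enumerator. */
int explicit_compare(enum node_kind kind)
{
    return kind == NODE_CPU_RAM || kind == NODE_CUDA_RAM;
}
```

Under these assumptions `as_written` accepts any node kind, while `explicit_compare` accepts only the two listed kinds. Both accept a CPU handling node, so either form would undo the restriction that 12533 introduced.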

XL.

On 3 Apr 2014, at 09:05, Xavier Lacoste <xavier.lacoste@inria.fr> wrote:

Thanks !

XL.

On 2 Apr 2014, at 19:09, Samuel Thibault <samuel.thibault@ens-lyon.org> wrote:

Samuel Thibault, on Wed 02 Apr 2014 18:51:34 +0200, wrote:
Ok, I have reverted r12137/12138, then.

(I mean, the "use non-zero streams on CPU workers" part)

Samuel

_______________________________________________
Starpu-devel mailing list
Starpu-devel@lists.gforge.inria.fr
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/starpu-devel