starpu-devel - Re: [Starpu-devel] Temporary buffer initialization

Subject: Developers list for StarPU

  • From: Xavier Lacoste <xavier.lacoste@inria.fr>
  • To: Samuel Thibault <samuel.thibault@ens-lyon.org>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] Temporary buffer initialization
  • Date: Thu, 17 Apr 2014 18:47:10 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

I keep hitting an assert() failure when trying to use the new feature:
datawizard/reduction.c:54: _starpu_redux_init_data_replicate: Assertion `init_cl' failed.

I must have done something wrong, but I can't see what...

I replaced my old setup (allocating the data and memset-ing it on the owner node) as follows:
@@ -41,17 +76,20 @@ starpu_zregister_fanin(SolverMatrix * solvmtx,
         }
         for (fanin = ffanin; fanin < lfanin; fanin++) {
             if (clustnum == solvmtx->clustnum) {
-                MALLOC_INTERN(fanin->coeftab,
-                              fanin->stride*cblk_colnbr(fanin),
-                              pastix_complex64_t);
-                memset(fanin->coeftab, 0,
-                       fanin->stride*cblk_colnbr(fanin)*sizeof(pastix_complex64_t));
-                starpu_matrix_data_register(Lhandle, 0,
-                                            (uintptr_t)fanin->coeftab,
+                starpu_matrix_data_register(Lhandle, -1,
+                                            (uintptr_t)NULL,
                                             (uint32_t)fanin->stride,
                                             (uint32_t)fanin->stride,
                                             cblk_colnbr(fanin),
                                             sizeof(pastix_complex64_t));
+                starpu_data_set_reduction_methods(*Lhandle, NULL,
+                                                  &starpu_zfanin_init_codelet);
             } else {
                 starpu_matrix_data_register(Lhandle, -1,
                                             (uintptr_t)NULL,


with this codelet definition:

+void starpu_zfanin_init_cpu_func(void *descr[], void *cl_arg)
+{
+    pastix_complex64_t *L      = (pastix_complex64_t*)STARPU_MATRIX_GET_PTR(descr[0]);
+    pastix_int_t        stride = STARPU_MATRIX_GET_LD(descr[0]);
+    pastix_int_t        ncol   = STARPU_MATRIX_GET_NY(descr[0]); /* ny is cblk_colnbr(fanin) at registration; nx is the stride */
+    memset(L, 0, stride*ncol*sizeof(pastix_complex64_t));
+}
+
+#ifdef STARPU_USE_CUDA
+void starpu_zfanin_init_cuda_func(void *descr[], void *cl_arg)
+{
+    pastix_complex64_t *L      = (pastix_complex64_t*)STARPU_MATRIX_GET_PTR(descr[0]);
+    pastix_int_t        stride = STARPU_MATRIX_GET_LD(descr[0]);
+    pastix_int_t        ncol   = STARPU_MATRIX_GET_NY(descr[0]); /* ny is cblk_colnbr(fanin) at registration; nx is the stride */
+    cudaMemsetAsync(L, 0, stride*ncol*sizeof(pastix_complex64_t), starpu_cuda_get_local_stream());
+}
+#endif
+
+static struct starpu_codelet starpu_zfanin_init_codelet =
+{
+    .where = STARPU_CPU|STARPU_CUDA,
+    .cpu_funcs = {starpu_zfanin_init_cpu_func, NULL},
+    //.cpu_funcs_name = {"starpu_zfanin_init_cpu_func", NULL},
+#ifdef STARPU_USE_CUDA
+    .cuda_funcs = {starpu_zfanin_init_cuda_func, NULL},
+    .cuda_flags = {STARPU_CUDA_ASYNC},
+#endif
+#ifdef STARPU_USE_OPENCL
+    .opencl_funcs = {init_opencl_func, NULL},
+#endif
+    .modes = {STARPU_W},
+    .nbuffers = 1,
+    .name = "init",
+};
+
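
As a side note on the size passed to memset in the init functions: as far as I understand the matrix interface, starpu_matrix_data_register takes its dimensions in the order (ld, nx, ny), and the registration above passes fanin->stride as both ld and nx and cblk_colnbr(fanin) as ny. The stand-alone sketch below only illustrates that arithmetic. struct matrix_view and zero_fill are hypothetical names invented for the illustration, not StarPU types or functions.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a matrix descriptor (NOT a StarPU type).
 * Field names follow the (ld, nx, ny) order assumed in the text above. */
struct matrix_view {
    double complex *ptr;
    size_t ld; /* elements between the starts of two consecutive lines */
    size_t nx; /* contiguous elements per line (the stride in the email) */
    size_t ny; /* number of lines (the column count in the email) */
};

/* Zero the span covered by the view and return the number of elements
 * written. The last line only needs nx elements, not a full ld. */
static size_t zero_fill(const struct matrix_view *m)
{
    assert(m->nx <= m->ld);
    if (m->ny == 0)
        return 0;
    size_t count = m->ld * (m->ny - 1) + m->nx;
    /* All-bits-zero is 0+0i on IEEE 754 platforms, as the email's memset assumes. */
    memset(m->ptr, 0, count * sizeof *m->ptr);
    return count;
}
```

When ld equals nx, as in the registration above, the count reduces to nx * ny, that is, the stride times the number of columns.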

Do you see any obvious reason why this doesn't work?

If you want to get your hands on the code (all the file permissions should be OK, I hope...):
module load scm/git/1.8.1.2 editor/emacs scm/mercurial/2.5-rc scm/svn/1.7.8 build/cmake/2.8.11.2
module load scm/git/1.8.1.2 hardware/hwloc/1.8.1 trace/fxt/0.2.13
module load compiler/intel/13.4.183
module load mpi/openmpi/1.6.5
module load partitioning/scotch/int64/5.1.12b
module load editor/emacs/24.3
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/lustre/lacoste/starpu-install-nocuda/lib/pkgconfig
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lustre/lacoste/starpu-install-nocuda/lib

cp -r mateo70/pastix/ pastix
cd pastix/build

# if you want to clean the whole build directory
# rm -rf *
# cmake ~/mateo70/pastix/ -DSCOTCH_DIR=$SCOTCH_DIR -DPASTIX_WITH_STARPU=ON -DPASTIX_WITH_STARPU_PROFILING=ON -DPASTIX_DUMP_CBLK=OFF -DPASTIX_WITH_MPI=ON

make -j 8 
export PASTIX_STARPU_FANIN=1 
mpirun -np 2  ./example/simple -lap 1000 -iparm IPARM_STARPU API_YES

Regards,

XL

On 17 Apr 2014, at 16:02, Xavier Lacoste <xavier.lacoste@inria.fr> wrote:




On 17 Apr 2014, at 15:53, Samuel Thibault <samuel.thibault@ens-lyon.org> wrote:

Hello,

Xavier Lacoste, on Thu 17 Apr 2014 14:12:45 +0200, wrote:
On 17 Apr 2014, at 11:51, Samuel Thibault <samuel.thibault@ens-lyon.org> wrote:
Xavier Lacoste, on Wed 16 Apr 2014 14:59:10 +0200, wrote:
I want to have a temporary buffer, allocated by StarPU and initialized (to
zero).
Some tasks will add values to it afterwards, and then it will be sent to another
process for a remote task.
These local tasks can be executed in any order.

Well, this looks like a reduction, won't STARPU_REDUX work for you?

Yes, it's a reduction, but one where each task touches only part of the datum. So I didn't want all CPUs to perform an addition on the whole buffer when I know which part will be touched. And I don't want a buffer to be allocated per worker.
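
To make the point above concrete, here is a minimal stand-alone sketch (plain C, no StarPU) of contributions that each touch only a sub-range of one shared, zero-initialized buffer. struct contribution, apply_contribution and accumulate_shared are hypothetical names invented for this illustration. The work done is proportional to the entries actually touched, and the result does not depend on the order in which the contributions are applied (up to floating-point rounding in general; the values used in a quick check can be chosen to be exact).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one task's contribution: it only touches
 * the sub-range [offset, offset + len) of the shared buffer. */
struct contribution {
    size_t offset;
    size_t len;
    double value; /* added to every touched entry */
};

/* Add one contribution in place. The cost is proportional to len, not to
 * the size of the whole buffer. Returns the number of entries touched. */
static size_t apply_contribution(double *buf, size_t n,
                                 const struct contribution *c)
{
    assert(c->offset <= n && c->len <= n - c->offset);
    for (size_t i = 0; i < c->len; i++)
        buf[c->offset + i] += c->value;
    return c->len;
}

/* Zero the single shared buffer once, then apply the contributions in the
 * order given by `order`. Returns the total number of entries touched. */
static size_t accumulate_shared(double *buf, size_t n,
                                const struct contribution *tasks,
                                const size_t *order, size_t ntasks)
{
    size_t touched = 0;
    memset(buf, 0, n * sizeof *buf);
    for (size_t t = 0; t < ntasks; t++)
        touched += apply_contribution(buf, n, &tasks[order[t]]);
    return touched;
}
```

By contrast, a reduction with one private copy per worker would need one full-size buffer per worker plus a pass over the whole buffer for each copy when combining them, which is the cost being avoided here.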

OK.  I have added the support to the 1.1 branch and trunk. It is now a
matter of using starpu_data_set_reduction_methods to provide the
init codelet, and then you'll be allowed to use RW on uninitialized
data, just as you would be in REDUX mode.
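
The behaviour described above can be modelled outside StarPU with a small stand-alone sketch: a handle is registered without any memory, an init function is attached, and the first read-write access allocates the buffer and runs the init function exactly once. Everything here (struct lazy_buffer, lazy_register, lazy_acquire_rw, lazy_unregister) is a hypothetical model written for illustration only. It is not StarPU's API and says nothing about how StarPU implements this internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Plays the role of the init codelet in this model. */
typedef void (*init_fn)(double *buf, size_t n);

/* Hypothetical stand-in for a runtime-managed handle; this is NOT
 * StarPU's starpu_data_handle_t. */
struct lazy_buffer {
    double *data;       /* NULL until the first access */
    size_t  n;
    init_fn init;
    int     init_calls; /* counts how many times init actually ran */
};

static void zero_init(double *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = 0.0;
}

/* Register a buffer without providing memory for it. */
static void lazy_register(struct lazy_buffer *h, size_t n, init_fn init)
{
    h->data = NULL;
    h->n = n;
    h->init = init;
    h->init_calls = 0;
}

/* Read-write access: the first call allocates and initializes the buffer,
 * later calls return the same, already initialized memory. */
static double *lazy_acquire_rw(struct lazy_buffer *h)
{
    if (h->data == NULL) {
        assert(h->init != NULL); /* an init function must have been set */
        h->data = malloc(h->n * sizeof *h->data);
        if (h->data == NULL)
            return NULL;
        h->init(h->data, h->n);
        h->init_calls++;
    }
    return h->data;
}

static void lazy_unregister(struct lazy_buffer *h)
{
    free(h->data);
    h->data = NULL;
}
```

In this model, several accumulating accesses can follow one another in any order, and only the first one pays for allocation and initialization.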
Thanks a lot, I'll try that!

XL.

Samuel




