Subject: Developers list for StarPU
- From: Xavier Lacoste <xavier.lacoste@inria.fr>
- To: Sylvain HENRY <sylvain.henry@inria.fr>
- Cc: starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] Assert fail with 8659
- Date: Fri, 15 Feb 2013 10:20:14 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
It seems I didn't rebuild my code correctly when changing StarPU.
The assert() is gone...
Now I have a deadlock in starpu_shutdown to solve...
#0 0x00007fffef369d59 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007ffff7336a8b in starpu_task_wait_for_no_ready () at core/task.c:692
#2 0x00007ffff733ad68 in starpu_shutdown () at core/workers.c:947
#3 0x0000000000464c2f in po_starpu_submit_tasks (sopalin_data=0xce82d8) at sopalin/src/starpu_submit_tasks.c:1354
#4 0x0000000000452bd0 in po_sopalin_thread (m=0xc862e8, sopaparam=0xc86460) at sopalin/src/sopalin3d.c:1366
#5 0x0000000000437b1e in pastix_task_sopalin (pastix_data=0xc862e8, pastix_comm=0xb95cc0, n=12111, colptr=0xc89380, row=0x7ffff7fd0010, avals=<value optimized out>, b=0xc950d0,
rhsnbr=1, loc2glob=0x0) at sopalin/src/pastix.c:3506
#6 0x000000000043a913 in pastix (pastix_data=0x7fffffffc458, pastix_comm=0xb95cc0, n=12111, colptr=0xc89380, row=0x7ffff7fd0010, avals=0x7fffe6e1d010, perm=0xcc45d0,
invp=0xcd0320, b=0xc950d0, rhs=1, iparm=0x7fffffffc210, dparm=0x7fffffffc010) at sopalin/src/pastix.c:4862
#7 0x0000000000415772 in main (argc=14, argv=0x7fffffffc598) at simple.c:210
The strange thing is that I submit all tasks at once and the wait_for_all() returns correctly.
But nready is 274, which is why I'm still in the cond_wait().
XL.
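The hang reported above (a thread parked in pthread_cond_wait because a ready-task counter never returns to zero) can be illustrated with a small stand-alone sketch. This is not StarPU's implementation: `struct ready_counter` and the `ready_counter_*` functions are hypothetical names invented for the illustration, and the timed wait is an assumption added so that the sketch reports a stuck counter instead of blocking forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a runtime's "number of ready tasks" bookkeeping. */
struct ready_counter {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             nready;
};

void ready_counter_init(struct ready_counter *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->nready = 0;
}

/* A task becomes ready: increment the counter. */
void ready_counter_push(struct ready_counter *c)
{
    pthread_mutex_lock(&c->lock);
    c->nready++;
    pthread_mutex_unlock(&c->lock);
}

/* A ready task is consumed: decrement and wake waiters when reaching zero. */
void ready_counter_pop(struct ready_counter *c)
{
    pthread_mutex_lock(&c->lock);
    assert(c->nready > 0);
    if (--c->nready == 0)
        pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Wait until nready == 0. Returns 0 on success, or ETIMEDOUT if the
 * counter is still nonzero after timeout_ms milliseconds. An untimed
 * pthread_cond_wait() here would block forever in that second case. */
int ready_counter_wait_zero(struct ready_counter *c, long timeout_ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    int rc = 0;
    pthread_mutex_lock(&c->lock);
    while (c->nready != 0 && rc == 0)
        rc = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
    int result = (c->nready == 0) ? 0 : ETIMEDOUT;
    pthread_mutex_unlock(&c->lock);
    return result;
}
```

If every push is matched by a pop, the wait returns 0. If one decrement is missed, the counter stays above zero and the wait can only end by timing out, which is the kind of imbalance that nready = 274 suggests.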
On 15 Feb 2013, at 10:01, Sylvain HENRY wrote:
Hi,
Andra has the same bug so the problem is more likely in StarPU.
Cheers
Sylvain
On 15/02/2013 at 09:54, Xavier Lacoste wrote:
Hello,
I still have the assert() problem with my code on mirage001.
I'll check in my code that I really give correct inputs to StarPU; the error may come from one of my changes...
I ran make check: 1 of 113 tests failed (2 tests were not run).
If I understood correctly, this is the one that failed:
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
Warning: change output file name to /tmp/prof_file_lacoste_0, but some events have been saved in file /tmp/prof_file_lacoste_0
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
Warning: change output file name to /tmp/prof_file_lacoste_0, but some events have been saved in file /tmp/prof_file_lacoste_0
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
All CUDA-capable devices are busy or unavailable
oops in init_context (drivers/cuda/driver_cuda.c:240)... 29: unload of CUDA runtime failed
oops in init_context (drivers/cuda/driver_cuda.c:240)... 29: unload of CUDA runtime failed
[starpu][abort] drivers/cuda/driver_cuda.c:582 starpu_cuda_report_error
[starpu][abort] drivers/cuda/driver_cuda.c:582 starpu_cuda_report_error
[error] `./main/restart' killed with signal 6; test marked as failed
while looking for core file of ./main/restart: core: No such file or directory
#Execution_time_in_seconds 5.340971 ./main/restart
FAIL: main/restart
XL.
On 14 Feb 2013, at 20:44, Nathalie Furmento wrote:
I was not able to reproduce it running make check on StarPU on my laptop.
If the problem is still there, Xavier, please send us the output of config.log and the configuration of your machine (namely which machine you are using on plafrim).
Cheers,
Nathalie
On Feb 14, 20:29, Samuel Thibault wrote:
Xavier Lacoste, on Thu 14 Feb 2013 18:07:31 +0100, wrote:
The compilation passed with this configure line:
./configure --prefix=/home/lacoste/starpu-trunk-cuda --with-fxt --enable-maxcpus=160 --enable-max-sched-ctxs=30 CC=gcc CXX=g++ F77=gfortran --disable-opencl --with-cuda-dir=/opt/cluster/gpu/cuda/latest --with-cuda-lib-dir=/opt/cluster/gpu/cuda/latest/lib64/
And has the assertion failure gone away?
Samuel
_______________________________________________ Starpu-devel mailing list Starpu-devel@lists.gforge.inria.fr http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/starpu-devel
- [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Nathalie Furmento, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Nathalie Furmento, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Samuel Thibault, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Nathalie Furmento, 14/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Sylvain HENRY, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Nathalie Furmento, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Andra Hugo, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Andra Hugo, 15/02/2013
- Re: [Starpu-devel] Assert fail with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] hang with 8659, Samuel Thibault, 15/02/2013
- Re: [Starpu-devel] hang with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] hang with 8659, Xavier Lacoste, 15/02/2013
- Re: [Starpu-devel] hang with 8659, Xavier Lacoste, 19/02/2013