starpu-devel - Re: [Starpu-devel] Assert fail with 8659

Subject: Developers list for StarPU

  • From: Andra Hugo <andra.hugo@inria.fr>
  • To: Xavier Lacoste <xavier.lacoste@inria.fr>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] Assert fail with 8659
  • Date: Fri, 15 Feb 2013 10:48:08 +0100 (CET)
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Are you sure you don't submit tasks too late?


From: "Xavier Lacoste" <xavier.lacoste@inria.fr>
To: "Sylvain HENRY" <sylvain.henry@inria.fr>
Cc: starpu-devel@lists.gforge.inria.fr
Sent: Friday 15 February 2013 10:45:55
Subject: Re: [Starpu-devel] Assert fail with 8659

And this comes from my scheduling policy, which is a modified version of the work stealing policy that uses the initial task placement computed by PaStiX.

Maybe I need to port some changes to it to make it work with the new StarPU revision?
I'll check whether the work stealing policy has been modified recently.

XL.


On 15 Feb 2013 at 10:20, Xavier Lacoste wrote:

It seems I didn't rebuild my code correctly when changing StarPU.
The assert() is gone...
Now I have a deadlock in starpu_shutdown to solve...

#0  0x00007fffef369d59 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffff7336a8b in starpu_task_wait_for_no_ready () at core/task.c:692
#2  0x00007ffff733ad68 in starpu_shutdown () at core/workers.c:947
#3  0x0000000000464c2f in po_starpu_submit_tasks (sopalin_data=0xce82d8) at sopalin/src/starpu_submit_tasks.c:1354
#4  0x0000000000452bd0 in po_sopalin_thread (m=0xc862e8, sopaparam=0xc86460) at sopalin/src/sopalin3d.c:1366
#5  0x0000000000437b1e in pastix_task_sopalin (pastix_data=0xc862e8, pastix_comm=0xb95cc0, n=12111, colptr=0xc89380, row=0x7ffff7fd0010, avals=<value optimized out>, b=0xc950d0, 
    rhsnbr=1, loc2glob=0x0) at sopalin/src/pastix.c:3506
#6  0x000000000043a913 in pastix (pastix_data=0x7fffffffc458, pastix_comm=0xb95cc0, n=12111, colptr=0xc89380, row=0x7ffff7fd0010, avals=0x7fffe6e1d010, perm=0xcc45d0, 
    invp=0xcd0320, b=0xc950d0, rhs=1, iparm=0x7fffffffc210, dparm=0x7fffffffc010) at sopalin/src/pastix.c:4862
#7  0x0000000000415772 in main (argc=14, argv=0x7fffffffc598) at simple.c:210

The strange thing is that I submit all tasks at once and the wait_for_all() returns correctly.
But nready is 274, which is why I'm still in the cond_wait().
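
To illustrate why a nonzero nready keeps the shutdown blocked, here is a minimal sketch in C with pthreads. It is not StarPU's code: struct ready_counter, ready_inc(), ready_dec(), wait_for_no_ready() and drain_demo() are hypothetical names, and the sketch uses a timed wait so that it terminates where an untimed pthread_cond_wait() would block forever. It only shows the general pattern: the waiter is released when the counter reaches zero, so any task that is counted as ready but never counted as taken leaves the waiter stuck.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical ready-task counter; an assumption, not StarPU's structure. */
struct ready_counter {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             nready;
};

/* Called when a task becomes ready (e.g. when a policy receives it). */
static void ready_inc(struct ready_counter *c)
{
    pthread_mutex_lock(&c->lock);
    c->nready++;
    pthread_mutex_unlock(&c->lock);
}

/* Called when a worker takes a ready task; wakes waiters at zero. */
static void ready_dec(struct ready_counter *c)
{
    pthread_mutex_lock(&c->lock);
    assert(c->nready > 0);
    if (--c->nready == 0)
        pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Wait until nready == 0. Returns 0 when drained, or the error code
 * (normally ETIMEDOUT) if the counter is still positive at the deadline. */
static int wait_for_no_ready(struct ready_counter *c, int timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&c->lock);
    while (c->nready > 0 && rc == 0)
        rc = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
    if (c->nready == 0)
        rc = 0;
    pthread_mutex_unlock(&c->lock);
    return rc;
}

/* Count `pushed` tasks as ready, count `popped` of them as taken,
 * then wait for the counter to drain. */
static int drain_demo(int pushed, int popped, int timeout_ms)
{
    struct ready_counter c;
    int i, rc;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.nready = 0;

    for (i = 0; i < pushed; i++)
        ready_inc(&c);
    for (i = 0; i < popped; i++)
        ready_dec(&c);

    rc = wait_for_no_ready(&c, timeout_ms);

    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    return rc;
}
```

With this sketch, drain_demo(3, 3, 50) returns 0 right away, while drain_demo(3, 2, 50) returns ETIMEDOUT because one task is still counted as ready. Whether something similar happens between my policy and the new revision is only a guess at this point.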

XL.


On 15 Feb 2013 at 10:01, Sylvain HENRY wrote:

Hi,

Andra has the same bug so the problem is more likely in StarPU.

Cheers
Sylvain

On 15/02/2013 09:54, Xavier Lacoste wrote:
Hello,

I still have the assert() problem with my code on mirage001.
I'll check that my code really passes correct inputs to StarPU; the error may come from one of my changes...


I ran make check:

1 of 113 tests failed
(2 tests were not run)

If I understood correctly, this is the test that failed:
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
Warning: change output file name to /tmp/prof_file_lacoste_0, but some events have been saved in file /tmp/prof_file_lacoste_0
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
[starpu][starpu_init] Warning: StarPU was configured with --with-fxt, which slows down a bit
Warning: change output file name to /tmp/prof_file_lacoste_0, but some events have been saved in file /tmp/prof_file_lacoste_0
[starpu][initialize_eager_center_policy] Warning: you are running the default eager scheduler, which is not very smart. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda scheduler instead.
All CUDA-capable devices are busy or unavailable
oops in init_context (drivers/cuda/driver_cuda.c:240)... 29: unload of CUDA runtime failed
oops in init_context (drivers/cuda/driver_cuda.c:240)... 29: unload of CUDA runtime failed
[starpu][abort] drivers/cuda/driver_cuda.c:582 starpu_cuda_report_error
[starpu][abort] drivers/cuda/driver_cuda.c:582 starpu_cuda_report_error
[error] `./main/restart' killed with signal 6; test marked as failed
while looking for core file of ./main/restart: core: No such file or directory
#Execution_time_in_seconds 5.340971 ./main/restart
FAIL: main/restart

XL.


On 14 Feb 2013 at 20:44, Nathalie Furmento wrote:

I was not able to reproduce it by running make check on StarPU on my laptop. If the problem is still there, Xavier, please send us the output of config.log and the configuration of your machine (namely, which machine you are using on plafrim).

Cheers,

Nathalie

On Feb 14, 20:29, Samuel Thibault wrote:
Xavier Lacoste, on Thu 14 Feb 2013 18:07:31 +0100, wrote:
The compilation passed with this configure line:

./configure --prefix=/home/lacoste/starpu-trunk-cuda --with-fxt --enable-maxcpus=160 --enable-max-sched-ctxs=30 CC=gcc CXX=g++ F77=gfortran --disable-opencl --with-cuda-dir=/opt/cluster/gpu/cuda/latest --with-cuda-lib-dir=/opt/cluster/gpu/cuda/latest/lib64/
And has the assertion failure gone away?

Samuel

      

_______________________________________________
Starpu-devel mailing list
Starpu-devel@lists.gforge.inria.fr
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/starpu-devel





