
starpu-devel - [Starpu-devel] increment_redux_v2

Subject: Developers list for StarPU

  • From: Olivier Aumage <olivier.aumage@inria.fr>
  • To: starpu-devel@lists.gforge.inria.fr
  • Subject: [Starpu-devel] increment_redux_v2
  • Date: Fri, 9 Sep 2011 15:50:35 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi,

I am frequently experiencing hangs with the 'datawizard/increment_redux_v2'
test of the StarPU test suite.
The test hangs in about 1 out of 5 runs when StarPU is configured with
'--enable-debug', and in about 1 out of 3 runs without '--enable-debug'.
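Hang rates like the ones above can be measured by running the test in a loop
with a timeout. The following is only a minimal sketch of such a harness, not
something taken from the StarPU test suite: the helper names
run_with_timeout() and count_hangs(), the 50 ms polling interval and the use
of SIGKILL are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not part of StarPU): run argv[0] and wait at most
 * timeout_s seconds for it to finish.
 * Returns 0 if the child ended on its own (whatever its exit status),
 * 1 if it was still running at the deadline and had to be killed,
 * -1 if fork() failed. */
int run_with_timeout(char *const argv[], int timeout_s)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    const struct timespec tick = { 0, 50 * 1000 * 1000 }; /* 50 ms */
    for (long waited_ms = 0; waited_ms < timeout_s * 1000L; waited_ms += 50) {
        if (waitpid(pid, NULL, WNOHANG) == pid)
            return 0;
        nanosleep(&tick, NULL);
    }
    /* Last check, in case the child exited during the final sleep. */
    if (waitpid(pid, NULL, WNOHANG) == pid)
        return 0;

    kill(pid, SIGKILL);    /* treat the run as a hang */
    waitpid(pid, NULL, 0); /* reap the killed child */
    return 1;
}

/* Hypothetical helper: count how many of `runs` executions hang. */
int count_hangs(char *const argv[], int runs, int timeout_s)
{
    assert(runs >= 0);

    int hangs = 0;
    for (int i = 0; i < runs; i++)
        if (run_with_timeout(argv, timeout_s) == 1)
            hangs++;
    return hangs;
}
```

Calling count_hangs() with an argv of { "./increment_redux_v2", NULL }, a run
count and a generous timeout would give a hang count comparable to the
figures quoted above. Sending SIGQUIT instead of SIGKILL would leave a core
dump for each hung run, provided core dumps are enabled.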

The machine used is a regular node from the PlaFRIM machine without GPU (e.g.
a 'fourmi' node).

Module list:
------------
1) tools/svn/1.6.11      6) compiler/gcc/4.6.0
2) lib/gmp/5.0.1         7) tools/autoconf/2.68
3) lib/mpfr/3.0.0        8) tools/automake/1.11.1
4) lib/mpc/0.8.2         9) tools/libtool/2.4
5) lib/libelf/0.8.13

StarPU SVN revision:
--------------------
4083

Configure settings:
-------------------
../trunk/configure --prefix=/home/aumage/SVN/StarPU/install --disable-cuda --enable-debug

Symptoms
--------
- The program hangs
- Analysing the core dump produced by CTRL-\ always shows 2 threads left. The
main thread is waiting in pthread_join. The other thread is waiting on a
spinlock.
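The pattern described above (one thread spinning on a lock that is never
released, while the main thread is blocked in pthread_join) can be reproduced
in isolation. The sketch below is not StarPU code and does not reproduce
_starpu_spin_lock; it assumes a plain POSIX spinlock, and the names
spin_lock_bounded(), worker() and demo_stuck_worker() are hypothetical. The
attempt count is bounded so that the demo reports the stuck lock instead of
hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not part of StarPU): try to take the lock at most
 * max_attempts times. Returns 0 on success, -1 if the lock stayed busy. */
static int spin_lock_bounded(pthread_spinlock_t *lock, long max_attempts)
{
    for (long i = 0; i < max_attempts; i++) {
        int ret = pthread_spin_trylock(lock);
        if (ret == 0)
            return 0;
        assert(ret == EBUSY); /* the only failure expected here */
        sched_yield();
    }
    return -1;
}

struct worker_arg {
    pthread_spinlock_t *lock;
    int stuck; /* set to 1 if the lock could not be taken */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    if (spin_lock_bounded(arg->lock, 1000) == 0) {
        arg->stuck = 0;
        pthread_spin_unlock(arg->lock);
    } else {
        arg->stuck = 1;
    }
    return NULL;
}

/* Returns 1 if the worker found the lock held for its whole lifetime,
 * 0 if it managed to take the lock, -1 on a pthread error. */
int demo_stuck_worker(void)
{
    pthread_spinlock_t lock;
    pthread_t tid;
    struct worker_arg arg = { &lock, 0 };

    if (pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE) != 0)
        return -1;

    /* Simulate a holder that never releases the lock while the worker runs. */
    pthread_spin_lock(&lock);

    if (pthread_create(&tid, NULL, worker, &arg) != 0) {
        pthread_spin_unlock(&lock);
        pthread_spin_destroy(&lock);
        return -1;
    }

    /* With an unbounded spin in the worker, this join would never return. */
    pthread_join(tid, NULL);

    pthread_spin_unlock(&lock);
    pthread_spin_destroy(&lock);
    return arg.stuck;
}
```

Replacing the bounded loop with a plain pthread_spin_lock() call in worker()
would leave the process in a state with one thread spinning in libpthread
and the main thread blocked in pthread_join.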

GDB core dump analysis after CTRL-\:
------------------------------------
Core was generated by `./increment_redux_v2'.
Program terminated with signal 3, Quit.
#0 0x00007f0b61ca37b5 in pthread_join () from /lib64/libpthread.so.0
(gdb) info threads
2 Thread 606 0x00007f0b61ca7672 in ?? () from /lib64/libpthread.so.0
* 1 Thread 605 0x00007f0b61ca37b5 in pthread_join () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f0b61ca37b5 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007f0b62899eaa in _starpu_terminate_workers (config=0x7f0b62ad1480) at ../../trunk/src/core/workers.c:405
#2 0x00007f0b6289a09a in starpu_shutdown () at ../../trunk/src/core/workers.c:481
#3 0x0000000000400bd1 in main ()
(gdb) thread 2
[Switching to thread 2 (Thread 606)]#0 0x00007f0b61ca7672 in ?? () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f0b61ca7672 in ?? () from /lib64/libpthread.so.0
#1 0x00007f0b628966ff in _starpu_spin_lock (lock=0x603b40) at ../../trunk/src/common/starpu_spinlock.c:72
#2 0x00007f0b6289ebce in _starpu_notify_data_dependencies (handle=0x603b30) at ../../trunk/src/core/dependencies/data_concurrency.c:336
#3 0x00007f0b628abc18 in _starpu_release_data_on_node (handle=0x603b30, default_wt_mask=0, replicate=0x603b70) at ../../trunk/src/datawizard/coherency.c:474
#4 0x00007f0b628abfd9 in _starpu_push_task_output (task=0x7f0b4c1f1310, mask=0) at ../../trunk/src/datawizard/coherency.c:616
#5 0x00007f0b628bfd5e in execute_job_on_cpu (j=0x7f0b4c1f1460, cpu_args=0x7f0b62ad1640, is_parallel_task=0, rank=0, perf_arch=STARPU_CPU_DEFAULT) at ../../trunk/src/drivers/cpu/driver_cpu.c:72
#6 0x00007f0b628c02ce in _starpu_cpu_worker (arg=0x7f0b62ad1640) at ../../trunk/src/drivers/cpu/driver_cpu.c:182
#7 0x00007f0b61ca3070 in start_thread () from /lib64/libpthread.so.0
#8 0x00007f0b6260110d in clone () from /lib64/libc.so.6
#9 0x0000000000000000 in ?? ()
(gdb)
%-----

--
Olivier





