starpu-devel - [Starpu-devel] starpu-print-all-tasks output

starpu-devel@inria.fr

Subject: Developers list for StarPU

  • From: Mirko Myllykoski <mirkom@cs.umu.se>
  • To: Starpu Devel <starpu-devel@lists.gforge.inria.fr>
  • Subject: [Starpu-devel] starpu-print-all-tasks output
  • Date: Fri, 28 Dec 2018 11:44:24 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi,

I have a problem where some MPI ranks get stuck in a call to starpu_task_wait_for_all. The starpu_mpi_task_wait_for_all function caused problems earlier, so I now call starpu_task_wait_for_all followed by starpu_mpi_barrier. The code does not use multiple scheduling contexts.

I am struggling to interpret the output of the starpu-print-all-tasks GDB command (see the end of this email). The MPI ranks that get stuck appear to have several uncompleted tasks. All of these tasks are of the same type (some form of synchronization mechanism?), and each seems to have one uncompleted dependency. The status field suggests that the dependency is a task.

Is this output inconsistent or should I look elsewhere for the bug?

Best Regards,
Mirko Myllykoski

=====================================================================

(gdb) starpu-print-all-tasks
task 0x555556ced5c0
StarPU Task (0x555556ced5c0)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cedbc0>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cedbc0)
task: <0x555556ced5c0>
submitted: <1>
terminated: <0>
job_id: <10624>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cf1170
StarPU Task (0x555556cf1170)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cf1770>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cf1770)
task: <0x555556cf1170>
submitted: <1>
terminated: <0>
job_id: <10633>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cf2960
StarPU Task (0x555556cf2960)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cf2f60>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cf2f60)
task: <0x555556cf2960>
submitted: <1>
terminated: <0>
job_id: <10637>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cf6480
StarPU Task (0x555556cf6480)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cf6a80>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cf6a80)
task: <0x555556cf6480>
submitted: <1>
terminated: <0>
job_id: <10646>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cf7c70
StarPU Task (0x555556cf7c70)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cf8270>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cf8270)
task: <0x555556cf7c70>
submitted: <1>
terminated: <0>
job_id: <10650>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cfa960
StarPU Task (0x555556cfa960)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cfaf60>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cfaf60)
task: <0x555556cfa960>
submitted: <1>
terminated: <0>
job_id: <10657>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cfe480
StarPU Task (0x555556cfe480)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cfea80>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cfea80)
task: <0x555556cfe480>
submitted: <1>
terminated: <0>
job_id: <10666>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cffc70
StarPU Task (0x555556cffc70)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556d00270>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556d00270)
task: <0x555556cffc70>
submitted: <1>
terminated: <0>
job_id: <10670>
name: <_starpu_data_acquire_cb_pre>
task 0x555556d03790
StarPU Task (0x555556d03790)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556d03d90>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556d03d90)
task: <0x555556d03790>
submitted: <1>
terminated: <0>
job_id: <10679>
name: <_starpu_data_acquire_cb_pre>
task 0x555556d04f80
StarPU Task (0x555556d04f80)
name: <_starpu_data_acquire_cb_pre>
codelet: <(nil)>
callback: <0x7ffff3876ba8>
synchronous: <0>
execute_on_a_specific_worker: <0>
workerid: <0>
detach: <1>
destroy: <1>
regenerate: <0>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556d05580>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556d05580)
task: <0x555556d04f80>
submitted: <1>
terminated: <0>
job_id: <10683>
name: <_starpu_data_acquire_cb_pre>

=====================================================================
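
A dump like the one above can be summarized mechanically, which makes it easier to confirm that every listed task has the same name and status and still has unmet dependencies. The following is a minimal Python sketch, not part of StarPU: the names parse_tasks and summarize are hypothetical, the sample is an abbreviated excerpt of the first two tasks from the dump, and the parser assumes the `key: <value>` layout shown there.

```python
import re
from collections import Counter

# Abbreviated excerpt of the dump above (two tasks, some fields omitted).
SAMPLE = """\
task 0x555556ced5c0
StarPU Task (0x555556ced5c0)
name: <_starpu_data_acquire_cb_pre>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cedbc0>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cedbc0)
task: <0x555556ced5c0>
submitted: <1>
terminated: <0>
job_id: <10624>
name: <_starpu_data_acquire_cb_pre>
task 0x555556cf1170
StarPU Task (0x555556cf1170)
name: <_starpu_data_acquire_cb_pre>
status: <STARPU_TASK_BLOCKED_ON_TASK>
job: <0x555556cf1770>
ndeps: <1>
ndeps_completed: <0>
nsuccs: <0>
StarPU Job (0x555556cf1770)
task: <0x555556cf1170>
submitted: <1>
terminated: <0>
job_id: <10633>
name: <_starpu_data_acquire_cb_pre>
"""

TASK_HEADER = re.compile(r"^StarPU Task \((0x[0-9a-fA-F]+)\)$")
JOB_HEADER = re.compile(r"^StarPU Job \((0x[0-9a-fA-F]+)\)$")
FIELD = re.compile(r"^(\w+): <(.*)>$")


def parse_tasks(dump):
    """Return one dict per 'StarPU Task' record found in the dump text."""
    tasks = []
    current = None
    in_job = False
    for raw in dump.splitlines():
        line = raw.strip()
        header = TASK_HEADER.match(line)
        if header:
            current = {"address": header.group(1)}
            tasks.append(current)
            in_job = False
            continue
        if JOB_HEADER.match(line):
            in_job = True
            continue
        field = FIELD.match(line)
        if field and current is not None:
            key, value = field.groups()
            # Prefix job fields so the job's "name" and "task" entries
            # do not overwrite the task-level fields of the same name.
            current[("job." if in_job else "") + key] = value
    return tasks


def summarize(tasks):
    """Count tasks per (name, status) and list tasks with unmet dependencies."""
    groups = Counter((t.get("name"), t.get("status")) for t in tasks)
    pending = [
        t["address"]
        for t in tasks
        if int(t.get("ndeps", 0)) > int(t.get("ndeps_completed", 0))
    ]
    return groups, pending


if __name__ == "__main__":
    groups, pending = summarize(parse_tasks(SAMPLE))
    for (name, status), count in groups.items():
        print(f"{count} task(s): name={name} status={status}")
    print("tasks with unmet dependencies:", ", ".join(pending))
```

To use it on a real session, the GDB output would first have to be saved to a file (for example with GDB's logging facility) and its contents passed to parse_tasks in place of SAMPLE. The summary only restates what the dump contains; it does not identify which task each blocked task is waiting on.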



