Subject: Developers list for StarPU
List archives
- From: Yizhou Qian <yizhou96@gmail.com>
- To: starpu-devel@lists.gforge.inria.fr
- Subject: [Starpu-devel] Data Distribution using Starpumpi
- Date: Sat, 21 Dec 2019 00:28:38 +0800
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
[starpu][_starpu_select_src_node][assert failure] The data for the handle 0x2120a50 is requested, but the handle does not have a valid value. Perhaps some initialization task is missing?
Below is my full error message and source code:
Error message:
[adncat@sh-107-47 ~/yizhou/Starpu]$ mpirun -n 2 ./cholesky_mpi 2 2
[starpu][compare_value_and_recalibrate] Current configuration does not match the bus performance model (CPUS: (stored) 1 != (current) 24), recalibrating...
[starpu][compare_value_and_recalibrate] Current configuration does not match the bus performance model (CPUS: (stored) 1 != (current) 24), recalibrating...
[starpu][compare_value_and_recalibrate] ... done
[starpu][compare_value_and_recalibrate] ... done
[starpu][_starpu_mpi_print_thread_level_support] MPI_Init_thread level = MPI_THREAD_SERIALIZED; Multiple threads may make MPI calls, but only one at a time.
Rank 0 of 2 ranks
Incrementing *a, *a=2
[starpu][_starpu_mpi_print_thread_level_support] MPI_Init_thread level = MPI_THREAD_SERIALIZED; Multiple threads may make MPI calls, but only one at a time.
Rank 1 of 2 ranks
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_select_src_node+0x348)[0x7fc9e2880c18]
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_create_request_to_fetch_data+0xd0c)[0x7fc9e2881b6c]
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_fetch_data_on_node+0x10e)[0x7fc9e2881d8e]
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_fetch_task_input+0x113)[0x7fc9e2882c73]
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_cpu_driver_run_once+0xd1)[0x7fc9e28c9af1]
/home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_cpu_worker+0x2d)[0x7fc9e28c9d2d]
/lib64/libpthread.so.0(+0x7dd5)[0x7fc9e066cdd5]
/lib64/libc.so.6(clone+0x6d)[0x7fc9df5e402d]
[starpu][_starpu_select_src_node][assert failure] The data for the handle 0x2120a50 is requested, but the handle does not have a valid value. Perhaps some initialization task is missing?
cholesky_mpi: datawizard/coherency.c:70: _starpu_select_src_node: Assertion `src_node_mask != 0' failed.
[sh-107-47:284948] *** Process received signal ***
[sh-107-47:284948] Signal: Aborted (6)
[sh-107-47:284948] Signal code: (-6)
[sh-107-47:284948] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7fc9e06745d0]
[sh-107-47:284948] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x7fc9df51c2c7]
[sh-107-47:284948] [ 2] /lib64/libc.so.6(abort+0x148)[0x7fc9df51d9b8]
[sh-107-47:284948] [ 3] /lib64/libc.so.6(+0x2f0e6)[0x7fc9df5150e6]
[sh-107-47:284948] [ 4] /lib64/libc.so.6(+0x2f192)[0x7fc9df515192]
[sh-107-47:284948] [ 5] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_select_src_node+0x398)[0x7fc9e2880c68]
[sh-107-47:284948] [ 6] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_create_request_to_fetch_data+0xd0c)[0x7fc9e2881b6c]
[sh-107-47:284948] [ 7] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_fetch_data_on_node+0x10e)[0x7fc9e2881d8e]
[sh-107-47:284948] [ 8] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_fetch_task_input+0x113)[0x7fc9e2882c73]
[sh-107-47:284948] [ 9] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_cpu_driver_run_once+0xd1)[0x7fc9e28c9af1]
[sh-107-47:284948] [10] /home/users/adncat/starpu/lib/libstarpu-1.3.so.1(_starpu_cpu_worker+0x2d)[0x7fc9e28c9d2d]
[sh-107-47:284948] [11] /lib64/libpthread.so.0(+0x7dd5)[0x7fc9e066cdd5]
[sh-107-47:284948] [12] /lib64/libc.so.6(clone+0x6d)[0x7fc9df5e402d]
[sh-107-47:284948] *** End of error message ***
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node sh-107-47 exited on signal 6 (Aborted).
Source code:
Best,
Yizhou
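For readers who hit the same assertion: the failing check, `src_node_mask != 0`, reads as "at least one memory node holds a valid copy of the data". The following is only a toy model of that idea, not StarPU's actual code. Every name in it (`toy_state`, `toy_src_node_mask`, `toy_can_fetch`) is hypothetical and was made up for illustration. It assumes that each memory node records whether its copy of a handle is valid, and that a fetch needs at least one valid source.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-node copy state; not a StarPU type. */
enum toy_state { TOY_INVALID, TOY_VALID };

/* Build a bitmask in which bit i is set when memory node i holds a
 * valid copy of the data. */
static unsigned toy_src_node_mask(const enum toy_state *per_node, size_t nnodes)
{
    /* One bit per node, so the node count must fit in an unsigned. */
    assert(nnodes <= sizeof(unsigned) * CHAR_BIT);

    unsigned mask = 0;
    for (size_t i = 0; i < nnodes; i++)
        if (per_node[i] != TOY_INVALID)
            mask |= 1u << i;
    return mask;
}

/* A read can be served only if some node has a valid copy; an empty
 * mask corresponds to the situation reported in the log above. */
static bool toy_can_fetch(const enum toy_state *per_node, size_t nnodes)
{
    return toy_src_node_mask(per_node, nnodes) != 0;
}
```

In this model, a handle that no task or registration call has ever filled has every node marked `TOY_INVALID`, so `toy_can_fetch` returns false. That matches the hint in the error text that an initialization step may be missing before the first read.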
- [Starpu-devel] Data Distribution using Starpumpi, Yizhou Qian, 20/12/2019
- Re: [Starpu-devel] Data Distribution using Starpumpi, Samuel Thibault, 20/12/2019