Subject: Developers list for StarPU
List archives
Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed.
- From: Amani Alonazi <amani.alonazi@kaust.edu.sa>
- To: starpu-devel@lists.gforge.inria.fr, Brice Goglin <Brice.Goglin@inria.fr>
- Cc: Hatem Ltaief <Hatem.Ltaief@kaust.edu.sa>
- Subject: Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed.
- Date: Wed, 27 Mar 2019 16:06:25 +0300
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Reserving the entire node doesn't solve the issue on the current system, and I cannot disable the cgroup resource restrictions. I disabled hwloc, but now the error comes from pthreads:
    starpu_pthread_t self = starpu_pthread_self();
    res = pthread_setaffinity_np(self, sizeof(aff_mask), &aff_mask);
    if (res)
    {
        const char *msg = strerror(res);
        _STARPU_MSG("pthread_setaffinity_np: %s\n", msg);
        STARPU_ABORT(); /* << fails here */
    }
Any idea how to solve it?
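For anyone hitting the same abort: on Linux, an affinity request is rejected with EINVAL when the mask contains no CPU that the cpuset cgroup permits. The sketch below is not StarPU code. It is a minimal illustration, assuming Linux and glibc, of checking a target CPU against the set of allowed CPUs before binding. The helper names `first_allowed_cpu` and `bind_self_checked` are made up for this example, and it calls `sched_setaffinity(0, ...)` (which applies to the calling thread) rather than `pthread_setaffinity_np`, which glibc implements on top of the same system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/* Hypothetical helper: return the lowest-numbered CPU present in
 * `allowed`, or -1 if the set is empty. */
static int first_allowed_cpu(const cpu_set_t *allowed)
{
    assert(allowed != NULL);
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, allowed))
            return cpu;
    return -1;
}

/* Hypothetical helper: bind the calling thread to `cpu` only if
 * `allowed` contains it.
 * Returns 0 on success, -1 if the CPU is outside `allowed`,
 * or a positive errno value if the kernel rejects the request. */
static int bind_self_checked(int cpu, const cpu_set_t *allowed)
{
    assert(allowed != NULL);
    if (cpu < 0 || cpu >= CPU_SETSIZE || !CPU_ISSET(cpu, allowed))
        return -1;

    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* pid 0 means "the calling thread" on Linux. */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return errno;
    return 0;
}
```

The `allowed` set would typically be captured once at startup with `sched_getaffinity(0, sizeof(allowed), &allowed)`, before any thread has narrowed its own mask, so that it still reflects what the cgroup permits.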
On Wed, Mar 27, 2019 at 12:38 AM Amani Alonazi <amani.alonazi@kaust.edu.sa> wrote:
Sure, that's a better and easier solution.
Many thanks!
A.
I don't know. I work on the hwloc side, so I don't know how StarPU should be fixed for this case.
For now, you may avoid the issue by reserving entire nodes.
Brice
On 26/03/2019 at 22:20, Amani Alonazi wrote:
Hi!
Yes, there are CPU-less NUMA nodes because of cgroups. Will configuring StarPU without hwloc solve the issue?
Many thanks!
On Tue, Mar 26, 2019 at 11:53 PM Brice Goglin <Brice.Goglin@inria.fr> wrote:
Hello
What does lstopo say on this machine? (inside the same reservation)
Do you have CPU-less NUMA nodes? (maybe because of cgroups)
That's the only idea that came to my mind when looking at the code.
Brice
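As a rough way to answer the CPU-less NUMA node question without lstopo, the sketch below intersects each node's CPU list from sysfs with the CPUs the current thread may run on. It is an illustration under assumptions, not hwloc or StarPU code: it assumes Linux, the usual `/sys/devices/system/node/nodeN/cpulist` layout, and that the thread's affinity mask reflects the cgroup restriction. The names `parse_cpulist` and `usable_cpus_on_node` are invented for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a Linux cpulist string such as "0-3,8,10-11" into `set`.
 * Returns the number of CPUs in the list, or -1 if it is malformed.
 * An empty list (as reported for a memory-only node) yields 0. */
static int parse_cpulist(const char *s, cpu_set_t *set)
{
    int count = 0;
    assert(s != NULL && set != NULL);
    CPU_ZERO(set);
    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0)
            return -1;
        long hi = lo;
        if (*end == '-') {              /* a range "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s || hi < lo)
                return -1;
        }
        for (long c = lo; c <= hi && c < CPU_SETSIZE; c++) {
            if (!CPU_ISSET(c, set)) {
                CPU_SET(c, set);
                count++;
            }
        }
        if (*end == ',')
            s = end + 1;                /* more entries follow */
        else if (*end == '\0' || *end == '\n')
            s = end;                    /* end of list */
        else
            return -1;                  /* unexpected character */
    }
    return count;
}

/* Count the CPUs of NUMA node `node` that are also in `allowed`.
 * Returns -1 if the node's sysfs entry cannot be read or parsed.
 * A result of 0 means the node looks CPU-less from this process. */
static int usable_cpus_on_node(int node, const cpu_set_t *allowed)
{
    char path[64], line[4096];
    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/cpulist", node);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(line, sizeof(line), f))
        line[0] = '\0';                 /* empty file: no CPUs */
    fclose(f);

    cpu_set_t node_cpus, usable;
    if (parse_cpulist(line, &node_cpus) < 0)
        return -1;
    CPU_AND(&usable, &node_cpus, allowed);
    return CPU_COUNT(&usable);
}
```

Calling `usable_cpus_on_node(n, &allowed)` for n = 0, 1, ... until it returns -1, with `allowed` filled by `sched_getaffinity(0, sizeof(allowed), &allowed)`, lists the nodes that have no usable CPU inside the reservation.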
On 26/03/2019 at 21:46, Amani Alonazi wrote:
Dear Samuel and Starpu-dev,
I am facing an error at the initialization step of StarPU. The error is: core/perfmodel/perfmodel_bus.c:598: find_cpu_from_numa_node: Assertion `current' failed.
raise () from /lib64/libc.so.6
abort () from /lib64/libc.so.6
in __assert_fail_base () from /lib64/libc.so.6
in __assert_fail () from /lib64/libc.so.6
in find_cpu_from_numa_node (obj=0x49a5b7f0) at core/perfmodel/perfmodel_bus.c:598
in measure_bandwidth_between_numa_nodes_and_dev (dev=0, dev_timing_per_numanode=0x200000357338 <cudadev_timing_per_numa>, type=0x20000030cf50 "CUDA") at core/perfmodel/perfmodel_bus.c:633
in measure_bandwidth_between_host_and_dev (dev=0, dev_timing_per_numa=0x200000357338 <cudadev_timing_per_numa>, type=0x20000030cf50 "CUDA") at core/perfmodel/perfmodel_bus.c:654
in benchmark_all_gpu_devices () at core/perfmodel/perfmodel_bus.c:784
in generate_bus_affinity_file () at core/perfmodel/perfmodel_bus.c:1029
in _starpu_bus_force_sampling () at core/perfmodel/perfmodel_bus.c:2931
in check_bus_config_file () at core/perfmodel/perfmodel_bus.c:2036
in _starpu_load_bus_performance_files () at core/perfmodel/perfmodel_bus.c:2959
in starpu_initialize (user_conf=0x490f02b0, argc=0x0, argv=0x0) at core/workers.c:1400
in starpu_init (user_conf=0x490f02b0) at core/workers.c:1213
How can I solve it? Is it connected with hwloc?
Many thanks,
--
Amani
This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email.
_______________________________________________ Starpu-devel mailing list Starpu-devel@lists.gforge.inria.fr https://lists.gforge.inria.fr/mailman/listinfo/starpu-devel
--
Amani
- [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Brice Goglin, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Brice Goglin, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 27/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Samuel Thibault, 29/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 29/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Brice Goglin, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Amani Alonazi, 26/03/2019
- Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed., Brice Goglin, 26/03/2019
Archives managed by MHonArc 2.6.19+.