
starpu-devel - Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed.

Subject: Developers list for StarPU

  • From: Amani Alonazi <amani.alonazi@kaust.edu.sa>
  • To: starpu-devel@lists.gforge.inria.fr, Brice Goglin <Brice.Goglin@inria.fr>
  • Cc: Hatem Ltaief <Hatem.Ltaief@kaust.edu.sa>
  • Subject: Re: [Starpu-devel] Error find_cpu_from_numa_node: Assertion `current' failed.
  • Date: Wed, 27 Mar 2019 16:06:25 +0300
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Reserving the entire node doesn't solve the issue on the current system, and I cannot disable the cgroups on the resources. I disabled hwloc, but now the error comes from pthreads:

starpu_pthread_t self = starpu_pthread_self();
res = pthread_setaffinity_np(self, sizeof(aff_mask), &aff_mask);
if (res)
{
        const char *msg = strerror(res);
        _STARPU_MSG("pthread_setaffinity_np: %s\n", msg);
        STARPU_ABORT(); /* <-- fails here */
}

Any idea how to solve it?
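
One plausible cause, stated here as an assumption since the printed strerror text is not shown above: on Linux, pthread_setaffinity_np returns EINVAL when the requested mask contains no CPU that the cpuset (cgroup) permits for the thread. The sketch below is not StarPU code. It uses two hypothetical helpers, read_allowed_cpulist and cpulist_contains, to read the Cpus_allowed_list field of /proc/self/status and test whether a given CPU number is in it, so that a binding target can be checked before it is used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if `cpu` appears in a Linux cpulist string such as "0-3,8-11",
 * else 0. Hypothetical helper, not part of StarPU or hwloc. */
int cpulist_contains(const char *list, long cpu)
{
    const char *p = list;
    while (*p) {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p)
            break;                      /* no number here: stop parsing */
        long hi = lo;
        if (*end == '-') {              /* range "lo-hi" */
            const char *q = end + 1;
            hi = strtol(q, &end, 10);
            if (end == q)
                break;
        }
        if (cpu >= lo && cpu <= hi)
            return 1;
        p = end;
        while (*p == ',' || *p == ' ' || *p == '\t' || *p == '\n')
            p++;                        /* skip separators */
    }
    return 0;
}

/* Copy the value of "Cpus_allowed_list:" from /proc/self/status into buf.
 * Returns 0 on success, -1 if the file or the field is not available. */
int read_allowed_cpulist(char *buf, size_t len)
{
    static const char key[] = "Cpus_allowed_list:";
    char line[4096];
    int found = -1;
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, key, sizeof key - 1) == 0) {
            const char *v = line + sizeof key - 1;
            while (*v == ' ' || *v == '\t')
                v++;
            snprintf(buf, len, "%s", v);
            buf[strcspn(buf, "\n")] = '\0';   /* drop trailing newline */
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

If read_allowed_cpulist succeeds and cpulist_contains reports 0 for the CPU that is about to be placed in aff_mask, the binding request targets a CPU outside the reservation, which would be consistent with the failure above.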

On Wed, Mar 27, 2019 at 12:38 AM Amani Alonazi <amani.alonazi@kaust.edu.sa> wrote:
Sure, that's a better, easier solution.
Many thanks! 
A.

On 27 Mar 2019, at 12:30 AM, Brice Goglin <Brice.Goglin@inria.fr> wrote:

I don't know. I work on the hwloc side, so I can't say how StarPU should be fixed for this case.

For now, you may avoid the issue by reserving entire nodes.

Brice


On 26/03/2019 at 22:20, Amani Alonazi wrote:
Hi!

Yes, there are CPU-less NUMA nodes because of cgroups. Will configuring StarPU without hwloc solve the issue?

Many thanks!

On Tue, Mar 26, 2019 at 11:53 PM Brice Goglin <Brice.Goglin@inria.fr> wrote:

Hello

What does lstopo say on this machine? (inside the same reservation)

Do you have CPU-less NUMA nodes? (maybe because of cgroups)

That's the only idea that came to my mind when looking at the code.

Brice
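
Besides lstopo, the CPU-less-node question can be checked by hand. This is a rough sketch under stated assumptions, not hwloc or StarPU code: it assumes the Linux sysfs file /sys/devices/system/node/nodeN/cpulist for the CPUs of NUMA node N and the Cpus_allowed_list field of /proc/self/status for the CPUs the reservation permits. The helper names cpulist_to_mask and usable_cpus_on_node are hypothetical. A node whose result is 0 has no CPU usable from inside the cgroup.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 4096   /* arbitrary upper bound chosen for this sketch */

/* Mark every CPU named in a cpulist string ("0-3,8-11") in `mask` and
 * return how many distinct CPUs were marked. */
int cpulist_to_mask(const char *list, unsigned char mask[MAX_CPUS])
{
    int count = 0;
    const char *p = list;
    memset(mask, 0, MAX_CPUS);
    while (*p) {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p)
            break;                      /* no number here: stop parsing */
        long hi = lo;
        if (*end == '-') {              /* range "lo-hi" */
            const char *q = end + 1;
            hi = strtol(q, &end, 10);
            if (end == q)
                break;
        }
        if (lo < 0)
            lo = 0;                     /* ignore nonsensical negatives */
        for (long c = lo; c <= hi && c < MAX_CPUS; c++) {
            if (!mask[c]) {
                mask[c] = 1;
                count++;
            }
        }
        p = end;
        while (*p == ',' || *p == ' ' || *p == '\t' || *p == '\n')
            p++;                        /* skip separators */
    }
    return count;
}

/* Count the CPUs that belong to the node AND are allowed for this process.
 * Both arguments are cpulist strings read by the caller. */
int usable_cpus_on_node(const char *node_cpulist, const char *allowed_cpulist)
{
    static unsigned char node[MAX_CPUS], allowed[MAX_CPUS];
    int usable = 0;
    cpulist_to_mask(node_cpulist, node);
    cpulist_to_mask(allowed_cpulist, allowed);
    for (int c = 0; c < MAX_CPUS; c++)
        usable += node[c] && allowed[c];
    return usable;
}
```

For example, with a node cpulist of "16-31" and an allowed list of "0-3", usable_cpus_on_node returns 0, which is the CPU-less situation described above.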



On 26/03/2019 at 21:46, Amani Alonazi wrote:
Dear Samuel and Starpu-dev,

I am facing an error at the initialization step of StarPU. The error is: core/perfmodel/perfmodel_bus.c:598: find_cpu_from_numa_node: Assertion `current' failed.

raise () from /lib64/libc.so.6
abort () from /lib64/libc.so.6
in __assert_fail_base () from /lib64/libc.so.6
in __assert_fail () from /lib64/libc.so.6
in find_cpu_from_numa_node (obj=0x49a5b7f0) at core/perfmodel/perfmodel_bus.c:598
in measure_bandwidth_between_numa_nodes_and_dev (dev=0, dev_timing_per_numanode=0x200000357338 <cudadev_timing_per_numa>, type=0x20000030cf50 "CUDA") at core/perfmodel/perfmodel_bus.c:633
in measure_bandwidth_between_host_and_dev (dev=0, dev_timing_per_numa=0x200000357338 <cudadev_timing_per_numa>, type=0x20000030cf50 "CUDA") at core/perfmodel/perfmodel_bus.c:654
in benchmark_all_gpu_devices () at core/perfmodel/perfmodel_bus.c:784
in generate_bus_affinity_file () at core/perfmodel/perfmodel_bus.c:1029
in _starpu_bus_force_sampling () at core/perfmodel/perfmodel_bus.c:2931
in check_bus_config_file () at core/perfmodel/perfmodel_bus.c:2036
in _starpu_load_bus_performance_files () at core/perfmodel/perfmodel_bus.c:2959
in starpu_initialize (user_conf=0x490f02b0, argc=0x0, argv=0x0) at core/workers.c:1400
in starpu_init (user_conf=0x490f02b0) at core/workers.c:1213

How can I solve it? Is it connected with hwloc?

Many thanks,

--
Amani


This message and its contents, including attachments are intended solely for the original recipient. If you are not the intended recipient or have received this message in error, please notify me immediately and delete this message from your computer system. Any unauthorized use or distribution is prohibited. Please consider the environment before printing this email.
_______________________________________________
Starpu-devel mailing list
Starpu-devel@lists.gforge.inria.fr
https://lists.gforge.inria.fr/mailman/listinfo/starpu-devel



--
Amani

