
starpu-devel - [starpu-devel] [wrong binding, measuring execution/sleep time, incorrect timing results]

Subject: Developers list for StarPU

  • From: Maxim Abalenkov <maxim.abalenkov@gmail.com>
  • To: starpu-devel@inria.fr
  • Subject: [starpu-devel] [wrong binding, measuring execution/sleep time, incorrect timing results]
  • Date: Wed, 4 Sep 2024 14:37:44 +0100

Dear all,

How are you? I hope all is well with you. I need help please.

I wrote a simple program to test StarPU. The program takes a matrix and raises each matrix element to the power of two. The matrix of size (m x n) is processed in blocks of size (p x n). You may find the full source code of my example attached. I’m running my code on a MacBook Air M2 with 16 GB of RAM. I compile my code with

clang -I /opt/local/include/starpu/1.4 -I /opt/local/include mtrx-blk-xpu.c -o mtrx-blk-xpu -L /opt/local/lib -lstarpu-1.4

It is Apple clang version 15.0.0 (clang-1500.1.0.2.5).

Please find below a snippet that I adapted from one of the StarPU examples to display profiling information:

// display profiling information for every worker
int nworkers = (int) starpu_worker_get_count();
for (int worker = 0; worker < nworkers; worker++) {

        struct starpu_profiling_worker_info worker_info;
        status = starpu_profiling_worker_get_info(worker, &worker_info);
        STARPU_ASSERT(!status);

        // times in microseconds
        double t_total     = starpu_timing_timespec_to_us(&worker_info.total_time);
        double t_execution = starpu_timing_timespec_to_us(&worker_info.executing_time);
        double t_sleep     = starpu_timing_timespec_to_us(&worker_info.sleeping_time);
        double t_overhead  = t_total - t_execution - t_sleep;

        // ratios in percent of the total time
        double r_execution = 100.0*t_execution/t_total;
        double r_sleep     = 100.0*t_sleep/t_total;
        double r_overhead  = 100.0 - r_execution - r_sleep;

        char workername[128];
        starpu_worker_get_name(worker, workername, sizeof(workername));
        printf("Worker %s:\n", workername);
        printf("\t%d task(s)\n", worker_info.executed_tasks);
        printf("\ttotal time: %.2lf s\n", t_total*1e-6);
        printf("\texecution time: %.2lf s (%.2lf %%)\n", t_execution*1e-6, r_execution);
        printf("\tsleep time: %.2lf s (%.2lf %%)\n", t_sleep*1e-6, r_sleep);
        printf("\toverhead time: %.2lf s (%.2lf %%)\n", t_overhead*1e-6, r_overhead);
}
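
The arithmetic in the snippet above can be checked without StarPU. The following is a minimal sketch that reproduces the same split of total time into execution, sleep and overhead, using plain `struct timespec` values. The names `time_breakdown`, `timespec_to_us` and `breakdown_times` are hypothetical helpers and not part of the StarPU API, and the guard against a zero total time is an addition that the snippet above does not have. It only verifies the computation; it does not explain why the reported execution and sleep times are zero.

```c
#include <assert.h>
#include <time.h>

/* One worker's time breakdown: absolute values in microseconds,
 * ratios in percent of the total time. Hypothetical helper type. */
struct time_breakdown {
    double total_us, exec_us, sleep_us, overhead_us;
    double exec_pct, sleep_pct, overhead_pct;
};

/* Convert a struct timespec to microseconds. */
double timespec_to_us(const struct timespec *ts)
{
    return 1e6 * (double) ts->tv_sec + 1e-3 * (double) ts->tv_nsec;
}

/* Split the total time into execution, sleep and overhead
 * (overhead = total - execution - sleep), as in the snippet above. */
void breakdown_times(const struct timespec *total,
                     const struct timespec *exec,
                     const struct timespec *slept,
                     struct time_breakdown *out)
{
    assert(total != NULL && exec != NULL && slept != NULL && out != NULL);

    out->total_us    = timespec_to_us(total);
    out->exec_us     = timespec_to_us(exec);
    out->sleep_us    = timespec_to_us(slept);
    out->overhead_us = out->total_us - out->exec_us - out->sleep_us;

    if (out->total_us > 0.0) {
        out->exec_pct     = 100.0 * out->exec_us / out->total_us;
        out->sleep_pct    = 100.0 * out->sleep_us / out->total_us;
        out->overhead_pct = 100.0 - out->exec_pct - out->sleep_pct;
    } else {
        /* Added guard: avoid dividing by zero when no time was recorded. */
        out->exec_pct = out->sleep_pct = out->overhead_pct = 0.0;
    }
}
```

With a total of 10 s, 2.5 s of execution and 5 s of sleep, this gives 2.5 s of overhead and ratios of 25 %, 50 % and 25 %. When execution and sleep times are both zero, as in the results below, the whole total is attributed to overhead.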

My questions are

1) I see a series of warnings when I run my code:

[starpu][lobachevsky][_starpu_init_topology] Warning: there are several kinds of CPU on this system. For now StarPU assumes all CPU are equal
[starpu][lobachevsky][_starpu_init_topology] Warning: could not get current CPU binding: Function not implemented
[starpu][lobachevsky][initialize_lws_policy] Warning: you are running the default lws scheduler, which is not a very smart scheduler, while the system has GPUs or several memory nodes. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda or dmdas scheduler instead.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] [lobachevsky] Warning: worker 0 was already bound to PU 0
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] and we were told to also bind worker 8 to it.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] This will strongly degrade performance.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] [lobachevsky] Maybe check starpu_machine_display's output to determine what wrong binding happened. Hwloc reported a total of 8 cores and 8 threads, and to use 8 threads from logical 0, perhaps there is misdetection between hwloc, the kernel and the BIOS, or an administrative allocation issue from e.g. the job scheduler? You may want to try to use export STARPU_WORKERS_GETBIND=0 to ignore the job scheduler binding

How do I avoid the wrong binding? For now, I would like to exclude the OpenCL 0 worker from any computation.
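
One possible way to keep the OpenCL worker out of the computation is sketched below. It assumes the documented StarPU environment variable `STARPU_NOPENCL`, which limits the number of OpenCL devices StarPU uses, and sets it to 0 from within the program before StarPU is initialised. The function name `disable_opencl_workers` is hypothetical. Whether this also removes the binding warning has not been verified here.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict C modes */
#include <assert.h>
#include <stdlib.h>

/* Request zero OpenCL workers through StarPU's environment.
 * Assumption: this must run before starpu_init() so that StarPU sees the
 * variable when it sets up its workers. Returns 0 on success, -1 on failure. */
int disable_opencl_workers(void)
{
    int rc = setenv("STARPU_NOPENCL", "0", 1);  /* 1 = overwrite any existing value */

    /* Sanity check in debug builds: after success the variable is visible. */
    assert(rc != 0 || getenv("STARPU_NOPENCL") != NULL);
    return rc;
}
```

The call would go at the top of `main`, before the first `starpu_init`. Exporting `STARPU_NOPENCL=0` in the shell before launching the program should have the same effect without any code change.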

Please find below my timing results:

Worker OpenCL 0 (Apple M2 9.6 GiB):
    0 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 0:
    119 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 1:
    139 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 2:
    109 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 3:
    124 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 4:
    132 task(s)
    total time: 6.39 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.39 s (100.00 %)
Worker CPU 5:
    120 task(s)
    total time: 6.48 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.48 s (100.00 %)
Worker CPU 6:
    120 task(s)
    total time: 6.48 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.48 s (100.00 %)
Worker CPU 7:
    75 task(s)
    total time: 6.48 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 6.48 s (100.00 %)

2) My execution time and sleep time on all workers are 0.00 s. How do I measure these times correctly?

Please find below the output of ‘starpu_machine_display’:

[starpu][lobachevsky][_starpu_init_topology] Warning: there are several kinds of CPU on this system. For now StarPU assumes all CPU are equal
[starpu][lobachevsky][_starpu_init_topology] Warning: could not get current CPU binding: Function not implemented
[starpu][lobachevsky][initialize_lws_policy] Warning: you are running the default lws scheduler, which is not a very smart scheduler, while the system has GPUs or several memory nodes. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda or dmdas scheduler instead.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] [lobachevsky] Warning: worker 0 was already bound to PU 0
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] and we were told to also bind worker 8 to it.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] This will strongly degrade performance.
[starpu][lobachevsky][_starpu_bind_thread_on_cpu] [lobachevsky] Maybe check starpu_machine_display's output to determine what wrong binding happened. Hwloc reported a total of 8 cores and 8 threads, and to use 8 threads from logical 0, perhaps there is misdetection between hwloc, the kernel and the BIOS, or an administrative allocation issue from e.g. the job scheduler? You may want to try to use export STARPU_WORKERS_GETBIND=0 to ignore the job scheduler binding
Real hostname: lobachevsky (StarPU hostname: lobachevsky)
Environment variables
STARPU_NCPU=8

StarPU has found :
8 CPU workers:
CPU 0
CPU 1
CPU 2
CPU 3
CPU 4
CPU 5
CPU 6
CPU 7
No CUDA worker
1 OpenCL worker:
OpenCL 0 (Apple M2 9.6 GiB)
No FPGA worker
No MPI_MS worker
No TCPIP_MS worker
No HIP worker

topology ... (hwloc logical indexes)
numa  0 pack  0 core 0 PU 0 OpenCL 0 (Apple M2 9.6 GiB) CPU 7
core 1 PU 1 CPU 0
core 2 PU 2 CPU 1
core 3 PU 3 CPU 2
core 4 PU 4 CPU 3
core 5 PU 5 CPU 4
core 6 PU 6 CPU 5
core 7 PU 7 CPU 6

bandwidth (MB/s) and latency (us)...
from/to     NUMA 0   OpenCL 0
NUMA 0           0      21905
OpenCL 0     22473          0

NUMA 0           0        255
OpenCL 0       160          0

GPU NUMA in preference order (logical index), host-to-device, device-to-host
OpenCL0 0

3) I have two blocks in my ‘main’ program, each of which initialises StarPU, executes tasks and finalises StarPU. Before finalising StarPU, I print the profiling information to the screen. However, the second StarPU block reports very large total times. Please find them below. What do I need to do to get correct time measurements in the second StarPU block?

Worker OpenCL 0 (Apple M2 9.6 GiB):
    0 task(s)
    total time: 418806.03 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.03 s (100.00 %)
Worker CPU 0:
    121 task(s)
    total time: 418806.03 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.03 s (100.00 %)
Worker CPU 1:
    124 task(s)
    total time: 418806.03 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.03 s (100.00 %)
Worker CPU 2:
    127 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)
Worker CPU 3:
    137 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)
Worker CPU 4:
    107 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)
Worker CPU 5:
    142 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)
Worker CPU 6:
    104 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)
Worker CPU 7:
    76 task(s)
    total time: 418806.06 s
    execution time: 0.00 s (0.00 %)
    sleep time: 0.00 s (0.00 %)
    overhead time: 418806.06 s (100.00 %)

Thank you for your help and have a great day ahead!

Best wishes,
Maxim

Attachment: Makefile
Description: Binary data

Attachment: mtrx-blk-xpu.c
Description: Binary data


Maxim Abalenkov \\ maxim.abalenkov@gmail.com
+44 7 486 486 505 \\ www.maxim.abalenkov.uk

