Subject: Developers list for StarPU
- From: Mirko Myllykoski <mirkom@cs.umu.se>
- To: samuel.thibault@inria.fr
- Cc: Starpu Devel <starpu-devel@lists.gforge.inria.fr>
- Subject: Re: [Starpu-devel] Performance decline from StarPU 1.2.3 to 1.2.4
- Date: Wed, 03 Oct 2018 19:56:56 +0200
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
On 2018-10-02 22:35, Mirko Myllykoski wrote:
On 2018-10-02 22:05, Mirko Myllykoski wrote:
On 2018-10-01 15:59, Samuel Thibault wrote:
Mirko Myllykoski, on Thu, 27 Sep 2018 20:21:14 +0200, wrote:
On 2018-09-27 18:32, Samuel Thibault wrote:
> I'm afraid the issue you are getting is that main memory registration
> for CUDA DMA transfers is quite costly, and actually not worth the
> cost for the asynchronism benefit. Setting --disable-cuda-memcpy-peer
> by hand disables the asynchronism optimization and thus the data
> registration. Ideally registration would be cheap, or cached, to avoid
> the issue. If your temporary data pieces are small (less than 4MB), you
> could try the attached patch which will make StarPU use a suballocator
> not only for the CUDA memory but also for the main memory, which we
> didn't feel would be useful, but perhaps would be.
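The suballocation idea described above can be sketched in plain C. This is only an illustration under stated assumptions, not the patch or StarPU's allocator: `expensive_alloc` and `expensive_free` are hypothetical stand-ins for a costly pinned allocation and its release (they call `malloc`/`free` and count invocations), and `CHUNK_SIZE`, `struct chunk_pool`, and the `pool_*` functions are names invented for this example. One large chunk pays the expensive cost once, and small pieces are then carved out of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: an arbitrary chunk size chosen for this sketch. */
#define CHUNK_SIZE ((size_t)16 << 20)

/* Counts calls to the simulated expensive operations. */
static unsigned long expensive_calls;

/* Hypothetical stand-in for a costly pinned allocation; plain malloc here. */
static void *expensive_alloc(size_t size)
{
    expensive_calls++;
    return malloc(size);
}

/* Hypothetical stand-in for the matching costly release; plain free here. */
static void expensive_free(void *ptr)
{
    expensive_calls++;
    free(ptr);
}

/* One big chunk from which small pieces are handed out. */
struct chunk_pool {
    char *base;   /* start of the chunk */
    size_t used;  /* bytes handed out so far */
    size_t live;  /* pieces currently outstanding */
};

/* Pays the expensive allocation cost once. Returns 0 on success. */
static int pool_init(struct chunk_pool *p)
{
    p->base = expensive_alloc(CHUNK_SIZE);
    p->used = 0;
    p->live = 0;
    return p->base ? 0 : -1;
}

/* Carves a piece out of the chunk without any expensive call. */
static void *pool_alloc(struct chunk_pool *p, size_t size)
{
    if (size > CHUNK_SIZE)
        return NULL;
    size = (size + 63) & ~(size_t)63;   /* keep pieces 64 bytes apart */
    if (size > CHUNK_SIZE - p->used)
        return NULL;                    /* chunk exhausted */
    void *ptr = p->base + p->used;
    p->used += size;
    p->live++;
    return ptr;
}

/* Releases a piece; the chunk is rewound once nothing is outstanding. */
static void pool_free(struct chunk_pool *p, void *ptr)
{
    assert(p->live > 0);
    assert((char *)ptr >= p->base && (char *)ptr < p->base + p->used);
    if (--p->live == 0)
        p->used = 0;
}

/* Pays the expensive release cost once, at the very end. */
static void pool_destroy(struct chunk_pool *p)
{
    expensive_free(p->base);
    p->base = NULL;
}
```

With such a pool, a loop that allocates and releases a small temporary buffer per task triggers one expensive allocation at `pool_init` and one expensive release at `pool_destroy`, instead of one pair per task.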
Compiling StarPU 1.2.4 with this patch (without --disable-cuda-memcpy-peer)
fixes the problem.
Ok, that's the best option: you keep memcpy-peer which avoids spurious
synchronization, but avoid paying the corresponding cost all the
time. I have committed a more elaborate version of the patch to all
branches.
Thanks!
Samuel
Hi,
I believe that the patch did not fix the problem completely. I am now
experiencing a similar problem with a different code. Again, the
performance declines when I move from StarPU 1.2.3 to 1.2.4. Compiling
StarPU with --disable-cuda-memcpy-peer fixes the problem but this time
the patch seems to do nothing (I tried the one you sent me earlier and
the commit 8c35781b0ac517304ac2f2461b243b4447c38ab3).
The code uses two scheduling contexts:
A) Uses peager or pheft. In this case pheft is used and the context
contains a single GPU (the context can also contain some CPU workers).
B) Uses prio or dmdasd. In this case dmdasd is used and the context
contains five CPU workers.
The code has three task types: process_panel, update_trail and
update_right. The first two are inserted into A and the last one is
inserted into B.
When compiled without --disable-cuda-memcpy-peer, the workers that
belong to B report a lot more overhead. This does not happen when
compiled with --disable-cuda-memcpy-peer. The code also runs slower
when compiled without --disable-cuda-memcpy-peer.
Fxt trace with --disable-cuda-memcpy-peer:
https://drive.google.com/open?id=1DuzEvPVSzB9lc1IjLlN44gjcw-mc7oN5
Fxt trace without --disable-cuda-memcpy-peer:
https://drive.google.com/open?id=1oMTqXb4kYLQifbFSF2w5BU0Ivuz8kzHF
- Mirko
Hi,
The previous email contained an incorrect link. Here are the correct ones:
Fxt trace with --disable-cuda-memcpy-peer:
https://drive.google.com/open?id=1DuzEvPVSzB9lc1IjLlN44gjcw-mc7oN5
Fxt trace without --disable-cuda-memcpy-peer:
https://drive.google.com/open?id=1fwakuRneziUVwJZ8saeW3riVigPPy6eJ
- Mirko
Hi,
After analyzing the traces more carefully, I noticed that "MEMMANAGER0" gets stuck in the "Freeing" state for several hundred milliseconds. GDB backtraces indicate that many CPU worker threads spend a lot of time inside the cudaFreeHost function. My interpretation is that this gets recorded as "Freeing" by FxT.
It appears that each CPU worker manages to execute one update_right task and then something gets freed using the cudaFreeHost function. The only thing that might get freed at this point is a scratch buffer (STARPU_SCRATCH). If that is the case, why would StarPU free this scratch buffer using the cudaFreeHost function? It is not used in any GPU related computations.
- Mirko