
starpu-devel - Re: [Starpu-devel] Performance decline from StarPU 1.2.3 to 1.2.4

List subject: Developers list for StarPU

  • From: Mirko Myllykoski <mirkom@cs.umu.se>
  • To: Mirko Myllykoski <mirkom@cs.umu.se>, Starpu Devel <starpu-devel@lists.gforge.inria.fr>
  • Subject: Re: [Starpu-devel] Performance decline from StarPU 1.2.3 to 1.2.4
  • Date: Thu, 27 Sep 2018 20:21:14 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi,

On 2018-09-27 18:32, Samuel Thibault wrote:

> Also you could try configuring with --disable-cuda-memcpy-peer.

Compiling StarPU 1.2.4 with --disable-cuda-memcpy-peer fixes the problem.

> What does your data look like? I guess it is different allocation sizes?
> How do you allocate them, with starpu_malloc? Do you use temporary
> buffers?

The original data consists of two matrices that are stored contiguously in main memory. The matrices are partitioned into square tiles without copying. A long time ago, I experimented with copying the data so that each tile would be stored contiguously in memory. I also tried pinning the tiles. Neither had any significant effect back then.
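To make the two layouts concrete, here is a minimal sketch in plain C, not the actual application code. It assumes column-major storage with a leading dimension `ld`, which the message does not state; the names `tile_view`, `tile_at`, and `tile_pack` are hypothetical. `tile_at` describes a tile as a view into the original matrix (no copy), while `tile_pack` copies the same tile into a buffer where it is contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A tile seen as a view into a larger matrix: no data is copied.
 * Assumption: column-major storage (not stated in the original message). */
typedef struct {
    double *ptr;  /* address of the tile's top-left element */
    size_t rows;  /* number of rows in the tile */
    size_t cols;  /* number of columns in the tile */
    size_t ld;    /* leading dimension: stride between consecutive columns */
} tile_view;

/* Return a view of tile (ti, tj) of an m-by-n matrix `a` with leading
 * dimension `ld`, split into b-by-b tiles. Edge tiles may be smaller. */
static tile_view tile_at(double *a, size_t m, size_t n, size_t ld,
                         size_t b, size_t ti, size_t tj)
{
    assert(b > 0 && ld >= m && ti * b < m && tj * b < n);
    tile_view t;
    t.ptr = a + (tj * b) * ld + ti * b;
    t.rows = (m - ti * b < b) ? m - ti * b : b;
    t.cols = (n - tj * b < b) ? n - tj * b : b;
    t.ld = ld;
    return t;
}

/* Alternative layout: copy the tile into `dst` so that it is contiguous
 * (leading dimension equal to t->rows). `dst` must hold rows * cols doubles. */
static void tile_pack(const tile_view *t, double *dst)
{
    for (size_t j = 0; j < t->cols; j++)
        memcpy(dst + j * t->rows, t->ptr + j * t->ld,
               t->rows * sizeof(double));
}
```

With the view, each tile column is contiguous but consecutive columns are `ld` elements apart, so a tile spans a much larger address range than its own size. After packing, the tile occupies exactly `rows * cols` consecutive elements.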

Some intermediate results get stored in temporary buffers (home node = -1), and certain tasks use scratch buffers (in particular, those tasks that have a CUDA implementation).

> I'm afraid the issue you are getting is that main memory registration
> for CUDA DMA transfers is quite costly, and actually not worth the
> cost for the asynchronism benefit. Setting --disable-cuda-memcpy-peer
> by hand allows disabling the asynchronism optimization and thus the data
> registration. Ideally registration would be cheap, or cached, to avoid
> the issue. If your temporary data pieces are small (less than 4MB), you
> could try the attached patch which will make StarPU use a suballocator
> not only for the CUDA memory but also for the main memory, which we
> didn't feel would be useful, but perhaps would be.

Compiling StarPU 1.2.4 with this patch (without --disable-cuda-memcpy-peer) fixes the problem.
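For readers unfamiliar with the suballocator idea discussed above, here is a minimal sketch in plain C. It is not the attached patch and not StarPU's implementation, neither of which is shown in this message. All names (`suballoc`, `suballoc_get`, `suballoc_put`) and both size constants are hypothetical, and the `registrations` counter merely stands in for the costly registration step. The point it illustrates is that one large chunk is obtained (and would be registered) once, and small buffers are then carved out of it and recycled without touching the system allocator again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE (4u * 1024 * 1024)  /* hypothetical chunk size */
#define BLOCK_SIZE (64u * 1024)        /* hypothetical block granularity */
#define BLOCKS_PER_CHUNK (CHUNK_SIZE / BLOCK_SIZE)

typedef struct {
    unsigned char *base;                   /* start of the single chunk */
    unsigned char used[BLOCKS_PER_CHUNK];  /* 1 if the block is in use */
    size_t registrations;                  /* times the costly step ran */
} suballoc;

/* Allocate the chunk once. A real implementation would register (pin)
 * the chunk here; the counter is only a stand-in for that step. */
static int suballoc_init(suballoc *s)
{
    s->base = malloc(CHUNK_SIZE);
    if (!s->base)
        return -1;
    for (size_t i = 0; i < BLOCKS_PER_CHUNK; i++)
        s->used[i] = 0;
    s->registrations = 1;
    return 0;
}

/* First-fit search for a run of free blocks large enough for `size`. */
static void *suballoc_get(suballoc *s, size_t size)
{
    if (size == 0 || size > CHUNK_SIZE)
        return NULL;
    size_t need = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
    size_t run = 0;
    for (size_t i = 0; i < BLOCKS_PER_CHUNK; i++) {
        run = s->used[i] ? 0 : run + 1;
        if (run == need) {
            size_t first = i + 1 - need;
            for (size_t k = first; k <= i; k++)
                s->used[k] = 1;
            return s->base + first * BLOCK_SIZE;
        }
    }
    return NULL;  /* chunk exhausted; a caller could fall back to malloc */
}

/* Return a buffer to the chunk; `size` must match the earlier request. */
static void suballoc_put(suballoc *s, void *p, size_t size)
{
    assert((unsigned char *)p >= s->base);
    size_t first = (size_t)((unsigned char *)p - s->base) / BLOCK_SIZE;
    size_t need = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
    for (size_t k = first; k < first + need; k++)
        s->used[k] = 0;
}

static void suballoc_destroy(suballoc *s)
{
    free(s->base);
    s->base = NULL;
}
```

In this sketch, however many temporary buffers are requested and released, `registrations` stays at 1, which is the property that would make repeated small allocations cheap when registration is expensive.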

- Mirko



