
starpu-devel - Re: [Starpu-devel] StarPU code hangs when using a lot of GPU memory

Subject: Developers list for StarPU

  • From: Samuel Thibault <samuel.thibault@inria.fr>
  • To: Mirko Myllykoski <mirkom@cs.umu.se>
  • Cc: Starpu Devel <starpu-devel@lists.gforge.inria.fr>
  • Subject: Re: [Starpu-devel] StarPU code hangs when using a lot of GPU memory
  • Date: Fri, 1 Mar 2019 09:18:36 -0800
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
  • Organization: I am not organized

Mirko Myllykoski, on Fri. 01 Mar 2019 18:11:57 +0100, wrote:
> The output of the macro was not very informative:
[...]
> Node 1:
> Total used: 5815, 4493MiB
> WT: 0, 0MiB
> home: 0, 0MiB
> redux: 0, 0MiB
> relax: 0, 0MiB
> noref: 0, 0MiB
> nosubdataref: 0, 0MiB
> nodataref: 0, 0MiB
>
> cached: 0, 0MiB

It actually is informative: nosubdataref and nodataref both being 0 means
that all of the 5815 used entries are considered busy, and that is the
reason why none of them can be evicted.

Could it be that you have a *lot* of tasks which are ready? It might
just be that each of them has almost all of its data on the GPU, but none
of them has everything it needs, and thus none of them can be executed.
It is a fairly extreme situation that we did envision, but we have not put
any handling for it in StarPU yet. That is, however, precisely the kind of
work I'll be looking at in the coming months :)
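
To make the situation described above concrete, here is a toy model of it.
It is only a sketch under stated assumptions and not StarPU's actual
allocator or scheduler: the function name simulate, the notion of equal-size
"blocks", and the two fetch orders are all hypothetical and chosen purely
for illustration. Memory held for a pending task is treated as busy and is
never evicted, which mirrors the "all busy" state in the output above.

```python
def simulate(num_tasks, blocks_per_task, capacity, round_robin):
    """Toy model (hypothetical, not StarPU code).

    Each ready task needs `blocks_per_task` distinct blocks on a device
    that holds `capacity` blocks. Blocks held for a pending task count as
    busy and are never evicted. Returns (tasks_completed, blocks_still_used).
    """
    loaded = [0] * num_tasks   # blocks currently resident for each task
    done = [False] * num_tasks
    used = 0
    completed = 0
    progress = True
    while progress:
        progress = False
        for t in range(num_tasks):
            if done[t]:
                continue
            # Round-robin: fetch one block per task per pass.
            # Otherwise: fetch everything this task still needs.
            want = 1 if round_robin else blocks_per_task - loaded[t]
            while want > 0 and loaded[t] < blocks_per_task and used < capacity:
                loaded[t] += 1
                used += 1
                want -= 1
                progress = True
            if loaded[t] == blocks_per_task:
                # The task has all of its data: it runs and releases it.
                used -= loaded[t]
                loaded[t] = 0
                done[t] = True
                completed += 1
                progress = True
    return completed, used


# 4 ready tasks, 3 blocks each, room for 6 blocks.
print(simulate(4, 3, 6, round_robin=True))   # (0, 6): memory full, nothing runs
print(simulate(4, 3, 6, round_robin=False))  # (4, 0): every task runs
```

In the round-robin case every task ends up holding part of its data, the
device is full, and no task ever has everything it needs. Fetching one
task's data at a time lets each task run and release its memory in turn.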

Samuel



