Subject: Developers list for StarPU
- From: Kadir Akbudak <kadir.akbudak@kaust.edu.sa>
- To: starpu-devel@lists.gforge.inria.fr
- Cc: Hatem Ltaief <hatem.ltaief@kaust.edu.sa>
- Subject: [Starpu-devel] StarPU will cope by trying to purge
- Date: Tue, 20 Jun 2017 11:28:52 +0300
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Dear StarPU Developers,
I am getting the following error/warning and my program stalls forever:
[starpu][starpu_memchunk_tidy] Low memory left on node RAM 0 (6545MiB
over 130940MiB). Your application data set seems too huge to fit on
the device, StarPU will cope by trying to purge 6548 MiB out. This
message will not be printed again for further purges. The thresholds
can be tuned using the STARPU_MINIMUM_AVAILABLE_MEM and
STARPU_TARGET_AVAILABLE_MEM environment variables.
[starpu][_starpu_memory_reclaim_generic] Not enough memory left on
node RAM 0. Your application data set seems too huge to fit on the
device, StarPU will cope by trying to purge 32735 MiB out. This
message will not be printed again for further purges
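The figures in the two messages above can be cross-checked with a little arithmetic. The sketch below assumes that the two thresholds are percentages of the node's memory and that they were at 5% (minimum) and 10% (target) when the message was printed. Those values are inferred from the numbers in the log, not taken from the StarPU documentation, and all variable names are illustrative.

```python
# Numbers copied from the log messages above (all in MiB).
total_mib = 130940
left_mib = 6545
purge_tidy_mib = 6548
purge_reclaim_mib = 32735

# Assumption: thresholds are percentages of the total memory on the node.
minimum_pct = 5
target_pct = 10

minimum_mib = total_mib * minimum_pct / 100  # purging starts below this level
target_mib = total_mib * target_pct / 100    # purging aims to get back to this level

print(f"free memory: {100 * left_mib / total_mib:.2f}% of the node")
print(f"minimum threshold: {minimum_mib:.0f} MiB, target: {target_mib:.0f} MiB")
print(f"target - free: {target_mib - left_mib:.0f} MiB (log: {purge_tidy_mib} MiB)")
print(f"second purge / total: {purge_reclaim_mib / total_mib:.2f}")
```

Under these assumptions, the first purge size matches the gap between the free memory and the target to within 1 MiB, and the second purge is exactly one quarter of the node's memory.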
I tried adjusting the following environment variables, but they did not help:
STARPU_MINIMUM_AVAILABLE_MEM
STARPU_TARGET_AVAILABLE_MEM
STARPU_DISK_SWAP_SIZE=0
STARPU_LIMIT_CPU_MEM=110000
I am using a Cray XC40 system. Each node has 128GB of RAM.
I tried StarPU 1.2.1 and StarPU 1.2.2.
I have 31 threads on a single node. I use 2 nodes. There is 1 MPI
process on each node.
The following experiment (A) does NOT generate any warning:
M=N=139876 and MB=NB=1156. Total memory for matrices~=146GB. Each
codelet requires 3/2 MB^2 + 3MB=0.015GB of STARPU_SCRATCH memory, so 31
threads on a node require 0.015GB*31~=0.5GB of temporary buffers.
The following larger experiment (B) generates the above-mentioned warning:
M=N=166464 and MB=NB=1156. Total memory for matrices~=206GB. Each
codelet requires 3/2 MB^2 + 3MB=0.015GB of STARPU_SCRATCH memory, so 31
threads on a node require 0.015GB*31~=0.5GB of temporary buffers.
The 2 nodes have 2x128GB=256GB of RAM in total. Experiment (B) needs
206.5GB, which is well below that total.
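The estimates for experiments (A) and (B) can be reproduced with the short sketch below. It assumes 8-byte (double precision) elements, a single M x N matrix, sizes expressed in GiB, and matrix data split evenly across the 2 MPI processes. None of these is stated explicitly above, but they are consistent with the quoted figures. The function names are illustrative only.

```python
BYTES_PER_ELEMENT = 8   # assumption: double precision
GIB = 1024 ** 3

def matrix_gib(n):
    """Size of one n x n matrix, in GiB."""
    return n * n * BYTES_PER_ELEMENT / GIB

def scratch_gib(mb):
    """STARPU_SCRATCH size per codelet, 3/2 MB^2 + 3MB elements, in GiB."""
    return (1.5 * mb * mb + 3 * mb) * BYTES_PER_ELEMENT / GIB

def per_node_gib(n, mb, threads, nodes):
    """Matrix share of one node plus the scratch buffers of its threads."""
    return matrix_gib(n) / nodes + threads * scratch_gib(mb)

for label, n in (("A", 139876), ("B", 166464)):
    total = matrix_gib(n)
    node = per_node_gib(n, mb=1156, threads=31, nodes=2)
    print(f"Experiment ({label}): matrices {total:.1f} GiB, per node {node:.1f} GiB")

# Node capacity reported by StarPU in the warning, converted to GiB.
print(f"Node RAM: {130940 / 1024:.1f} GiB")
```

With an even split, experiment (B) needs a little under 104 GiB per node, which is the figure to compare against the roughly 128 GiB that StarPU reports for each node rather than against the 256GB total.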
Could you please suggest how I can find out why my program stalls? Can
I disable the Out-Of-Core (OOC) support of StarPU so that StarPU does
not try to purge memory?
I ran Experiment (B) on a system with 256GB RAM. It worked without any
warning. On this system, I ran it with both 1 MPI process and 2 MPI
processes. The amount of memory required for each process scales as
expected. That is, 206GB is required for 1 MPI process and 103GB is
required for each of 2 MPI processes.
I appreciate your help on this issue.
Thanks a lot.
Best regards,
Kadir Akbudak
Extreme Computing Research Center