
starpu-devel - Re: [Starpu-devel] execution time takes forever with cuda+large matrices+ooc test case

Subject: Developers list for StarPU

  • From: Amani Alonazi <amani.alonazi@kaust.edu.sa>
  • To: starpu-devel@lists.gforge.inria.fr
  • Cc: Hatem Ltaief <Hatem.Ltaief@kaust.edu.sa>
  • Subject: Re: [Starpu-devel] execution time takes forever with cuda+large matrices+ooc test case
  • Date: Mon, 11 Feb 2019 13:23:19 +0300
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

I tried it on V100 + spinning disk, P100 + SSDs, and V100 + SSDs. Unfortunately, it freezes with large matrices.

It appears to be connected to three things:
1) [starpu][starpu_interface_end_driver_copy_async] Warning: the submission of asynchronous transfer from NUMA 0 to Disk 0 took a very long time (0.452278 ms)
For proper asynchronous transfer overlapping, data registered to StarPU must be allocated with starpu_malloc() or pinned with starpu_memory_pin()

2) Right after, the message about updating the perfmodel history:
[starpu][_starpu_update_perfmodel_history] Too big deviation for model ssrc on cpu0_impl0 (Comb1): 44.715000 vs average 15.980000, 1 such errors against 1 samples (+179.818523%), flushing the performance model. Use the STARPU_HISTORY_MAX_ERROR environement variable to control the threshold (currently 50%)

3) Out-of-core execution with the dmda scheduler.
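
One way to narrow down points 2 and 3 might be to rerun the same test with the perfmodel error threshold raised and a different scheduler selected. The following is only a sketch under assumptions: the value 200 is an arbitrary choice above the reported +179.8% deviation, `eager` is used purely as a non-dmda baseline, the application is assumed not to hard-code its scheduler (so that `STARPU_SCHED` takes effect), and the fallback swap directory is hypothetical.

```shell
#!/bin/sh
# Sketch: change the settings tied to points 2 and 3, keep the rest of the run identical.

# Point 2: the warning names STARPU_HISTORY_MAX_ERROR (a percentage, currently 50).
# 200 is an arbitrary value above the reported +179.8% deviation.
export STARPU_HISTORY_MAX_ERROR=200

# Point 3: pick a scheduler other than dmda as a baseline.
# Assumption: the application does not hard-code its scheduler.
export STARPU_SCHED=eager

# Same out-of-core settings as in the original command.
export STARPU_LIMIT_CPU_MEM=109600
export STARPU_DISK_SWAP="${SWAP_DIR:-/tmp/starpu-swap}"  # fallback path is hypothetical
export STARPU_DISK_SWAP_BACKEND=unistd
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

# Run only if the test binary is present in the current directory.
if [ -x ./tb-modeling ]; then
    ./tb-modeling -i 1000 --freq=10 -m 250 -n 250 -k 250
else
    echo "tb-modeling not found; environment prepared only"
fi
```

Changing one variable per run (first the threshold, then the scheduler) would show which of the two, if either, is tied to the freeze.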

Many thanks for your time.
A.
On Mon, Feb 11, 2019 at 11:56 AM Amani Alonazi <amani.alonazi@kaust.edu.sa> wrote:
Dear starpu-dev,

A deadlock (or an extremely slow execution) occurs from time to time. I'm sharing a test case that takes two matrices A and U of size M * NK, one matrix R of size NK * NK, and a matrix W of size nM * NK, with OOC enabled. You can run the application with this command:

STARPU_LIMIT_CPU_MEM=109600 STARPU_GENERATE_TRACE=1 STARPU_STATS=1 STARPU_WORKER_STATS=1 STARPU_PROFILING=1  STARPU_DISK_SWAP=$SWAP_DIR STARPU_DISK_SWAP_BACKEND=unistd STARPU_BUS_STATS=1 STARPU_SILENT=0 OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 ./tb-modeling -i 1000 --freq=10 -m 250 -n 250 -k 250

I attach the code here. Please help.

Thanks,
Amani AlOnazi
PhD Student
Extreme Computing Research Center

King Abdullah University of Science and Technology
Al-Khawarizmi (Bldg.1)
Floor 0, Office 0203-WS02
Email: amani.alonazi@kaust.edu.sa

www.kaust.edu.sa





