starpu-devel - Re: [Starpu-devel] Running Cholesky_implicit on 2 gpu nodes

Subject: Developers list for StarPU

From: Samuel Thibault <samuel.thibault@inria.fr>
To: Yizhou Qian <adncat@stanford.edu>
Cc: "starpu-devel@lists.gforge.inria.fr" <starpu-devel@lists.gforge.inria.fr>
Subject: Re: [Starpu-devel] Running Cholesky_implicit on 2 gpu nodes
Date: Wed, 20 Nov 2019 18:29:19 +0100
List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Organization: I am not organized

Hello,

Yizhou Qian, on Wed, 06 Nov 2019 06:58:38 +0000, wrote:
> Thanks! I tried using 4 nodes with 8 and 24 cores on each node respectively
> with a matrix of size 1000*20. However the result shows that using 24 cores
> on each node is slower than using 8 cores on each node. Is this normal?

No, that's quite surprising. The matrix is not very big, so you can
expect low parallelism, but adding more cores should not hurt that
much. One thing: when you do not have GPUs, the lws scheduler is
normally preferred over dmdas, since there is no heterogeneity and no
data placement question to handle. Just to make sure: do you really
have 24 physical cores, and not 12 cores with 2 hyperthreads per core?
There may also be NUMA effects: it would be good to try with an
increasing number of cores, to see whether performance tips over once
the core count exceeds the size of one NUMA node, for such a small
test case.

Samuel
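
The two checks suggested above (physical cores versus hyperthreads, and
a run with an increasing number of cores) can be sketched roughly as
follows. This is only an illustration, not something from the original
message: it assumes a Linux node exposing CPU topology under
/sys/devices/system/cpu, and all function names are hypothetical. The
only StarPU-specific parts are the STARPU_SCHED and STARPU_NCPU
environment variables.

```python
import os
import subprocess
import time


def count_physical_cores(sibling_lists):
    """Count distinct physical cores from thread_siblings_list strings.

    Hyperthreads of the same core report the same sibling list
    (e.g. "0,12"), so the number of distinct strings is the core count.
    """
    return len({s.strip() for s in sibling_lists if s.strip()})


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read the sibling list of every online logical CPU (Linux sysfs only)."""
    lists = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name, "topology", "thread_siblings_list")
        if name.startswith("cpu") and name[3:].isdigit() and os.path.exists(path):
            with open(path) as f:
                lists.append(f.read())
    return lists


def sweep_envs(core_counts, sched="lws"):
    """Build one environment per run, setting STARPU_SCHED and STARPU_NCPU."""
    envs = []
    for n in core_counts:
        env = dict(os.environ)
        env["STARPU_SCHED"] = sched   # scheduler suggested for CPU-only runs
        env["STARPU_NCPU"] = str(n)   # number of CPU workers StarPU may use
        envs.append(env)
    return envs


def run_sweep(command, core_counts, sched="lws"):
    """Time `command` once per core count; returns [(ncpu, seconds), ...]."""
    results = []
    for n, env in zip(core_counts, sweep_envs(core_counts, sched)):
        start = time.perf_counter()
        subprocess.run(command, env=env, check=True)
        results.append((n, time.perf_counter() - start))
    return results
```

Calling count_physical_cores(read_sibling_lists()) on a compute node
shows whether the 24 CPUs are 24 cores or 12 cores with 2 hyperthreads
each. run_sweep() takes the benchmark command line as a list, which is
left to the reader here, plus core counts such as [1, 2, 4, 8, 12, 16,
24]. It measures wall-clock time of the whole process, so the timing
printed by the benchmark itself is likely more precise, and whether the
environment reaches remote MPI ranks depends on the MPI launcher.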
