Subject: Developers list for StarPU
- From: Samuel Thibault <samuel.thibault@inria.fr>
- To: Yizhou Qian <adncat@stanford.edu>
- Cc: "starpu-devel@lists.gforge.inria.fr" <starpu-devel@lists.gforge.inria.fr>
- Subject: Re: [Starpu-devel] Running Cholesky_implicit on 2 gpu nodes
- Date: Wed, 20 Nov 2019 18:29:19 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
- Organization: I am not organized
Hello,
Yizhou Qian, on Wed 06 Nov 2019 06:58:38 +0000, wrote:
> Thanks! I tried using 4 nodes with 8 and 24 cores on each node respectively
> with a matrix of size 1000*20. However the result shows that using 24 cores
> on each node is slower than using 8 cores on each node. Is this normal?
No, that's quite surprising. The matrix is not very big, so you
can expect low parallelism, but adding more cores should not hurt
that much. Just one thing: when you do not have GPUs, lws is normally
preferred over dmdas, since there is no heterogeneity and no data
placement question to handle. Just to make sure: do you really have 24
cores, and not 12 cores with 2 hyperthreads per core? Also, there may be
some NUMA effects: it would be good to try with an increasing number of
cores, to see whether performance tips over once the core count exceeds
the size of one NUMA node for such a small testcase.
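
The two checks suggested above (physical cores versus hyperthreads, and a run with an increasing number of cores under lws) could be scripted roughly as follows. This is only a sketch under stated assumptions: it relies on the StarPU environment variables STARPU_SCHED and STARPU_NCPU, it assumes a Linux /proc/cpuinfo that exposes "physical id" and "core id" fields (not every architecture does), and the helper names and the benchmark command passed to run_sweep are hypothetical. With an MPI launcher, the variables would also have to reach the remote ranks.

```python
import os
import subprocess
import time


def physical_core_count(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = core_id = None
    # The extra empty string flushes the last block if the text has no trailing blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            # A blank line ends the block describing one logical CPU.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            core_id = value.strip()
    return len(cores)


def sweep_environments(max_cores, scheduler="lws", step=2):
    """Build one environment overlay per core count, always ending at max_cores."""
    counts = list(range(step, max_cores + 1, step))
    if not counts or counts[-1] != max_cores:
        counts.append(max_cores)
    return [{"STARPU_SCHED": scheduler, "STARPU_NCPU": str(n)} for n in counts]


def run_sweep(command, max_cores, scheduler="lws", step=2):
    """Run `command` (hypothetical benchmark) once per core count; return (ncpu, seconds)."""
    timings = []
    for overlay in sweep_environments(max_cores, scheduler, step):
        env = dict(os.environ, **overlay)
        start = time.perf_counter()
        subprocess.run(command, env=env, check=True)
        timings.append((int(overlay["STARPU_NCPU"]), time.perf_counter() - start))
    return timings
```

If physical_core_count(open("/proc/cpuinfo").read()) returns 12 on a machine that reports 24 CPUs, the "24 cores" are hyperthreads. The wall-clock time measured by run_sweep includes start-up, so the time reported by the benchmark itself, where available, is the better number to compare across core counts.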
Samuel
- Re: [Starpu-devel] Running Cholesky_implicit on 2 gpu nodes, Yizhou Qian, 06/11/2019
- Re: [Starpu-devel] Running Cholesky_implicit on 2 gpu nodes, Samuel Thibault, 20/11/2019