
starpu-devel - Re: [Starpu-devel] Executing Cholesky with StarPU

Subject: Developers list for StarPU

  • From: giorgos matheou <geomat88@gmail.com>
  • To: Samuel Thibault <samuel.thibault@inria.fr>, giorgos matheou <geomat88@gmail.com>, starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] Executing Cholesky with StarPU
  • Date: Thu, 11 May 2017 15:00:44 +0300
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Thank you very much Samuel.

Well, I need the Cholesky benchmark for comparison purposes, so that is OK, since the other frameworks have the same implementation.

I am trying to run Cholesky on an HPC system with Matrix Size=61440x61440 and Tile Size=256. For the execution I am using 16 nodes, each with 12 cores.

The execution terminates with the following error:
[n081:29365] too many retries sending message to 0x0075:0x00030d11, giving up
[n082:25251] too many retries sending message to 0x0075:0x00030d11, giving up

Also, I am receiving these warnings:
[starpu][check_bus_config_file] No performance model for the bus, calibrating...
[starpu][check_bus_config_file] ... done


For each node:
[starpu][_starpu_mpi_print_thread_level_support] MPI_Init_thread level = MPI_THREAD_SERIALIZED; Multiple threads may make MPI calls, but only one at a time.

Are these warnings a problem?

For the execution I use these commands:
export STARPU_CALIBRATE=1

STARPU_SCHED=dmdas mpirun --npernode 1 ./starpu-1.1.7/mpi/examples/matrix_decomposition/mpi_cholesky_distributed -size $MATRIX_SIZE -nblocks $((MATRIX_SIZE/BLOCK_SIZE))
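The -nblocks value in the command above is derived from the matrix size and the tile size. A minimal sketch of that arithmetic, assuming MATRIX_SIZE and BLOCK_SIZE hold the values quoted earlier in this message (61440 and 256); the divisibility check is an added safeguard for illustration, not something the StarPU example is known to require:

```shell
#!/bin/sh
# Values taken from the message above.
MATRIX_SIZE=61440
BLOCK_SIZE=256

# Illustrative safeguard: integer division would silently truncate
# if the tile size did not divide the matrix size evenly.
if [ $((MATRIX_SIZE % BLOCK_SIZE)) -ne 0 ]; then
    echo "error: MATRIX_SIZE is not a multiple of BLOCK_SIZE" >&2
    exit 1
fi

# Number of tiles along one dimension, as passed to -nblocks.
NBLOCKS=$((MATRIX_SIZE / BLOCK_SIZE))
echo "size=$MATRIX_SIZE tile=$BLOCK_SIZE nblocks=$NBLOCKS"
```

With these values the matrix is split into 240 tiles per dimension.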

Kind Regards,
GM



On Thu, May 11, 2017 at 12:59 PM, Samuel Thibault <samuel.thibault@inria.fr> wrote:
Hello,

giorgos matheou, on Thu, 11 May 2017 01:19:11 +0300, wrote:
> Hello. I have installed the StarPU framework and I would like to execute the
> distributed Cholesky version. However, I am a little bit confused by the command
> line arguments.
>
> Actually, I would like to get strong scalability results, i.e. increase the
> number of nodes while the matrix size is the same.
>
> So let's say that for the single-node run, -size=8192 and
> -nblocks=64, i.e. Matrix Size=8192x8192 and Tile Size=128x128.
>
> How should I configure the size and nblocks flags to have the same matrix size
> for 2 nodes and 4 nodes?

The -size and -nblocks parameters are for the whole matrix. So you just
need to change the mpirun/mpiexec invocation, the parameters of the
program can remain the same.
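A strong-scaling run along these lines could be sketched as follows: -size and -nblocks stay fixed and only the launcher's process count changes. The binary path and the --npernode 1 option are copied from the command earlier in this thread; the -np option and the node counts 1, 2 and 4 are assumptions about the MPI launcher and the allocation. The loop only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Fixed problem size from the question: 8192x8192 matrix, 64 tiles per dimension.
MATRIX_SIZE=8192
NBLOCKS=64
BINARY=./starpu-1.1.7/mpi/examples/matrix_decomposition/mpi_cholesky_distributed

# Only the number of MPI processes (one per node) varies between runs.
for NODES in 1 2 4; do
    # -np is an assumed launcher option; adapt it to the local MPI installation.
    CMD="STARPU_SCHED=dmdas mpirun -np $NODES --npernode 1 $BINARY -size $MATRIX_SIZE -nblocks $NBLOCKS"
    echo "$CMD"
done
```

Every printed command carries the same program arguments, so the tile size stays at 8192/64 = 128 for each node count.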

That being said, the cholesky example provided by StarPU is not
necessarily the state of the art. Chameleon provides an up-to-date
distributed implementation (which is notably able to check the result in
a scalable way).

Samuel



