
starpu-devel - Re: [Starpu-devel] StarPU Asynchronous Partitioning

Subject: Developers list for StarPU

  • From: Martin Khannouz <martin.khannouz@inria.fr>
  • To: Samuel Thibault <samuel.thibault@inria.fr>, starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] StarPU Asynchronous Partitioning
  • Date: Thu, 26 May 2016 16:59:21 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

After changing the starpu_variable_data_register calls to starpu_vector_data_register (and making the corresponding changes in the codelets and in the function call), I get a segfault in starpu_data_partition_plan.

The backtrace:
#0  0x0000000000fa0240 in ?? ()
#1  0x00007ffff5a5b3ff in _starpu_data_partition (initial_handle=0x1047930,
    childrenp=0x7fffffffa548, nparts=2, f=0x7fffffffa550, inherit_state=0)
    at datawizard/filters.c:283
#2  0x00007ffff5a5c959 in starpu_data_partition_plan (initial_handle=0x1047930,
    f=0x7fffffffa550, childrenp=0x7fffffffa548) at datawizard/filters.c:539
#3  0x00000000004c592b in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend (
    this=0x7fffffffa870, handle=0x1047930, dest=0, level=5, startingIndex=2048,
    mode=0)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1652
#4  0x00000000004bf6c8 in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend (
    this=0x7fffffffa870)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1313
#5  0x00000000004bda2b in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore (
    this=0x7fffffffa870, operationsToProceed=63)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:439
#6  0x0000000000498af7 in FAbstractAlgorithm::execute (this=0x7fffffffa870)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/../../Core/FCoreCommon.hpp:91
#7  0x00000000004881f9 in main (argc=7, argv=0x7fffffffde28)
    at /home/mkhannou/scalfmm/Tests/GroupTree/testBlockedMpiAlgorithm.cpp:154


Just in case, here is a Valgrind trace of the error:
unhandled opc_aux = 0x 3
first_opcode == 0xDE
vex amd64->IR: unhandled instruction bytes: 0xDE 0x9E 0x31 0x3 0xED 0xA0 0x3F 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==5411== valgrind: Unrecognised instruction at address 0x1a99c999.
==5411==    at 0x1A99C999: ???
==5411==    by 0x6ED5958: starpu_data_partition_plan (filters.c:539)
==5411==    by 0x4C592A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend(_starpu_data_state*, int, int, long long, int) (FGroupTaskStarpuMpiAlgorithm.hpp:1652)
==5411==    by 0x4BF6C7: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend() (FGroupTaskStarpuMpiAlgorithm.hpp:1313)
==5411==    by 0x4BDA2A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore(unsigned int) (FGroupTaskStarpuMpiAlgorithm.hpp:439)
==5411==    by 0x498AF6: FAbstractAlgorithm::execute() (FCoreCommon.hpp:91)
==5411==    by 0x4881F8: main (testBlockedMpiAlgorithm.cpp:154)
==5411== Your program just tried to execute an instruction that Valgrind
==5411== did not recognise.  There are two possible reasons for this.
==5411== 1. Your program has a bug and erroneously jumped to a non-code
==5411==    location.  If you are running Memcheck and you just saw a
==5411==    warning about a bad jump, it's probably your program's fault.
==5411== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5411==    i.e. it's Valgrind's fault.  If you think this is the case or
==5411==    you are not sure, please let us know and we'll try to fix it.
==5411== Either way, Valgrind will now raise a SIGILL signal which will
==5411== probably kill your program.
==5411==
==5411== Process terminating with default action of signal 4 (SIGILL): dumping core
==5411==  Illegal opcode at address 0x1A99C999
==5411==    at 0x1A99C999: ???
==5411==    by 0x6ED5958: starpu_data_partition_plan (filters.c:539)
==5411==    by 0x4C592A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend(_starpu_data_state*, int, int, long long, int) (FGroupTaskStarpuMpiAlgorithm.hpp:1652)
==5411==    by 0x4BF6C7: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend() (FGroupTaskStarpuMpiAlgorithm.hpp:1313)
==5411==    by 0x4BDA2A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore(unsigned int) (FGroupTaskStarpuMpiAlgorithm.hpp:439)
==5411==    by 0x498AF6: FAbstractAlgorithm::execute() (FCoreCommon.hpp:91)
==5411==    by 0x4881F8: main (testBlockedMpiAlgorithm.cpp:154)
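
Both traces show execution landing on an address with no symbol (frame #0 in the backtrace, 0x1A99C999 under Valgrind), which matches the first reason Valgrind lists: a jump to a non-code location. Such a jump is typical of a call through an invalid function pointer. One possible cause, which is only an assumption and is not confirmed anywhere in this thread, is a descriptor struct declared on the stack whose unused callback fields were never cleared. The sketch below illustrates that failure mode and the usual remedy with a hypothetical `demo_filter` struct; it is not StarPU's `struct starpu_data_filter`, and `demo_partition` and `demo_make_filter` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a filter descriptor (NOT a StarPU type):
 * one mandatory field and one optional callback that the library
 * calls only when the pointer is non-NULL. */
struct demo_filter {
    unsigned nchildren;
    int (*optional_hook)(unsigned child);   /* NULL means "not provided" */
};

/* Mimics a library routine that consults the optional callback.
 * If optional_hook held stack garbage, the indirect call would jump
 * to an arbitrary, possibly non-code, address. */
static int demo_partition(const struct demo_filter *f)
{
    int total = 0;
    for (unsigned i = 0; i < f->nchildren; i++)
        total += f->optional_hook ? f->optional_hook(i) : 1;
    return total;
}

/* Build a descriptor safely: zero-initialize every field first,
 * then set only the fields that are actually used. */
static struct demo_filter demo_make_filter(unsigned nchildren)
{
    struct demo_filter f = { 0 };   /* all members, including pointers, cleared */
    f.nchildren = nchildren;
    return f;
}
```

With `demo_make_filter(2)`, the optional callback is NULL, so `demo_partition` takes the default branch for both children instead of calling through an indeterminate pointer.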


Martin.

On 26/05/2016 15:26, Samuel Thibault wrote:
> Martin Khannouz, on Thu 26 May 2016 15:14:53 +0200, wrote:
> > I just realized that the handles are not registered as vectors; they
> > are registered with the starpu_variable_data_register function. Yet
> > the filter uses the starpu_vector_filter_block function. Could that
> > be related to the problem?
>
> Whoa, yes, absolutely!
>
> > If so, which function should I use? I have not seen a
> > starpu_variable_filter_block function or an equivalent.
>
> Well, uh, that is expected: a variable cannot be split :)
>
> It has to be registered as a vector, so that StarPU knows how to
> split it.
>
> Samuel
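
To illustrate why the registration type matters: a block filter has to derive each child's slice from an element count, which a vector provides and a single variable does not. The following is a minimal sketch of that arithmetic only. `demo_block_split` is a hypothetical helper, not a StarPU function, and the way it spreads the remainder over the first chunks is an assumption; StarPU's own filter may distribute elements differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of StarPU): compute the slice of an
 * nx-element vector assigned to chunk `id` out of `nparts` chunks.
 * Any remainder is spread one element at a time over the first chunks. */
static void demo_block_split(size_t nx, unsigned nparts, unsigned id,
                             size_t *offset, size_t *len)
{
    assert(nparts > 0 && id < nparts);
    size_t base = nx / nparts;   /* elements every chunk receives */
    size_t rem  = nx % nparts;   /* leftover elements to distribute */
    *len    = base + (id < rem ? 1 : 0);
    *offset = (size_t)id * base + (id < rem ? id : rem);
}
```

For a 10-element vector split in 3, this yields slices of 4, 3 and 3 elements starting at offsets 0, 4 and 7. With a single element (`nx == 1`) and two chunks, the second chunk is empty, which is one way to see why a lone variable has nothing to partition.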



