List subject: Developers list for StarPU
- From: Martin Khannouz <martin.khannouz@inria.fr>
- To: Samuel Thibault <samuel.thibault@inria.fr>, starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] StarPU Asynchronous Partitioning
- Date: Thu, 26 May 2016 16:59:21 +0200
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hello,

After changing the starpu_variable_data_register calls to starpu_vector_data_register (and making the corresponding changes in the codelets and in the function call), I get a segfault in starpu_data_partition_plan.

The backtrace:

#0 0x0000000000fa0240 in ?? ()
#1 0x00007ffff5a5b3ff in _starpu_data_partition (initial_handle=0x1047930, childrenp=0x7fffffffa548, nparts=2, f=0x7fffffffa550, inherit_state=0) at datawizard/filters.c:283
#2 0x00007ffff5a5c959 in starpu_data_partition_plan (initial_handle=0x1047930, f=0x7fffffffa550, childrenp=0x7fffffffa548) at datawizard/filters.c:539
#3 0x00000000004c592b in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend (this=0x7fffffffa870, handle=0x1047930, dest=0, level=5, startingIndex=2048, mode=0) at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1652
#4 0x00000000004bf6c8 in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend (this=0x7fffffffa870) at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1313
#5 0x00000000004bda2b in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore (this=0x7fffffffa870, operationsToProceed=63) at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:439
#6 0x0000000000498af7 in FAbstractAlgorithm::execute (this=0x7fffffffa870) at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/../../Core/FCoreCommon.hpp:91
#7 0x00000000004881f9 in main (argc=7, argv=0x7fffffffde28) at /home/mkhannou/scalfmm/Tests/GroupTree/testBlockedMpiAlgorithm.cpp:154

Just in case, here is a Valgrind trace of the error:

unhandled opc_aux = 0x 3 first_opcode == 0xDE
vex amd64->IR: unhandled instruction bytes: 0xDE 0x9E 0x31 0x3 0xED 0xA0 0x3F 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==5411== valgrind: Unrecognised instruction at address 0x1a99c999.
==5411==    at 0x1A99C999: ???
==5411==    by 0x6ED5958: starpu_data_partition_plan (filters.c:539)
==5411==    by 0x4C592A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend(_starpu_data_state*, int, int, long long, int) (FGroupTaskStarpuMpiAlgorithm.hpp:1652)
==5411==    by 0x4BF6C7: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend() (FGroupTaskStarpuMpiAlgorithm.hpp:1313)
==5411==    by 0x4BDA2A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore(unsigned int) (FGroupTaskStarpuMpiAlgorithm.hpp:439)
==5411==    by 0x498AF6: FAbstractAlgorithm::execute() (FCoreCommon.hpp:91)
==5411==    by 0x4881F8: main (testBlockedMpiAlgorithm.cpp:154)
==5411== Your program just tried to execute an instruction that Valgrind
==5411== did not recognise. There are two possible reasons for this.
==5411== 1. Your program has a bug and erroneously jumped to a non-code
==5411==    location. If you are running Memcheck and you just saw a
==5411==    warning about a bad jump, it's probably your program's fault.
==5411== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5411==    i.e. it's Valgrind's fault. If you think this is the case or
==5411==    you are not sure, please let us know and we'll try to fix it.
==5411== Either way, Valgrind will now raise a SIGILL signal which will
==5411== probably kill your program.
==5411==
==5411== Process terminating with default action of signal 4 (SIGILL): dumping core
==5411== Illegal opcode at address 0x1A99C999
==5411==    at 0x1A99C999: ???
==5411==    by 0x6ED5958: starpu_data_partition_plan (filters.c:539)
==5411==    by 0x4C592A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostISend(_starpu_data_state*, int, int, long long, int) (FGroupTaskStarpuMpiAlgorithm.hpp:1652)
==5411==    by 0x4BF6C7: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::insertParticlesSend() (FGroupTaskStarpuMpiAlgorithm.hpp:1313)
==5411==    by 0x4BDA2A: FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore(unsigned int) (FGroupTaskStarpuMpiAlgorithm.hpp:439)
==5411==    by 0x498AF6: FAbstractAlgorithm::execute() (FCoreCommon.hpp:91)
==5411==    by 0x4881F8: main (testBlockedMpiAlgorithm.cpp:154)

Martin.

On 26/05/2016 15:26, Samuel Thibault wrote:
> Martin Khannouz, on Thu 26 May 2016 15:14:53 +0200, wrote:
> > I just realized that the handles are not registered as vectors: they are registered with starpu_variable_data_register. The filter, however, uses starpu_vector_filter_block. Could that be related to the problem?
>
> Whoa, yes, absolutely!
>
> > If so, which function should I use? I have not seen a starpu_variable_filter_block function or anything equivalent.
>
> Well, that's expected: a variable cannot be split :)
> You have to register it as a vector, so that StarPU knows how to split it.
>
> Samuel
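
To illustrate the point made in the quoted exchange, here is a minimal sketch of what splitting a vector into contiguous blocks involves. It is not StarPU's implementation and it does not call StarPU: `struct vec_desc` and `block_split` are hypothetical names, and the way the remainder is distributed among children is an assumption. The idea it shows is that a block split needs an element count and an element size, which a vector description carries and a single variable does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector-style data description:
 * a base pointer plus an element count and an element size. */
struct vec_desc {
    char  *ptr;      /* start of the buffer */
    size_t nx;       /* number of elements */
    size_t elemsize; /* size of one element, in bytes */
};

/* Describe child `id` out of `nparts` contiguous blocks of `parent`.
 * Assumption of this sketch: the first (nx % nparts) children get one
 * extra element each. Returns 0 on success, -1 when the split makes no
 * sense (no parts, bad index, or fewer elements than parts, which is
 * the situation of a single variable). */
int block_split(const struct vec_desc *parent, unsigned id,
                unsigned nparts, struct vec_desc *child)
{
    assert(parent != NULL && child != NULL);

    if (nparts == 0 || id >= nparts || parent->nx < nparts)
        return -1;

    size_t base = parent->nx / nparts;          /* minimum block length */
    size_t rem  = parent->nx % nparts;          /* blocks with one extra */
    size_t len  = base + (id < rem ? 1 : 0);    /* length of this block */
    size_t off  = (size_t)id * base + (id < rem ? id : rem);

    child->ptr      = parent->ptr + off * parent->elemsize;
    child->nx       = len;
    child->elemsize = parent->elemsize;
    return 0;
}
```

With a buffer of 5 doubles and 2 parts, this sketch yields a first child of 3 elements starting at offset 0 and a second child of 2 elements starting at element 3. A description with a single element cannot be split in two, and the function returns -1.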