starpu-devel - [Starpu-devel] StarPU Asynchronous Partitioning

Purpose: Developers list for StarPU

  • From: Martin Khannouz <martin.khannouz@inria.fr>
  • To: starpu-devel@lists.gforge.inria.fr
  • Subject: [Starpu-devel] StarPU Asynchronous Partitioning
  • Date: Thu, 26 May 2016 13:33:03 +0200
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hello,

I am trying out asynchronous partitioning of data handles and I am running into an error:
free(): invalid next size (fast). It appears on the first call to starpu_data_partition_submit. I am wondering whether I did something wrong that could produce this error, because after reading the documentation for these functions, as far as I can tell I am not doing anything unusual.

Thanks,
Martin Khannouz.

The code that causes the problem:
void mpiPostIRecv(starpu_data_handle_t handle, const int dest, const int level, const MortonIndex startingIndex, const int mode)
{
        size_t size = starpu_data_get_size(handle);
        const size_t limitSize = 10000;
        if( size < limitSize)
        {
            starpu_mpi_irecv_detached(handle, dest,
                    getTag(level,startingIndex, mode),
                    comm.getComm(), 0/*callback*/, 0/*arg*/ );
            return;
        }
        // Integer ceiling of size / limitSize; stays exact where a float conversion would round.
        const int countPart = static_cast<int>((size + limitSize - 1) / limitSize);
        struct starpu_data_filter filter =
        {
            .filter_func = starpu_vector_filter_block,
            .nchildren = countPart
        };
        starpu_data_handle_t splitHandles[countPart];
        starpu_data_partition_plan(handle, &filter, splitHandles);
        starpu_data_partition_submit(handle, countPart, splitHandles); // the error occurs here
        .... Some code ...
}
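
The part-count arithmetic in the function above can be checked on its own, without StarPU. Below is a minimal sketch under stated assumptions: `blockCount`, `Block` and `evenBlocks` are hypothetical names that do not appear in the original code, and the even split shown is only an illustration of block partitioning, not a claim about how starpu_vector_filter_block computes its blocks internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of parts needed so that each part holds at
// most `limit` bytes. Pure integer arithmetic, so it stays exact for sizes
// that a float cannot represent.
std::size_t blockCount(std::size_t size, std::size_t limit)
{
    assert(limit > 0);
    return size / limit + (size % limit != 0 ? 1 : 0);
}

// Hypothetical description of one part: where it starts and how long it is.
struct Block
{
    std::size_t offset;
    std::size_t length;
};

// Hypothetical helper: split `size` elements into `parts` contiguous blocks
// whose lengths differ by at most one (the first `size % parts` blocks get
// the extra element).
std::vector<Block> evenBlocks(std::size_t size, std::size_t parts)
{
    assert(parts > 0);
    std::vector<Block> blocks;
    blocks.reserve(parts);
    const std::size_t base = size / parts;
    const std::size_t extra = size % parts;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < parts; ++i)
    {
        const std::size_t length = base + (i < extra ? 1 : 0);
        blocks.push_back({offset, length});
        offset += length;
    }
    return blocks;
}
```

With limitSize = 10000, any size from 10001 to 20000 bytes gives two parts, which is consistent with the nparts=2 visible in frame #18 of the backtrace. A std::vector<starpu_data_handle_t> sized with the part count could also hold the child handles in place of the variable-length array, which is a compiler extension in C++ rather than standard C++.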


The backtrace:
#0  0x00007ffff4a2e295 in raise () from /lib64/libc.so.6
#1  0x00007ffff4a2f6da in abort () from /lib64/libc.so.6
#2  0x00007ffff4a69d50 in __libc_message () from /lib64/libc.so.6
#3  0x00007ffff4a6f546 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ffff4a6fd1e in _int_free () from /lib64/libc.so.6
#5  0x00007ffff59ef0ce in _starpu_task_destroy (task=0xe76350) at core/task.c:191
#6  0x00007ffff59ed0b7 in _starpu_handle_job_termination (j=0xe724b0)
    at core/jobs.c:463
#7  0x00007ffff5a52c60 in _starpu_fetch_nowhere_task_input_cb (arg=0x10f3380)
    at datawizard/coherency.c:1162
#8  0x00007ffff5a506b4 in _starpu_create_request_to_fetch_data (handle=0xe8bb40,
    dst_replicate=0xe8bc28, mode=STARPU_W, is_prefetch=0, async=1,
    callback_func=0x7ffff5a52b1a <_starpu_fetch_nowhere_task_input_cb>,
    callback_arg=0x10f3380, prio=0) at datawizard/coherency.c:567
#9  0x00007ffff5a5100b in _starpu_fetch_data_on_node (handle=0xe8bb40,
    dst_replicate=0xe8bc28, mode=STARPU_W, detached=0, is_prefetch=0, async=1,
    callback_func=0x7ffff5a52b1a <_starpu_fetch_nowhere_task_input_cb>,
    callback_arg=0x10f3380, prio=0) at datawizard/coherency.c:728
#10 0x00007ffff5a52ad4 in _starpu_fetch_nowhere_task_input (j=0xe724b0)
    at datawizard/coherency.c:1140
#11 0x00007ffff5a27cd6 in _starpu_repush_task (j=0xe724b0) at core/sched_policy.c:452
#12 0x00007ffff5a279aa in _starpu_push_task (j=0xe724b0) at core/sched_policy.c:391
#13 0x00007ffff59eda8a in _starpu_enforce_deps_and_schedule (j=0xe724b0)
    at core/jobs.c:622
#14 0x00007ffff59ef923 in _starpu_submit_job (j=0xe724b0) at core/task.c:372
#15 0x00007ffff59f0a0f in starpu_task_submit (task=0xe76350) at core/task.c:682
#16 0x00007ffff5a82e23 in _starpu_task_insert_v (cl=0xe76720,
    varg_list=0x7fffffffa1b0) at util/starpu_task_insert.c:91
#17 0x00007ffff5a82fa4 in starpu_task_insert (cl=0xe76720)
    at util/starpu_task_insert.c:113
#18 0x00007ffff5a5cf3e in starpu_data_partition_submit (initial_handle=0xe5c780,
    nparts=2, children=0x7fffffffa460) at datawizard/filters.c:585
#19 0x00000000004c5697 in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::mpiPostIRecv (
    this=0x7fffffffa870, handle=0xe5c780, dest=1, level=5, startingIndex=2048,
    mode=0)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1683
#20 0x00000000004bf37c in FGroupTaskStarPUMpiAlgorithm<FGroupTree<double, FTestCellPOD, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::postRecvAllocatedBlocks
    (this=0x7fffffffa870)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:1291
, FBasicCellPOD, long long, long long, FGroupTestParticleContainer<double>, 0u, 1u, long long>, FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FStarPUCpuWrapper<FGroupOfCells<FTestCellPOD, FBasicCellPOD, long long, long long>, FTestCellPOD, FStarPUAllCpuCapacities<FTestKernels<FTestCellPOD, FGroupTestParticleContainer<double> > >, FGroupOfParticles<double, 0u, 1u, long long>, FGroupTestParticleContainer<double> > >::executeCore (this=0x7fffffffa870, operationsToProceed=63)
    at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/FGroupTaskStarpuMpiAlgorithm.hpp:437
#22 0x0000000000498ae3 in FAbstractAlgorithm::execute (this=0x7fffffffa870) at /home/mkhannou/scalfmm/Tests/GroupTree/../../Src/GroupTree/Core/../../Core/FCoreCommon.hpp:91
#23 0x0000000000488209 in main (argc=7, argv=0x7fffffffde28) at /home/mkhannou/scalfmm/Tests/GroupTree/testBlockedMpiAlgorithm.cpp:154



