Subject: Developers list for StarPU
- From: ludovic.courtes@inria.fr (Ludovic Courtès)
- To: mehdi.amini@silkan.com
- Cc: starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] CUDA task and GCC plugin
- Date: Tue, 21 Aug 2012 10:39:10 +0200
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hi Mehdi,
Sorry for the late reply.
Mehdi AMINI <mehdi.amini@silkan.com> writes:
[...]
>> My guess is that -O3 may trigger inlining of the original implicit
>> implementation at the call site, so that the StarPU wrapper is never
>> called and the code of the CPU function is there directly instead.
>
>
> I confirmed it by studying the assembly generated for a simple example
> (see attached).
>
> The simple printf in the implicit CPU implementation for the StarPU
> task is inlined and there is no call anymore in the main function.
Hmm, that’s not what I’m seeing here; see the comments below:
--8<---------------cut here---------------start------------->8---
.LC2:
.string "CPU"
.text
.type launch.cpu_implementation, @function
launch.cpu_implementation:
.LFB54:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movl $.LC2, %eax ; only use of the "CPU" string
movq %rax, %rdi
movl $0, %eax
call printf
leave
[...]
main:
.LFB56:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $112, %rsp
movl %edi, -100(%rbp)
movq %rsi, -112(%rbp)
movl $0, %edi
call starpu_init
movl %eax, -4(%rbp)
cmpl $0, -4(%rbp)
je .L6
[...]
.L6:
leaq -96(%rbp), %rax
movq %rax, %rdi
call launch ; out-of-line call to ‘launch’,
; not ‘launch.cpu_implementation’
leave
--8<---------------cut here---------------end--------------->8---
I tried to reproduce the problem, for instance by adding the
‘always_inline’ attribute to the task and compiling with -O3, and I
wasn’t able to come up with code where the implicit CPU implementation
gets inlined at the call site (using StarPU trunk, but that shouldn’t be
different with 1.0, and there are tests for that.)
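For reference, a reduced test case along those lines might look like the
sketch below. It is a hypothetical example, not the file attached earlier
in this thread: the ‘cpu_calls’ counter, the ‘run_reduced_case’ driver and
the scalar parameter are made up for illustration. It also assumes that
the plug-in defines the ‘STARPU_GCC_PLUGIN’ macro and understands the
‘initialize’, ‘wait’ and ‘shutdown’ pragmas, so that the same file still
builds with plain GCC.

```c
/* Hypothetical reduced test case; not the attachment from this thread.  */
#include <stdio.h>

#ifdef STARPU_GCC_PLUGIN
/* Assumption: the plug-in defines STARPU_GCC_PLUGIN.  Mark the task
   'always_inline' to try to provoke inlining at the call site.  */
# define TASK_ATTRS __attribute__ ((task, always_inline))
#else
/* Plain GCC: no attributes, 'launch' is an ordinary function.  */
# define TASK_ATTRS
#endif

/* Number of times the CPU implementation ran (illustrative only).  */
static int cpu_calls;

/* Task declaration.  */
static void launch (int n) TASK_ATTRS;

/* Body of the task, i.e., its implicit CPU implementation.  */
static void
launch (int n)
{
  (void) n;
  cpu_calls++;
  printf ("CPU\n");
}

/* Driver: invoke the task once, then report how many times the CPU
   implementation ran.  */
int
run_reduced_case (void)
{
#ifdef STARPU_GCC_PLUGIN
# pragma starpu initialize
#endif

  launch (42);

#ifdef STARPU_GCC_PLUGIN
  /* Task invocations are asynchronous; wait before reading the counter.  */
# pragma starpu wait
# pragma starpu shutdown
#endif

  return cpu_calls;
}
```

Compiling such a file with -O3 -S and looking at the body of
‘run_reduced_case’ should show whether the call still goes to ‘launch’
or whether the ‘printf’ ended up at the call site.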
> By the way it seems that there is another bug, the plugin segfaults
> when there is a task without any parameter.
I wasn’t able to reproduce it. Do you have a test case that I could try?
What version of StarPU do you use?
Thanks,
Ludo’.