Re: [Starpu-devel] Strange error " <starpu_task_submit> returned unexpected value: <-19>"
- From: Keisuke Fukuda <fukuda@matsulab.is.titech.ac.jp>
- To: starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] Strange error " <starpu_task_submit> returned unexpected value: <-19>"
- Date: Wed, 4 Jul 2012 20:33:25 +0900
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hi,
I've been trying to reproduce the bug for a few days.
I couldn't reproduce exactly the same failure, but I found something related to it.
Here's the code:
https://gist.github.com/3046791
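For readers following the subject line: StarPU documents that starpu_task_submit returns -ENODEV when no worker is able to execute the task, and on Linux ENODEV is 19, which matches the reported -19. The helper below is a minimal sketch for turning such errno-style return values into text; the name describe_error is made up for this illustration and is not part of StarPU.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not a StarPU function): map a negative
 * errno-style return value, such as -19, to a readable message.
 * Zero and positive values are reported as "success". */
static const char *describe_error(int ret)
{
    return ret < 0 ? strerror(-ret) : "success";
}
```

Calling describe_error(-ENODEV) returns the platform's message for ENODEV, so a check such as `if (ret != 0) fprintf(stderr, "%s\n", describe_error(ret));` after the submit call makes the failure easier to read.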
First, to calibrate the performance model, I run the following small
shell script.
---
for N in {10..50}; do
    echo N=${N}00
    STARPU_CALIBRATE=1 STARPU_MAX_WORKERSIZE=4 STARPU_MIN_WORKERSIZE=4 \
        ./a.out ${N}00
done
---
(My machine has two C2050 GPUs and a 6-core Sandy Bridge CPU.)
Then the starpu_perfmodel_display command gives:
$ starpu_perfmodel_display -s my_test_program
performance model for cpu_impl_0
    Regression : #sample = 10
    Linear: y = alpha size ^ beta
        alpha = 8.534080e+02
        beta = -1.952183e-02
performance model for cpu_2_impl_0
    Regression : #sample = 12
    Linear: y = alpha size ^ beta
        alpha = 2.018139e+02
        beta = 9.085814e-02
performance model for cpu_3_impl_0
    Regression : #sample = 25
    Linear: y = alpha size ^ beta
        alpha = 2.211562e+02
        beta = 7.101251e-02
performance model for cpu_4_impl_0
    Regression : #sample = 62
    Linear: y = alpha size ^ beta
        alpha = 1.070743e+02
        beta = 1.479519e-01
performance model for cuda_0_impl_0
    Regression : #sample = 10
    Linear: y = alpha size ^ beta
        alpha = 1.314121e+02
        beta = -7.194192e-02
performance model for cuda_1_impl_0
    Regression : #sample = 10
    Linear: y = alpha size ^ beta
        alpha = 4.363119e+01
        beta = 4.093523e-02
(As a side question, I wonder why cuda_0 and cuda_1 show such different
performance, although that may be environment-dependent.)
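As a side calculation, the gap between the two CUDA models can be made concrete by dividing one fitted curve by the other, using only the alpha and beta values printed above and the form y = alpha size ^ beta:

```latex
\[
\frac{y_{\text{cuda\_0}}}{y_{\text{cuda\_1}}}
  = \frac{1.314121\times10^{2}\;\text{size}^{-0.07194192}}
         {4.363119\times10^{1}\;\text{size}^{\,0.04093523}}
  \approx 3.01\;\text{size}^{-0.113}
\]
```

Under these fits, cuda_0 is predicted to be about three times slower than cuda_1 at size = 1, and the ratio shrinks as size grows, reaching 1 at a size on the order of 10^4. The negative beta for cuda_0 (and for cpu_impl_0) means the fitted time decreases as size increases; this may simply reflect too few samples for a stable fit, but the output alone does not confirm that.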
The problem is that when I change line 86 to
    86: cl.where = STARPU_CUDA;
and run the program, it hangs.
It works fine if I remove lines 87 and 88.
It also works if I set STARPU_NCUDA=0 instead of editing the code.
You may say that it is inappropriate to change this line once the
performance model has been built, but I think it is necessary for
debugging in my case.
During development, I sometimes want to run a certain task only on CPU or only on CUDA.
In my program, the "cl.where = " line is selected by a command-line
option, like this:
    /* task_is_cpu_only is a flag set from a command-line option */
    if (task_is_cpu_only) {
        cl.where = STARPU_CPU;
    } else {
        cl.where = STARPU_CUDA | STARPU_CPU;
    }
I would expect starpu_codelet::type and
starpu_codelet::max_parallelism to be silently ignored when STARPU_CPU is
NOT specified in cl.where.
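Until the library behaves that way, one possible workaround is to set the parallel-task fields only when CPU workers are allowed. The sketch below illustrates that idea with stand-in types so that it compiles without StarPU: the sketch_* names and bit values are made up for this example. In a real program the struct would be struct starpu_codelet and the constants would be STARPU_CPU, STARPU_CUDA, STARPU_SEQ and STARPU_FORKJOIN from <starpu.h>. It also assumes that lines 87 and 88 of the gist are the ones setting type and max_parallelism, which the message suggests but does not state, and whether this avoids the hang has not been verified.

```c
#include <stdint.h>

/* Stand-ins for illustration only; real code uses the definitions
 * from <starpu.h> instead of these made-up names and values. */
#define SKETCH_CPU  (1u << 1)
#define SKETCH_CUDA (1u << 3)

enum sketch_type { SKETCH_SEQ, SKETCH_FORKJOIN };

struct sketch_codelet {
    uint32_t where;           /* mask of worker kinds allowed to run the task */
    enum sketch_type type;    /* sequential or parallel (fork-join) task */
    int max_parallelism;      /* upper bound on combined-worker size */
};

/* Set where the task may run, and request a parallel CPU task only
 * when the mask actually includes CPU workers. Otherwise fall back
 * to a plain sequential codelet. */
static void configure_codelet(struct sketch_codelet *cl,
                              uint32_t where, int max_parallelism)
{
    cl->where = where;
    if (where & SKETCH_CPU) {
        cl->type = SKETCH_FORKJOIN;
        cl->max_parallelism = max_parallelism;
    } else {
        cl->type = SKETCH_SEQ;
        cl->max_parallelism = 1;
    }
}
```

With this guard, a CUDA-only run selected on the command line never carries the parallel-CPU settings, which mirrors the manual step of removing lines 87 and 88.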
Thank you.
Keisuke
On Fri, Jun 29, 2012 at 6:18 PM, Keisuke Fukuda
<fukuda@matsulab.is.titech.ac.jp> wrote:
> Oops, sorry.
>
> Actually the error occurred within my large program,
> which usually works well.
>
> I will recheck the conditions that cause the error and report back.
>
> Keisuke
>
> On Fri, Jun 29, 2012 at 6:16 PM, Cyril Roelandt <cyril.roelandt@inria.fr>
> wrote:
>> On 06/29/2012 10:44 AM, Keisuke Fukuda wrote:
>>> Samuel,
>>>
>>> Here is a very small program that reproduces the error.
>>>
>>> https://gist.github.com/9c228d0b0c2aed03f71f
>>>
>>
>> You must call starpu_init() before calling any other starpu_* function.
>> This should fix your issue.
>>
>>
>> Cyril.
>>
>
>
>
> --
> ------------------------------------------
> FUKUDA, Keisuke<fukuda@matsulab.is.titech.ac.jp>
> Dept. of Math. & Comp. Sciences
> Satoshi Matsuoka Lab.,
> Tokyo Institute of Technology
>
> 福田圭祐 <fukuda@matsulab.is.titech.ac.jp>
> 東京工業大学 数理・計算科学専攻 松岡研究室
--
------------------------------------------
FUKUDA, Keisuke<fukuda@matsulab.is.titech.ac.jp>
Dept. of Math. & Comp. Sciences
Satoshi Matsuoka Lab.,
Tokyo Institute of Technology
福田圭祐 <fukuda@matsulab.is.titech.ac.jp>
東京工業大学 数理・計算科学専攻 松岡研究室