
starpu-devel - Re: [Starpu-devel] StarPU+SimGrid: FetchingInput computation

Subject: Developers list for StarPU

List archives

Re: [Starpu-devel] StarPU+SimGrid: FetchingInput computation


  • From: Mirko Myllykoski <mirkom@cs.umu.se>
  • To: Luka Stanisic <luka.stanisic@inria.fr>
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: Re: [Starpu-devel] StarPU+SimGrid: FetchingInput computation
  • Date: Wed, 14 Dec 2016 10:22:35 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi Luka,

Just to clarify my previous email: I do not doubt SimGrid's ability to accurately predict execution times in general. However, the numerical code I am developing has some features which I believe will make the simulation harder (a large number of small tasks, complicated data dependencies, varying sensitivity to parameter value changes, etc.).

Best Regards,
Mirko

On 2016-12-14 09:52, Mirko Myllykoski wrote:
Hi Luka,

Here are the two paje traces you requested:

https://dl.dropboxusercontent.com/u/1521774/paje.trace.tar.gz
https://dl.dropboxusercontent.com/u/1521774/paje-simgrid.trace.tar.gz

I must say that the way I am using the starpu_perfmodel::size_base
field is a bit unorthodox. That's why I ran the simulation twice, once
with the size_base field and once without it. My ultimate intention is
to autotune my code using an external black-box tool. However, the
execution time may vary from a few seconds to hours depending on the
input data and various parameters. I hope that SimGrid will help with
this problem by saving countless CPU hours.
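
To give an idea of what I mean by unorthodox, my usage is roughly along
the lines of the sketch below (this is not my actual code; the packed
argument and the quantity returned are just placeholders):

  #include <starpu.h>

  /* Sketch: feed the regression model a "size" derived from a task
   * parameter instead of the size of the registered data handles.
   * Assumes the parameter was packed into cl_arg (e.g. via STARPU_VALUE). */
  static size_t my_size_base(struct starpu_task *task, unsigned nimpl)
  {
      (void)nimpl;
      size_t n;
      starpu_codelet_unpack_args(task->cl_arg, &n);
      return n * n; /* whatever quantity the execution time scales with */
  }

  static struct starpu_perfmodel my_model =
  {
      .type = STARPU_REGRESSION_BASED,
      .symbol = "my_codelet",
      .size_base = my_size_base,
  };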

Right now, I am trying to figure out whether this idea is feasible. I
realized that a linear regression model can predict the codelet
execution times quite accurately provided that the input data does not
change too much and the parameters are kept constant. These regression
models cannot be used to autotune anything, but if SimGrid fails to
predict the total execution time even in this simple case, then I can be
quite sure that the overall idea does not work and I should try
something else instead.

Best Regards,
Mirko

On 2016-12-13 16:40, Luka Stanisic wrote:
Hi Mirko,

Indeed, I was wondering if your platform has any GPUs, but as you said
it is a simple 4-core machine. Adding more CPUs or GPUs in the future
shouldn't be a problem.

You are right, SimGrid shouldn't add any significant fetching time
to the simulation since everything is running in shared memory.
However, the FetchingInput state can still appear in the traces,
since StarPU passes through many parts of the code. Still, the
duration of FetchingInput should be negligible.

Could you please share two paje.trace traces (one for real execution
and one for SimGrid), so I can try to understand better what is
happening? If the traces are big (>100 MB), it might be better to run
your application with a smaller problem size (if possible).


Also from what I have seen, you are using STARPU_REGRESSION_BASED or
STARPU_NL_REGRESSION_BASED performance models for your codelets,
right? Is this something that you need for your application?
Personally, I have never tried to simulate applications using these
models, although I don't see any reason why it shouldn't work. The
starpu_perfmodel::size_base field is actually used by these models;
more information is available here:
http://starpu.gforge.inria.fr/doc/html/OnlinePerformanceTools.html
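
For reference, a codelet using one of these models would look roughly
like the sketch below (the names are placeholders; if I remember
correctly, the linear model fits a*n^b and the non-linear one a*n^b + c,
where n is the data size or whatever size_base returns):

  #include <starpu.h>

  static void my_cpu_func(void *buffers[], void *cl_arg)
  {
      (void)buffers; (void)cl_arg;
      /* ... actual kernel ... */
  }

  static struct starpu_perfmodel my_model =
  {
      .type = STARPU_NL_REGRESSION_BASED, /* or STARPU_REGRESSION_BASED */
      .symbol = "my_codelet",
  };

  static struct starpu_codelet my_codelet =
  {
      .cpu_funcs = { my_cpu_func },
      .nbuffers = 1,
      .modes = { STARPU_RW },
      .model = &my_model,
  };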


So my first guess is that you are somehow using the codelet perfmodels
and their size_base incorrectly (or there is an unknown bug in the
StarPU or StarPU+SimGrid code), which makes the simulation longer than
expected. In the traces this then shows up as long FetchingInput states,
even though fetching inputs has nothing to do with the actual problem.


Best regards,
Luka
On 13/12/2016 14:34, Mirko Myllykoski wrote:
Hi Luka,

and thank you for your reply.

I performed the same experiment twice, once with the size_base field included and once without it. I erased the samples directory before each experiment and gave it a few rounds to calibrate properly (STARPU_CALIBRATE=1). Here are the corresponding sample folders:

https://dl.dropboxusercontent.com/u/1521774/sampling_with_size_base.tar.gz
https://dl.dropboxusercontent.com/u/1521774/sampling_without_size_base.tar.gz

In this case, the error seems to be about 35%.

As I mentioned in my previous email, the code is shared memory only (at the moment). I performed the experiment on my local machine (quad-core i5) but my plan is to move on to a bigger machine (28 or 42 cores per node) and distributed memory once everything works.

I don't quite understand why SimGrid would add any fetching time to the simulation, since everything is running in shared memory.

Best Regards,
Mirko

On 2016-12-12 18:21, Luka Stanisic wrote:
Hello Mirko,

Indeed, a 50% prediction error is quite big and it suggests that
something is probably not configured correctly. Could you please send
us a compressed version of your ".starpu/sampling" folder, the one from
which the simulation will read the performance models? This can help us
get a first idea of the machine and application you are trying to
simulate.

To answer your question, the fetching time is computed based on the
size of the data being transferred, the latency and bandwidth of the
link (from the machine.platform.xml file), and the possible contention
due to other transfers occurring in parallel.
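
In rough terms, a single transfer without contention costs about
latency + size / bandwidth, with the latency and bandwidth values taken
from that platform file; contention then appears as an extra slowdown
when several transfers share the same link.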

Best regards,
Luka

On 07/12/2016 12:50, Mirko Myllykoski wrote:
Hi,

my name is Mirko Myllykoski and I work as a PostDoc researcher for the NLAFET project at Umeå University.

I am currently implementing (shared-memory) numerical software using StarPU and I am trying to simulate my code using SimGrid. However, I noticed that the simulated execution time is way off (by about 50%). I checked the generated FxT traces using vite and it seems that SimGrid introduces too much fetching time (state: FetchingInput) into the simulation.

How is this fetching time being computed? My performance models include the starpu_perfmodel::size_base data field and I guess that information is somehow used to compute the fetch time.

Best Regards,
Mirko Myllykoski
_______________________________________________
Starpu-devel mailing list
Starpu-devel@lists.gforge.inria.fr
http://lists.gforge.inria.fr/mailman/listinfo/starpu-devel



