
starpu-devel - Re: [Starpu-devel] Fwd: Strong interest in contributions and integrations

  • From: Fangli Pi <hpcfapix@hlrs.de>
  • To: Andra Hugo <andra.hugo@inria.fr>
  • Cc: starpu-devel <starpu-devel@lists.gforge.inria.fr>, Dennis Hoppe <dennis.hoppe@hlrs.de>, Dmitry Khabi <khabi@hlrs.de>, Michael Gienger <gienger@hlrs.de>
  • Subject: Re: [Starpu-devel] Fwd: Strong interest in contributions and integrations
  • Date: Mon, 23 May 2016 17:26:01 +0200 (CEST)
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi Andra,

Thanks very much for your reply. Although I cannot answer all the questions
right now, they are all very valuable, and we will take the time to work in
the direction you suggest.

Here is what I can answer now:

1. I did feed energy in joules back into profiling_info->power_consumed,
because the StarPU handbook says that "double starpu_task_expected_power()
returns expected power consumption in J", and I thought the two needed to be
consistent. (Am I right?)

2. I didn't directly use the times (start_time, submit_time, etc.) in
profiling_info because they are relative to the initialization time of
StarPU, whereas I need the absolute date and time. If StarPU could provide
this, it would be great.

3, 4, 5, 6. These reviews are helpful, and I will make the corresponding
changes soon.

7. The calls to sleep are due to delays in the communication and in the
database. I will try to reduce them by increasing the update rate of the
database.

8. The monitoring framework we use is a lightweight, near-real-time tool.
There is a paper introducing it, which includes a paragraph on measuring its
overhead. Please find the paper attached. Since the monitoring framework is
still under development, its overhead (for the PAPI-based plugin) has since
been reduced from 2.0%+ to 1.7%+ for an update rate of 100ms, and from 18.1%
to 14.2% for an update rate of 10ms.

9. It is true that the energy measurement cannot be separated per CPU core,
so if multiple tasks are executed in parallel, the retrieved energy data is
the energy consumed by all CPUs and GPUs. One possible solution is to treat a
task whose power data is being profiled, and whose power model is being
calibrated, as being in a training mode rather than an execution mode; such
tasks could only be executed one at a time, not in parallel. Once the tasks
are calibrated, it is possible to execute them in parallel.

10. This is an interesting point, which we haven't thought of before. Maybe
we can study StarPU's threads first and see how hard it is to modify the
monitoring framework accordingly. It would also be nice if you could provide
more information in this direction.
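
On point 2, one way to obtain absolute dates from timestamps that are
relative to StarPU's initialization is to record the wall-clock time once,
right after initialization, and add it to each relative timestamp. The
following is a minimal sketch, assuming the profiling timestamps are struct
timespec values; to_absolute and init_wallclock are hypothetical names used
for illustration, not part of the StarPU API.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper (not StarPU API): adds a timestamp that is relative
 * to runtime initialization to the wall-clock time recorded at
 * initialization, and returns the absolute time. */
struct timespec to_absolute(struct timespec init_wallclock,
                            struct timespec relative)
{
    struct timespec absolute;

    /* Both inputs are expected to be normalized timespec values. */
    assert(init_wallclock.tv_nsec >= 0 && init_wallclock.tv_nsec < NSEC_PER_SEC);
    assert(relative.tv_nsec >= 0 && relative.tv_nsec < NSEC_PER_SEC);

    absolute.tv_sec = init_wallclock.tv_sec + relative.tv_sec;
    absolute.tv_nsec = init_wallclock.tv_nsec + relative.tv_nsec;
    if (absolute.tv_nsec >= NSEC_PER_SEC) {
        /* Carry the overflowing nanoseconds into the seconds field. */
        absolute.tv_sec += 1;
        absolute.tv_nsec -= NSEC_PER_SEC;
    }
    return absolute;
}
```

Here init_wallclock would be read with clock_gettime(CLOCK_REALTIME, ...) as
close as possible to the initialization call; the result is only as accurate
as the gap between that reading and the runtime's own time origin.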

All in all, I will update the repo according to your reviews and I will let
you know once it is ready.

Best regards,
Fangli

----- Original Message -----
From: "Andra Hugo" <andra.hugo@inria.fr>
To: "Fangli Pi" <hpcfapix@hlrs.de>
Cc: "starpu-devel" <starpu-devel@lists.gforge.inria.fr>, "Dennis Hoppe"
<dennis.hoppe@hlrs.de>, "Dmitry Khabi" <khabi@hlrs.de>, "Michael Gienger"
<gienger@hlrs.de>
Sent: Monday, May 23, 2016 12:07:12 PM
Subject: Re: [Starpu-devel] Fwd: Strong interest in contributions and
integrations

Hi Fangli,

I looked into the code you provided and I have a few suggestions/questions. I
will separate them into two parts. Let me know if they make sense and if I
can help with anything.

A. Code reviewing

1. Do you measure energy or power? Maybe the field names don't match
properly (you store the energy value in the power field).
2. The task already has a structure called profiling_info with several
fields (start_time, submit_time, etc.), so there is no need to add one per
job.
3. We have a source file, driver_common.c, with some general functions used
by all the drivers. That is where we update all the profiling information,
and it would be a more appropriate place to update the energy and avoid code
duplication.
4. It would be good to have a configure option that activates these
measurements when the user requires them.
5. It might be better to integrate the code of ex_starpu_ini into
starpu_profiling_init. I am, however, a little worried about interfering
with the application's command-line parameters. We usually use environment
variables for StarPU runtime options.
6. Counting all the sockets each time you measure the power might be
unnecessary overhead; their number won't change at runtime (I don't think
StarPU deals with any fault-tolerance problems).
7. In starpu_ex.c there are some calls to sleep. Are they really necessary?
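
Points 5 and 6 can be illustrated with a small sketch: a runtime switch read
from an environment variable instead of the command line, and a socket count
that is computed once and then reused. Everything named here is an
assumption made for illustration: EX_ENERGY_MEASUREMENT is a hypothetical
variable name (not an existing StarPU option), and probe_sockets stands in
for whatever topology query the extension really uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name, not an existing StarPU option. */
#define ENERGY_ENV_VAR "EX_ENERGY_MEASUREMENT"

/* Returns 1 when the value is set to anything other than "" or "0". */
int option_enabled(const char *value)
{
    return value != NULL && value[0] != '\0' && strcmp(value, "0") != 0;
}

/* Reads the switch from the environment, leaving argv untouched. */
int energy_measurement_enabled(void)
{
    return option_enabled(getenv(ENERGY_ENV_VAR));
}

/* Stand-in for the real topology query; counts how often it runs. */
int probe_calls = 0;
int probe_sockets(void)
{
    probe_calls++;
    return 2;   /* pretend the machine has two sockets */
}

static int cached_sockets = -1;   /* -1 means "not counted yet" */

/* Calls the (possibly expensive) counter only on first use.
 * Not thread-safe as written; call it once during initialization. */
int socket_count(int (*count_sockets)(void))
{
    assert(count_sockets != NULL);
    if (cached_sockets < 0)
        cached_sockets = count_sockets();
    return cached_sockets;
}
```

With this shape, every power measurement after the first reuses the cached
value, and the application's own command-line parsing is never involved.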

B. Performance

8. Do you have some measurements or a paper that shows the overhead of your
approach? The client-server communication seems to add a lot of overhead
exactly when we pop tasks, which is the most performance-sensitive part of
StarPU.

9. It seems that you measure the power of all CPU cores and GPUs after the
execution of each task (ex_starpu_get_power(j->ex_start_ts, j->ex_end_ts,
ALL_POWER)). Am I correct? How does this work for a parallel application,
when another processor executes another task?

10. This is more of a suggestion: in order to monitor the processors, you
create a set of threads to take care of this. StarPU also has its own set of
threads. Letting the OS deal with the context switches might affect
performance. Did you consider using StarPU's threads instead? I understand
that your client-server approach is definitely a better software engineering
solution, and that it can then be easily integrated into any software
without modifying the code, but the overhead might be significant.
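
The energy-versus-power distinction raised in points 1 and 9 comes down to
integrating power samples over the task's start and end timestamps. The
sketch below shows one common way to do that (trapezoidal integration); it
is not taken from the extension's code, and struct power_sample and
energy_joules are hypothetical names. Note that if the samples cover the
whole machine, the result is the energy of everything that ran during the
interval, not only of the task being profiled.

```c
#include <assert.h>
#include <stddef.h>

/* One reading from a hypothetical monitoring framework. */
struct power_sample {
    double t_s;     /* timestamp in seconds */
    double watts;   /* instantaneous power in W */
};

/* Energy in joules over the sampled interval, computed by trapezoidal
 * integration of power over time. Samples must be sorted by timestamp.
 * Fewer than two samples yield 0, since no interval is covered. */
double energy_joules(const struct power_sample *samples, size_t n)
{
    double joules = 0.0;

    assert(samples != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        double dt = samples[i].t_s - samples[i - 1].t_s;
        assert(dt >= 0.0);   /* timestamps must not go backwards */
        joules += 0.5 * (samples[i].watts + samples[i - 1].watts) * dt;
    }
    return joules;
}
```

For example, samples of 10 W, 10 W and 20 W taken one second apart give
10 J for the first second and 15 J for the second, 25 J in total.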

That's all I can see for now. Please let me know if you have any questions or
if I can help in any way.

Best,
Andra

----- Original Message -----
> From: "Fangli Pi" <hpcfapix@hlrs.de>
> To: starpu-devel@lists.gforge.inria.fr
> Cc: "Dennis Hoppe" <dennis.hoppe@hlrs.de>, "Dmitry Khabi" <khabi@hlrs.de>,
> "Michael Gienger" <gienger@hlrs.de>
> Sent: Monday, May 9, 2016 12:15:03
> Subject: [Starpu-devel] Fwd: Strong interest in contributions and
> integrations
>
> Dear StarPU devel-group,
>
> Sorry for bothering you again about the mail I sent two weeks ago.
>
> As I am afraid it may have been delayed by mail barriers or holidays
> during these two weeks, I would like to restate our project team's deep
> interest in contributing to StarPU.
>
> In these two weeks we have succeeded in integrating the energy measurements
> into StarPU (version 1.2.0rc5) for the CPU drivers (a first step towards the
> integration). The changes to the StarPU source can be found at the following
> link:
> http://gitlab.excess-project.eu/hpcfapix/starpu-ex-1-2-0rc5.git
>
> We are keeping the development private for now, and we would be grateful
> for your permission to make it public once all development is done. We are
> still very interested in merging the extension into StarPU, and hopefully it
> would also be a usable feature for StarPU.
>
> Sincerely looking forward to your reply.
>
> Best regards,
> Fangli Pi
>
> ----- Forwarded Message -----
> From: "Fangli Pi" <hpcfapix@hlrs.de>
> To: "starpu-devel" <starpu-devel@lists.gforge.inria.fr>
> Cc: "Michael Gienger" <gienger@hlrs.de>, "Dennis Hoppe"
> <dennis.hoppe@hlrs.de>, "Dmitry Khabi" <khabi@hlrs.de>
> Sent: Monday, April 25, 2016 10:27:27 AM
> Subject: Strong interest in contributions and integrations
>
> Dear StarPU devel-group,
>
> I am now working on an EU project (EXCESS, http://excess-project.eu/) using
> StarPU as a runtime system. Because the project focuses on
> energy-efficient computation, we have tried to extend the StarPU power model
> to retrieve energy and power measurements from a monitoring framework
> developed by our team.
>
> Here is the link to our StarPU power extension, including the source
> code, some examples and a basic introduction.
> https://github.com/excess-project/starpu-energy-aware-extension.git
>
> As we are currently thinking about an integration with StarPU, it would be
> very beneficial to get your professional opinion and relevant support on
> the StarPU side.
>
> Looking forward to your reply and a possible cooperation in the future.
>
> Best regards,
> Fangli Pi
>
> --
> High Performance Computing Center (HLRS)
> University of Stuttgart
> Nobelstr. 19
> D-70569 Stuttgart, Germany
>
> phone: ++49-711-685-60442
>
--
High Performance Computing Center (HLRS)
University of Stuttgart
Nobelstr. 19
D-70569 Stuttgart, Germany

phone: ++49-711-685-60442

Attachment: feedback-monitoring-framework-2015.pdf
Description: Adobe PDF document



