
[Starpu-devel] StarPU's limitations


  • From: Chris Hennick <christopherhe@trentu.ca>
  • To: starpu-devel@lists.gforge.inria.fr
  • Subject: [Starpu-devel] StarPU's limitations
  • Date: Tue, 21 Feb 2012 04:58:04 -0500
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

Hi all,

I ran across StarPU earlier this evening while doing research for my master's thesis. My original plan for the thesis project was to create a system (which I gave the working title "HPHPU", for Heterogeneous Processes on Heterogeneous Processing Units) that could automatically optimize the assignment of multiple tasks across a heterogeneous computing system. Ideally, it was to be scalable from smartphones (where plenty of GPGPU potential already goes untapped) to distributed exascale clusters, and ready for a future explosion of new accelerator paradigms (since those may be the only way to keep processing power growing exponentially once transistor density reaches its limit).

Having seen StarPU, I now suspect it does most of what I wanted to do. Thus, I'm starting to think my project should instead be to study StarPU's performance, compare it with other systems, and contribute some enhancements. The impression I get from the manual is that StarPU's main limitations are these:
  • New drivers can't be added modularly, because they each have a different *_funcs[] member in struct starpu_codelet.
  • StarPU-MPI, which seems to be the only framework available for distributed systems, can't transparently handle replication or failover, and performance would probably suffer if these were implemented at a lower level (e.g. by running on top of MOSIX).
  • It can't optimize the choice of scheduling policy or performance models based on the trade-off between accuracy and overhead.
  • It can't dynamically switch an Nvidia GPU between CUDA and OpenCL (or, I assume, a CPU between OpenCL and native mode). 
  • It won't support "CPU-assisted GPGPU" (see http://arstechnica.com/business/news/2012/02/researchers-boost-processor-performance-by-getting-cpu-and-gpu-to-collaborate.ars), because parallel tasks can't specify that a particular implementation requires a shared cache and a split across multiple specific drivers (in this case, CPU and OpenCL).
  • In power-based scheduling, there's no provision for dynamically adjusting gamma (e.g. if we're on a smart-metered power grid, or a laptop that's sometimes unplugged), nor for optional best-effort tasks that aren't always worth their power cost.
  • The same StarPU instance can't securely serve multiple users.
Have I missed any, or gotten any wrong? Which of them does the team consider high priorities, and which are being worked on? I don't want to duplicate effort, and I'd like my thesis to be useful to the "real world". Thanks in advance.

Sincerely,
Chris Hennick
Trent University
Peterborough, ON, Canada
http://softwetware.blogspot.com/




Archives managed by MHonArc 2.6.19+.
