
starpu-devel - [Starpu-devel] starpu v1.1.0 released

List subject: Developers list for StarPU




  • From: Nathalie Furmento <nathalie.furmento@labri.fr>
  • To: starpu-announce@lists.gforge.inria.fr
  • Cc: starpu-devel@lists.gforge.inria.fr
  • Subject: [Starpu-devel] starpu v1.1.0 released
  • Date: Wed, 18 Dec 2013 13:33:42 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel>
  • List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>

All,

The StarPU team is pleased to announce the release of v1.1.0, the scheduling
context release.

https://gforge.inria.fr/frs/?group_id=1570

This release notably introduces scheduling contexts, which make it possible
to separate computation resources.

If you have not tested the release candidates, note that the public API has
been modified compared to the 1.0.x releases. The script tools/dev/rename.sh
is provided to update your existing applications to use the new names. It is
also possible to compile with the pkg-config package starpu-1.0 to keep using
the old names. It is however recommended to update your code and to use the
package starpu-1.1.

Please test and let us know if you find any issues.

Nathalie, for the StarPU team.

==============================================
StarPU 1.1.0 (svn revision 11960)
The scheduling context release

New features:
* OpenGL interoperability support.
* Capability to store compiled OpenCL kernels on the file system
* Capability to load compiled OpenCL kernels
* Performance models measurements can now be provided explicitly by
applications.
* Capability to emit communication statistics when running MPI code
* Add starpu_unregister_submit, starpu_data_acquire_on_node and
starpu_data_invalidate_submit
* New functionality in the wrapper starpu_insert_task to pass an array of
data_handles via the parameter STARPU_DATA_ARRAY
* Enable GPU-GPU direct transfers.
* GCC plug-in
- Add `registered' attribute
- A new pass was added that warns about the use of possibly
unregistered memory buffers.
* SOCL
- Manual mapping of commands on specific devices is now
possible
- SOCL does not require StarPU CPU tasks anymore. CPU workers
are automatically disabled to enhance performance of OpenCL
CPU devices
* New interface: COO matrix.
* Data interfaces: The pack operation of user-defined data interfaces
takes a new parameter, count, which should be set to the size of
the buffer created by the packing of the data.
* MPI:
- Communication statistics for MPI can only be enabled at
execution time by defining the environment variable
STARPU_COMM_STATS
- Communication cache mechanism is enabled by default, and can
only be disabled at execution time by setting the
environment variable STARPU_MPI_CACHE to 0.
- Initialisation functions starpu_mpi_initialize_extended()
and starpu_mpi_initialize() have been deprecated. One
should now use starpu_mpi_init(int *, char ***, int). The
last parameter indicates if MPI should be initialised.
- Collective detached operations have new parameters, a
callback function and an argument. This is to be consistent
with the detached point-to-point communications.
- When exchanging user-defined data interfaces, the size of
the data is the size returned by the pack operation, i.e.
data with dynamic size can now be exchanged with StarPU-MPI.
* Add experimental simgrid support, to simulate execution with various
numbers of CPUs, GPUs, amounts of memory, etc.
* Add support for OpenCL simulators (which provide simulated execution time)
* Add support for Temanejo, a task graph debugger
* Theoretical bound lp output now includes data transfer time.
* Update OpenCL driver to only enable CPU devices (the environment
variable STARPU_OPENCL_ONLY_ON_CPUS must be set to a positive
value when executing an application)
* Add Scheduling contexts to separate computation resources
- Scheduling policies take into account the set of resources
corresponding to the context they belong to
- Add support to dynamically change scheduling contexts
(Create and Delete a context, Add Workers to a context, Remove
workers from a context)
- Add support to indicate to which contexts the tasks are submitted
* Add the Hypervisor to manage the Scheduling Contexts automatically
- The Contexts can be registered to the Hypervisor
- Only the registered contexts are managed by the Hypervisor
- The Hypervisor can detect the initial distribution of resources of
a context and construct it accordingly (the cost of execution is
required)
- Several policies can dynamically adapt the distribution of resources
in contexts if the initial one was not appropriate
- Add a platform to implement new policies of redistribution
of resources
* Implement a memory manager which checks the global amount of
memory available on devices, and checks there is enough memory
before doing an allocation on the device.
* Discard environment variable STARPU_LIMIT_GPU_MEM and define
instead STARPU_LIMIT_CUDA_MEM and STARPU_LIMIT_OPENCL_MEM
* Introduce new variables STARPU_LIMIT_CUDA_devid_MEM and
STARPU_LIMIT_OPENCL_devid_MEM to limit memory per specific device
* Introduce new variable STARPU_LIMIT_CPU_MEM to limit memory for
the CPU devices
* New function starpu_malloc_flags to define a memory allocation with
constraints based on the following values:
- STARPU_MALLOC_PINNED specifies memory should be pinned
- STARPU_MALLOC_COUNT specifies the memory allocation should be in
the limits defined by the environment variables STARPU_LIMIT_xxx
(see above). When no memory is left, starpu_malloc_flags tries
to reclaim memory from StarPU and returns -ENOMEM on failure.
* starpu_malloc calls starpu_malloc_flags with a value of flag set
to STARPU_MALLOC_PINNED
* Define new function starpu_free_flags similarly to starpu_malloc_flags
* Define new public API starpu_pthread which is similar to the
pthread API. It is provided with 2 implementations: a pthread one
and a Simgrid one. Applications using StarPU and wishing to use
the Simgrid StarPU features should use it.
* Allow a dynamically allocated number of buffers per task, thereby
overriding the value defined by --enable-maxbuffers=XXX
* Performance model files are now stored in a directory whose name
includes the version of the performance model format. The version
number is also written in the file itself.
When updating the format, the internal variable
_STARPU_PERFMODEL_VERSION should be updated. It is then possible
to switch easily between different versions of StarPU having
different performance model formats.
* Tasks can now define an optional prologue callback which is executed
on the host when the task becomes ready for execution, before getting
scheduled.
* Small CUDA allocations (<= 4MiB) are now batched to avoid the huge
cudaMalloc overhead.
* Prefetching is now done for all schedulers whenever it can be done
regardless of the scheduling decision.
* Add a watchdog which makes it easy to trigger a crash when StarPU gets
stuck.
* Document how to migrate data over MPI.
* New function starpu_wakeup_worker() to be used by schedulers to
wake up a single worker (instead of all workers) when submitting a
single task.
* The functions starpu_sched_set/get_min/max_priority set/get the
priorities of the current scheduling context, i.e. the one which
was set by a call to starpu_sched_ctx_set_context() or the initial
context if the function has not been called yet.
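
The counted-allocation behaviour described for STARPU_MALLOC_COUNT above
(refuse an allocation that would exceed a configured limit and report
-ENOMEM) can be illustrated with a small stand-alone sketch. This is not
StarPU code: struct counted_pool, counted_malloc and counted_free are
hypothetical names, the limit is passed in directly rather than read from a
STARPU_LIMIT_xxx variable, and the step where StarPU tries to reclaim memory
before failing is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical accounting state; not a StarPU type. */
struct counted_pool {
    size_t limit; /* maximum number of bytes that may be live at once */
    size_t used;  /* bytes currently accounted for (always <= limit) */
};

/* Allocate `size` bytes only if the pool limit allows it.
 * Returns 0 on success and -ENOMEM when the limit would be exceeded
 * or when malloc itself fails. */
static int counted_malloc(struct counted_pool *pool, void **out, size_t size)
{
    assert(pool != NULL && out != NULL);
    *out = NULL;

    /* used <= limit holds, so the subtraction cannot wrap around. */
    if (size > pool->limit - pool->used)
        return -ENOMEM;

    void *p = malloc(size);
    if (p == NULL && size != 0)
        return -ENOMEM;

    pool->used += size;
    *out = p;
    return 0;
}

/* Release a buffer and give its bytes back to the pool. The caller
 * passes the size again so that no per-buffer header is needed. */
static void counted_free(struct counted_pool *pool, void *ptr, size_t size)
{
    assert(pool != NULL && size <= pool->used);
    free(ptr);
    pool->used -= size;
}
```

With a pool limited to 1024 bytes, a first request for 1000 bytes succeeds
and a following request for 100 bytes is refused until the first buffer is
released.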

Small features:
* Add starpu_worker_get_by_type and starpu_worker_get_by_devid
* Add starpu_fxt_stop_profiling/starpu_fxt_start_profiling which make it
possible to pause trace recording.
* Add trace_buffer_size configuration field to specify the tracing
buffer size.
* Add starpu_codelet_profile and starpu_codelet_histo_profile, tools which
draw the profile of a codelet.
* File STARPU-REVISION --- containing the SVN revision number from which
StarPU was compiled --- is installed in the share/doc/starpu directory
* starpu_perfmodel_plot can now directly draw GFlops curves.
* New configure option --enable-mpi-progression-hook to enable the
activity polling method for StarPU-MPI.
* Allow sequential consistency to be disabled for a given task.
* New macro STARPU_RELEASE_VERSION
* New function starpu_get_version() to return as 3 integers the
release version of StarPU.
* Enable the data allocation cache by default
* New function starpu_perfmodel_directory() to print the directory
storing performance models. Available through the new option -d of
the tool starpu_perfmodel_display
* New batch files to execute StarPU applications under Microsoft
Visual Studio (They are installed in path_to_starpu/bin/mvsc).
* Add cl_arg_free, callback_arg_free, prologue_callback_arg_free fields to
enable automatic free(cl_arg); free(callback_arg);
free(prologue_callback_arg) on task destroy.
* New function starpu_task_build

Changes:
* Rename all filter functions to follow the pattern
starpu_DATATYPE_filter_FILTERTYPE. The script
tools/dev/rename_filter.sh is provided to update your existing
applications to use the new filter function names.
* Renaming of various functions and datatypes. The script
tools/dev/rename.sh is provided to update your existing
applications to use the new names. It is also possible to compile
with the pkg-config package starpu-1.0 to keep using the old
names. It is however recommended to update your code and to use
the package starpu-1.1.

* Fix the block filter functions.
* Fix StarPU-MPI on Darwin.
* The FxT code can now be used on systems other than Linux.
* Keep only one hashtable implementation common/uthash.h
* The cache of starpu_mpi_insert_task is fixed and thus now enabled by
default.
* Improve starpu_machine_display output.
* Standardize object names in the performance model API
* SOCL
- Virtual SOCL device has been removed
- Automatic scheduling still available with command queues not
assigned to any device
- Remove modified OpenCL headers. ICD is now the only supported
way to use SOCL.
- SOCL test suite is only run when environment variable
SOCL_OCL_LIB_OPENCL is defined. It should contain the location
of the libOpenCL.so file of the OCL ICD implementation.
* Fix main memory leak on multiple unregister/re-register.
* Improve hwloc detection by configure
* Cell:
- It is no longer possible to enable the cell support via the
gordon driver
- Data interfaces no longer define functions to copy to and from
SPU devices
- Codelets no longer define pointers for Gordon implementations
- Gordon workers are no longer enabled
- Gordon performance models are no longer enabled
* Fix data transfer arrows in paje traces
* The "heft" scheduler no longer exists. Users should now pick "dmda"
instead.
* StarPU can now use poti to generate paje traces.
* Rename scheduling policy "parallel greedy" to "parallel eager"
* starpu_scheduler.h is no longer automatically included by
starpu.h, it has to be manually included when needed
* New batch files to run StarPU applications with Microsoft Visual C
* Add examples/release/Makefile to test StarPU examples against an
installed version of StarPU. That can also be used to test
examples using a previous API.
* Tutorial is installed in ${docdir}/tutorial
* Schedulers eager_central_policy, dm and dmda no longer erroneously respect
priorities. dmdas has to be used to respect priorities.
* StarPU-MPI: Fix potential bug for user-defined datatypes. As MPI
can reorder messages, we need to make sure the sending of the size
of the data has been completed.
* Documentation is now generated through doxygen.
* Modification of perfmodels output format for future improvements.
* Fix for properly dealing with NAN on Windows systems
* Function starpu_sched_ctx_create() now takes a variable argument
list to define the scheduler to be used, and the minimum and
maximum priority values
* The functions starpu_sched_set/get_min/max_priority set/get the
priorities of the current scheduling context, i.e. the one which
was set by a call to starpu_sched_ctx_set_context() or the initial
context if the function was not called yet.
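
The variable argument list now taken by starpu_sched_ctx_create() follows a
common C pattern: optional settings are passed as tag/value pairs and the
list is closed by a terminator. The sketch below shows that pattern only.
The tags, struct ctx_config, ctx_config_parse and the default values are
all hypothetical; refer to the StarPU documentation for the actual macros
and signature.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical tags; CTX_END (0) terminates the argument list. */
enum { CTX_END = 0, CTX_POLICY_NAME = 1, CTX_MIN_PRIO = 2, CTX_MAX_PRIO = 3 };

/* Hypothetical result of parsing the optional settings. */
struct ctx_config {
    const char *policy; /* scheduler name */
    int min_prio;       /* minimum priority value */
    int max_prio;       /* maximum priority value */
};

/* Fill `cfg` from tag/value pairs terminated by CTX_END.
 * Returns 0 on success and -1 when an unknown tag is met. */
static int ctx_config_parse(struct ctx_config *cfg, ...)
{
    va_list ap;
    int tag;
    int ret = 0;

    assert(cfg != NULL);

    /* Defaults used when a tag is absent (values chosen for the sketch). */
    cfg->policy = "eager";
    cfg->min_prio = 0;
    cfg->max_prio = 0;

    va_start(ap, cfg);
    while (ret == 0 && (tag = va_arg(ap, int)) != CTX_END) {
        switch (tag) {
        case CTX_POLICY_NAME:
            cfg->policy = va_arg(ap, const char *);
            break;
        case CTX_MIN_PRIO:
            cfg->min_prio = va_arg(ap, int);
            break;
        case CTX_MAX_PRIO:
            cfg->max_prio = va_arg(ap, int);
            break;
        default:
            ret = -1; /* unknown tag: stop, its value type is not known */
            break;
        }
    }
    va_end(ap);
    return ret;
}
```

A call such as ctx_config_parse(&cfg, CTX_POLICY_NAME, "dmda",
CTX_MIN_PRIO, -5, CTX_MAX_PRIO, 5, CTX_END) sets all three fields, while
omitted tags keep their defaults.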

Small changes:
* STARPU_NCPU should now be used instead of STARPU_NCPUS. STARPU_NCPUS is
still available for compatibility reasons.
* include/starpu.h includes all include/starpu_*.h files, applications
therefore only need to have #include <starpu.h>
* Active task wait is now included in blocked time.
* Fix GCC plugin linking issues starting with GCC 4.7.
* Fix forcing calibration of never-calibrated archs.
* CUDA applications are no longer compiled with the "-arch sm_13"
option. It is specifically added to applications which need it.
* Explicitly name the non-sleeping-non-running time "Overhead", and use
another color in vite traces.
* Use C99 variadic macro support, not GNU.
* Fix performance regression: dmda queues were inadvertently made
LIFOs in r9611.
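
For readers unfamiliar with the difference behind "Use C99 variadic macro
support, not GNU" above: the GNU extension names the variable arguments
(args...), whereas C99 uses a bare ... together with __VA_ARGS__. A minimal
sketch follows; FORMAT_MSG and format_revision are hypothetical names and
do not appear in StarPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* GNU extension (named variadic parameter), shown for comparison only:
 *   #define FORMAT_MSG(buf, size, args...) snprintf((buf), (size), args)
 *
 * C99 form: the variable part is written "..." and expands through
 * __VA_ARGS__. The format string is part of the variable arguments, so
 * at least one argument is always supplied. */
#define FORMAT_MSG(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Write "r<revision>" into buf and return the number of characters
 * that the full string needs (excluding the terminating NUL). */
static int format_revision(char *buf, size_t size, int revision)
{
    assert(buf != NULL && size > 0);
    return FORMAT_MSG(buf, size, "r%d", revision);
}
```

The C99 form is accepted by any conforming C99 compiler, which is
presumably the motivation for the change on non-GCC toolchains.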




