Subject: Developers list for StarPU
- From: Johan Mazel <johan.mazel@gmail.com>
- To: Samuel Thibault <samuel.thibault@inria.fr>, Johan Mazel <johan.mazel@gmail.com>, starpu-devel@lists.gforge.inria.fr
- Subject: Re: [Starpu-devel] Implementation question regarding output type
- Date: Thu, 16 Jan 2020 00:05:55 +0100
- List-archive: <http://lists.gforge.inria.fr/pipermail/starpu-devel/>
- List-id: "Developers list. For discussion of new features, code changes, etc." <starpu-devel.lists.gforge.inria.fr>
Hello,
Here is the minimal example in which I try to use STARPU_VECTOR_SET_NX in a CUDA codelet.
The initial data is a vector whose values are equal to their index in the vector. E.g., a vector of length 8 will be [0,1,2,3,4,5,6,7].
The program adds 100 to each odd element. The result is [0,101,2,103,4,105,6,107].
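For reference, the expected transformation can be written as a plain host-side C sketch. This is not the code from the attachment; the function names fill_with_index and add_100_to_odd are hypothetical and only illustrate the data described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set v[i] = i, matching the initial data described above. */
static void fill_with_index(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        v[i] = (int)i;
}

/* Hypothetical host-side reference: add 100 to every element at an odd index. */
static void add_100_to_odd(int *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 1; i < n; i += 2)
        v[i] += 100;
}
```

Calling fill_with_index and then add_100_to_odd on a vector of length 8 gives the result listed above. Since each value equals its index, "odd element" and "odd index" select the same entries.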
I want to bring only the first n elements of the partitioned vector back to main memory. In the attached example, n is picked randomly, but in my real use case n depends on a result of the CUDA kernel code.
If I understand correctly the way CUDA codelets are structured, STARPU_VECTOR_SET_NX should be called at the end of the function that calls the CUDA kernel. It cannot be called inside the kernel because (I paraphrase nvcc) a __host__ function cannot be called from a __global__ function.
In the attached example, I tried to create, for each task, a counter (to be used in STARPU_VECTOR_SET_NX) that the CUDA kernel can update. These counters are stored in a vector whose length is equal to the number of tasks.
However, it looks like I cannot access the content of this vector from the function that calls the kernel. I guess this is because STARPU_VECTOR_GET_PTR yields pointers into GPU memory, not main memory.
Is there any other way to pass results from the CUDA kernel to the calling function? Maybe a two-step method: first, get the counters from GPU memory, then use these counters to set the number of values to move from GPU memory to main memory?
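To make the two-step idea concrete, here is a host-only C model of it. It uses neither CUDA nor StarPU: struct task_model, model_kernel and model_codelet are hypothetical names, a plain memcpy stands in for the device-to-host copy, and assigning nx stands in for the STARPU_VECTOR_SET_NX call. Whether this is the right approach in StarPU is an assumption, not something this thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one task's buffers; "dev_*" plays the role of GPU memory. */
struct task_model {
    int *dev_data;     /* models the vector block held in GPU memory */
    size_t dev_count;  /* models the per-task counter written by the kernel */
    size_t nx;         /* models the element count that STARPU_VECTOR_SET_NX would change */
};

/* Models the kernel: update odd elements, then record how many leading elements to keep. */
static void model_kernel(struct task_model *t, size_t keep)
{
    for (size_t i = 1; i < t->nx; i += 2)
        t->dev_data[i] += 100;
    t->dev_count = keep < t->nx ? keep : t->nx;  /* never keep more than exists */
}

/* Models the function that calls the kernel. */
static size_t model_codelet(struct task_model *t, size_t keep)
{
    size_t host_count = 0;

    assert(t != NULL && t->dev_data != NULL);
    model_kernel(t, keep);

    /* Step 1: fetch the counter into a host variable. In a real codelet this
     * would be a device-to-host copy (assumption), not a memcpy. */
    memcpy(&host_count, &t->dev_count, sizeof host_count);

    /* Step 2: shrink the vector from the host side. In StarPU this is where
     * STARPU_VECTOR_SET_NX would be called (assumption). */
    t->nx = host_count;
    return t->nx;
}
```

In this model, the counter only becomes readable by the calling function after the explicit copy in step 1, which mirrors the problem described above.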
Thank you very much for your time.
Best regards,
Johan
On Fri, 20 Dec 2019 at 19:41, Johan Mazel <johan.mazel@gmail.com> wrote:
Ok, I will write a minimal example and send it on this thread.
Best regards,
Johan
On Fri, 20 Dec 2019 at 19:37, Samuel Thibault <samuel.thibault@inria.fr> wrote:
Johan Mazel, on Fri, 20 Dec 2019 19:34:44 +0100, wrote:
> But my CUDA kernel has several errors (some are related to error message display
> on stderr). Are there other macros for CUDA? Is that not possible for CUDA?
The CUDA case should be working just the same.
Samuel
Attachment:
vector_setnxy_test.zip
Description: Zip archive
- Re: [Starpu-devel] Implementation question regarding output type, Johan Mazel, 16/01/2020