Hacker News
almostgotcaught | 3 months ago | on: Prefix sum: 20 GB/s (2.6x baseline)
Lol, do you think "PTX programming" is some kind of trick path to perf? It's just inline asm. Sometimes it's necessary, but most of the time "CUDA is all you need":
https://github.com/b0nes164/GPUPrefixSums
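To make the "just inline asm" point concrete, here is a rough sketch (not taken from the linked repo; every function name in it is made up for illustration). The only PTX is a single `asm` statement that reads the lane index. The warp-level inclusive scan itself uses the ordinary `__shfl_up_sync` intrinsic. It assumes a 1-D block of exactly 32 threads and skips error checking.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: lane index read through inline PTX. The asm()
// statement sits inside ordinary CUDA C++ source, nothing more exotic.
__device__ unsigned lane_id_ptx() {
    unsigned id;
    asm volatile("mov.u32 %0, %%laneid;" : "=r"(id));
    return id;
}

// Inclusive prefix sum across a single warp, written with plain CUDA
// intrinsics. For a 1-D block, lane_id_ptx() equals threadIdx.x % 32.
__global__ void warp_inclusive_scan(const int* in, int* out) {
    const unsigned lane = lane_id_ptx();
    int v = in[threadIdx.x];
    for (int offset = 1; offset < 32; offset <<= 1) {
        // Every lane takes part in the shuffle; only lanes that have a
        // neighbour `offset` positions below them add its value.
        int up = __shfl_up_sync(0xffffffffu, v, offset);
        if (lane >= offset) v += up;
    }
    out[threadIdx.x] = v;
}

// Hypothetical host wrapper: scans 32 ones on the GPU and returns the result.
// Error checking is omitted to keep the sketch short.
std::vector<int> scan_32_ones() {
    std::vector<int> host(32, 1);
    int* d_in = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, 32 * sizeof(int));
    cudaMalloc(&d_out, 32 * sizeof(int));
    cudaMemcpy(d_in, host.data(), 32 * sizeof(int), cudaMemcpyHostToDevice);
    warp_inclusive_scan<<<1, 32>>>(d_in, d_out);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(host.data(), d_out, 32 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return host;
}
```

Calling `scan_32_ones()` from a host `main` should give the running totals 1 through 32. Swapping `lane_id_ptx()` for `threadIdx.x % 32` leaves the result unchanged here, which is the point: the inline PTX buys nothing in this case.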