Hacker News
maciejgryka on May 15, 2024 | on: PaliGemma: Open-Source Multimodal Model by Google
I haven't tried this yet, but I'm excited to see how it can do segmentation by outputting a series of coordinates! That's something I just assumed transformers would generally be bad at.
yeldarb on May 15, 2024
How it does this is really cool. It’s got a VAE decoder. Reminds me a lot of how SAM works.
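For anyone curious what "outputting coordinates" might look like on the receiving end, here is a rough sketch of parsing such output. It assumes a token format that isn't spelled out in this thread: four `<locNNNN>` tokens giving a box as y_min, x_min, y_max, x_max in 1024 bins, followed by sixteen `<segNNN>` codebook indices. The function name `parse_detection` and the sample string are made up for illustration. Turning the codes into an actual mask is the VAE decoder's job and isn't shown here.

```python
import re

# Assumed token format (not stated in the thread): four <locNNNN> tokens
# encode a box as y_min, x_min, y_max, x_max in 1024 bins, followed by
# sixteen <segNNN> tokens that index a learned codebook.
LOC_RE = re.compile(r"<loc(\d{4})>")
SEG_RE = re.compile(r"<seg(\d{3})>")


def parse_detection(text, image_width, image_height):
    """Split one model output string into a pixel box and mask code indices."""
    locs = [int(v) for v in LOC_RE.findall(text)]
    if len(locs) < 4:
        raise ValueError("expected at least four <locNNNN> tokens")
    # Normalise the first four bins to [0, 1), in the assumed y/x order.
    y_min, x_min, y_max, x_max = (v / 1024 for v in locs[:4])
    # Return the box as (x_min, y_min, x_max, y_max) in pixels.
    box = (
        x_min * image_width,
        y_min * image_height,
        x_max * image_width,
        y_max * image_height,
    )
    # The codes would be fed to the mask decoder; here they are just collected.
    seg_codes = [int(v) for v in SEG_RE.findall(text)]
    return box, seg_codes


# Hypothetical output string, made up for illustration.
sample = "<loc0256><loc0128><loc0768><loc0512>" + "<seg005>" * 16 + " cat"
box, seg_codes = parse_detection(sample, image_width=640, image_height=480)
```

The point is that the transformer only has to emit a short, fixed-length token sequence per object; the per-pixel work happens in the decoder afterwards.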