
Exfiltration is always going to be possible. The question is whether the defenses I've put in place make it difficult enough that an attacker won't succeed. The problem is, I really want to share and help protect others, but if I write it up somewhere anybody can read, it's gonna end up in the training data.


The attacker being an LLM, which means all humans have to be careful about what they say publicly online, is a fun vector.



