Totally agree, context-aware misuse is a big gap, and one we’re actively exploring. We’ve built session-level risk tracking and some early logic to detect drift over time, but it’s definitely still evolving.
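For anyone curious, the rough shape of the session-level tracking is something like this. A minimal sketch only, not our actual code: the EWMA smoothing, thresholds, and class name are illustrative assumptions, and the per-turn risk score is presumed to come from an upstream classifier.

```python
class SessionRiskTracker:
    """Tracks a smoothed risk score across a session and flags turns
    where risk drifts upward over time."""

    def __init__(self, alpha: float = 0.3,
                 drift_threshold: float = 0.15,
                 block_threshold: float = 0.8):
        self.alpha = alpha                    # EWMA smoothing factor
        self.drift_threshold = drift_threshold
        self.block_threshold = block_threshold
        self.ewma = 0.0                       # smoothed session-level risk

    def update(self, turn_risk: float) -> str:
        """Feed the per-turn risk score (0..1); return an action."""
        prev = self.ewma
        self.ewma = self.alpha * turn_risk + (1 - self.alpha) * self.ewma
        if self.ewma >= self.block_threshold:
            return "block"   # overall session risk too high
        if self.ewma - prev >= self.drift_threshold:
            return "flag"    # risk climbing noticeably turn over turn
        return "allow"
```

The idea is that no single turn looks malicious, but the smoothed score catches a conversation that keeps edging toward misuse.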
LLM security isn’t a one-and-done, it’s an ongoing process, especially as attack patterns keep getting more subtle.
If you’ve seen other use cases or edge cases worth considering, we’d love to hear them. And feel free to ask more questions, really appreciate your input!
Thanks for checking this out, happy to answer any questions or hear your feedback. Also curious what others here are using today to secure GenAI deployments, especially anything for prompt validation, prompt injection, or hallucination detection?
This looks solid, being able to search across diagrams and videos is a big win. Curious how well it performs with noisy scanned PDFs or annotated images.
Thank you!! It works particularly well with those. We use ColPali-style embeddings for our visual doc search. As a result, we're not limited by parsing quality the same way typical RAG systems are. Here's a link to a blog I wrote on ColPali, for reference: https://docs.morphik.ai/concepts/colpali
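If it helps, the core of ColPali-style late-interaction scoring (MaxSim) looks roughly like this. A toy sketch assuming you already have per-token query embeddings and per-patch page embeddings from a vision-language model; the function names are illustrative, not Morphik's actual API.

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """query_emb: (num_query_tokens, dim), page_emb: (num_patches, dim).
    For each query token, take its best-matching page patch, then sum."""
    # Normalize so the dot product is cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = page_emb / np.linalg.norm(page_emb, axis=1, keepdims=True)
    sims = q @ p.T                         # (num_query_tokens, num_patches)
    return float(sims.max(axis=1).sum())   # MaxSim: best patch per query token

def rank_pages(query_emb: np.ndarray, pages: list[np.ndarray]) -> list[int]:
    """Rank page indices by late-interaction relevance to the query."""
    return sorted(range(len(pages)),
                  key=lambda i: maxsim_score(query_emb, pages[i]),
                  reverse=True)
```

Because retrieval works directly over image patches, there's no OCR or layout parsing step to go wrong, which is why noisy scans and annotated images hold up well.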
This looks super useful, especially with so many folks experimenting with local LLMs now. Curious how well it handles edge devices. Will give it a try!
This is absolutely a real problem, especially in enterprise GenAI rollouts where hallucinations and data leakage risks are non-negotiable.
We’ve run into scenarios where LLMs exposed internal data just through cleverly crafted prompts. Your ability to inspect and enforce policies at both prompt and response level is spot on.
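To make that concrete, the pattern I have in mind looks roughly like this. A hypothetical sketch only: the regex checks and function names are placeholders, not your product's interface.

```python
import re

# Illustrative deny-list; a real policy engine would be far richer.
DENY_PATTERNS = [r"(?i)ignore (all )?previous instructions",
                 r"(?i)api[_-]?key\s*[:=]"]

def check_prompt(prompt: str) -> bool:
    """Block prompts that match known injection patterns."""
    return not any(re.search(p, prompt) for p in DENY_PATTERNS)

def check_response(response: str) -> str:
    """Redact anything that looks like an internal secret before it leaves."""
    return re.sub(r"(?i)(api[_-]?key\s*[:=]\s*)\S+", r"\1[REDACTED]", response)

def guarded_call(llm, prompt: str) -> str:
    """Enforce policy on the way in (prompt) and on the way out (response)."""
    if not check_prompt(prompt):
        return "Request blocked by prompt policy."
    return check_response(llm(prompt))
```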
If I were in your shoes, I’d seriously consider open-sourcing the data plane, especially since your control plane is where monetization lies. It builds trust, invites contributions, and positions you as a default in this emerging category.
And no, you're not early, you're exactly on time. Most companies are just realizing how much risk they’ve shipped into production.
Thank you, this really helps. Totally agree—hallucinations and leakage are scary, especially when prompts can be engineered to expose things you didn’t think were vulnerable.
We’ve been leaning toward open-sourcing the data plane for exactly the reasons you mentioned: trust, adoption, and building a community around the core tech. But I’ll be honest—there’s still that fear in the back of my mind: what if someone forks it, strips out the branding, and rehosts it? Or if buyers say “well, it’s open source, why should we pay anything?”
Did you or your team ever wrestle with that? Or have you seen OSS models work well in this space where the control plane still delivers enough value to justify a paid tier?
This feels like a great blend of immersion and repetition. Curious if you’re doing any difficulty adaptation based on content complexity or vocabulary frequency?
For now it's completely local, since there's no account registration yet, but once I introduce accounts then yes, we'll add a sync mechanism for everyone. BTW, the Chrome extension is on the way, just waiting to be listed by the Chrome Web Store team.
Really cool project, the privacy-first angle and self-hosted design are a huge plus. Curious: have you run into any rate limits or session issues with the whatsmeow API, especially when used continuously by agents?
No, I haven't had the chance to benchmark against others yet. When I ran the benchmarks in the library, I could process 1 million dependency pairs in tens of milliseconds on my 5-year-old laptop.