Hacker News

How many times does this need to be repeated?

It works in this instance. On this run. It is not guaranteed to work next time. There is an error percentage here that makes it _INEVITABLE_ that eventually, with enough executions, the validation will pass when it should fail.

It will choose not to pass this to the validator, at some point in the future. It will create its own validator, at some point in the future. It will simply pretend like it did any of the above, at some point in the future.

This might be fine for your B2B use case. It is not fine for underlying infrastructure for a financial firm or communications.
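To put a number on the "inevitable with enough executions" point, here is a minimal sketch. It assumes each run fails independently with the same per-run probability p, which is a simplification, and the 1-in-10,000 rate is a made-up figure for illustration, not a measured one.

```python
def p_at_least_one_failure(p: float, n: int) -> float:
    """Chance that at least one of n independent runs fails,
    given a per-run failure probability p (assumed constant)."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-run failure rate of 1 in 10,000.
for n in (100, 10_000, 1_000_000):
    print(n, round(p_at_least_one_failure(1e-4, n), 4))
```

Under those assumptions, 10,000 runs already give roughly a 63% chance of at least one failure, and the figure keeps climbing toward 1 as n grows.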



Every time the LLM uses this tool, the response schema is validated--deterministically. The LLM will never see a non-integer value as output from the tool.
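A rough sketch of what "validated deterministically" means here, assuming the check runs in ordinary client code between the tool and the model. The function names are hypothetical and this is not taken from any particular SDK.

```python
def validate_integer_result(result):
    """Plain code, not the model, decides whether the result is acceptable."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(result, bool) or not isinstance(result, int):
        raise ValueError(f"tool result {result!r} is not an integer")
    return result


def call_tool_checked(tool, arguments):
    """Run the tool, then validate its output before anything is
    handed back to the model. A bad value raises instead of passing through."""
    raw = tool(**arguments)
    return validate_integer_result(raw)


# Hypothetical tool: adds two numbers.
print(call_tool_checked(lambda a, b: a + b, {"a": 1, "b": 2}))
```

The model never gets a say in whether the check runs, because the check is not something the model calls. It is part of the code path that delivers the result.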


Can you please diagram out, using little text arrows ("->"), what you think is happening so I can just fill in the gap for you?


I write these as part of my job, I know how they work. I'm not going to spend more time explaining to you (and demonstrating!) what is in the spec. Read the spec and let the authors know that they don't understand what they wrote. I've run out of energy in this conversation.


I gave you the chance to be explicit about your mental model of these systems and you ran away with some very unoriginal grandstanding.


llm tool call -> mcp client validates the schema -> mcp client calls the tool -> mcp server validates the schema -> mcp server responds with the result -> mcp client passes the tool result into llm
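The arrow diagram above can be sketched in code. This is a toy model under stated assumptions: the schema format, the function names and the add tool are all hypothetical, and a real client and server would talk JSON over a transport rather than call each other directly.

```python
# Hypothetical input schema: field name -> required Python type.
SCHEMA = {"a": int, "b": int}


def validate_args(args, schema):
    """Return a list of error strings; an empty list means the args are valid."""
    errors = []
    for name, typ in schema.items():
        if name not in args:
            errors.append(f"missing field '{name}'")
        # bool is a subclass of int, so reject it for the int fields used here.
        elif isinstance(args[name], bool) or not isinstance(args[name], typ):
            errors.append(f"field '{name}' must be {typ.__name__}")
    return errors


def server_handle(args):
    """mcp server validates the schema -> mcp server responds with the result."""
    errors = validate_args(args, SCHEMA)
    if errors:
        return {"ok": False, "error": "; ".join(errors)}
    return {"ok": True, "result": args["a"] + args["b"]}


def client_call(args):
    """mcp client validates the schema -> mcp client calls the tool."""
    errors = validate_args(args, SCHEMA)
    if errors:
        # Rejected on the client side; the server is never contacted.
        return {"ok": False, "error": "; ".join(errors)}
    return server_handle(args)


# Whatever comes back is what gets passed into the llm as the tool result.
print(client_call({"a": 2, "b": 3}))
print(client_call({"a": "2", "b": 3}))
```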


not a developer.

what happens if this schema validation fails here - what will the mcp server respond with and what will the llm do next (in a deterministic sense)?

llm tool call -> mcp client validates the schema -> mcp client calls the tool -> mcp server validates the schema


They often do fail. At the client level you can just feed the schema validation error message back into the LLM, and it corrects itself most of the time. If not, the LLM throws itself into a loop until its caller times it out and sends an error message back to the user.

At the server level it's just a good old JSON API at this point, and the server would send the usual error message it would send out to anyone.
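For the client-side half of that, here is a minimal sketch of the retry loop. The model is a stub function, the attempt cap stands in for the caller's timeout, and every name is hypothetical. What the model sends on each attempt is not deterministic; the validation and the cap are.

```python
def validate(args):
    """Deterministic check: 'n' must be an integer (bool excluded)."""
    n = args.get("n")
    if isinstance(n, bool) or not isinstance(n, int):
        return ["field 'n' must be an integer"]
    return []


def run_tool_loop(model, validate, max_attempts=3):
    """Ask the model for arguments, validate them, and feed any
    validation error back in until it succeeds or the cap is hit."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        args = model(feedback)
        errors = validate(args)
        if not errors:
            return {"ok": True, "args": args, "attempts": attempt}
        feedback = "schema validation failed: " + "; ".join(errors)
    # The caller gives up and reports the last error to the user.
    return {"ok": False, "error": feedback, "attempts": max_attempts}


def stub_model(feedback):
    """Stand-in for an LLM: sends a string first, corrects itself after feedback."""
    return {"n": "7"} if feedback is None else {"n": 7}


print(run_tool_loop(stub_model, validate))
```

Invalid arguments never get through in either branch: the loop ends with valid arguments or with an error for the user.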



