Hacker News | past | comments | ask | show | jobs | submit | schmuhblaster's comments

Hi, I stumbled on this article in my Twitter feed and posted it because I found it to be very practical, despite the somewhat misleading title (and I also don't like encoding agent logic in .md files). For my side project I am experimenting with describing agents and agentic workflows in a Prolog-based DSL [1].

[1] https://www.deepclause.ai


Sounds interesting; could you elaborate a bit on this? (I am experimenting in a similar direction.)


This looks like a very pragmatic solution, in line with what seems to be going on in the real world [1], where reliability is one of the biggest issues with agentic systems right now. I've been experimenting with a different approach to increase the amount of determinism in such systems: https://github.com/deepclause/deepclause-desktop. It's based on encoding the entire agent behavior in a small, concise DSL built on top of Prolog. While it's not as flexible as a fully fledged agent, it leads to much more reproducible behavior and more graceful handling of edge cases.

[1] https://arxiv.org/abs/2512.04123


> But my bet is that the proposed program-of-thought is too specific

This is my impression as well, having worked with this type of stuff for the past two years. It works great for very well-defined use cases, and as long as user queries do not stray too far from what you optimized your framework/system prompt/agent for. Once you move beyond that, however, it quickly breaks down.

Nevertheless, as this problem has been bugging me for a while, I still haven't given up (although I probably should ;-). My latest attempt is a Prolog-based DSL (http://github.com/deepclause/deepclause.ai) that allows part of the logic to be handled by LLMs again, so that it retains some of the features of pure LLM-based systems. As a side effect, this gives additional features such as graceful failures, auditability, and increased (but not full) reproducibility.


That repo is not public.


I've been experimenting with giving the LLM a Prolog-based DSL, used in a CodeAct style pattern similar to Huggingface's smolagents. The DSL can be used to orchestrate several tools (MCP or built in) and LLM prompts. It's still very experimental, but a lot of fun to work with. See here: https://github.com/deepclause/deepclause-desktop.


My own attempt at "chain-of-code with a Prolog DSL": https://news.ycombinator.com/item?id=45937480. As in CodeAct, the idea is to turn natural language task descriptions into small programs. Some program steps are executed directly, some are handed over to an LLM. I haven't run any benchmarks yet, but there should be some classes of tasks where such an approach is more reliable than a "traditional" LLM/tool-calling loop.

Prolog seemed like a natural choice for this (at least to me :-), since it's a relatively simple language that makes it easy to build meta-interpreters and allows for fairly concise task/workflow representations.
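To make the "some steps executed directly, some handed to an LLM" idea concrete, here is a minimal Python sketch. All names here are made up for illustration, and the LLM call is a canned stand-in; this is not the actual DeepClause runtime.

```python
# Hypothetical sketch: a task compiled into a list of steps, where each
# step is either executed directly as code or handed off to an LLM.

def fetch_issues(state):
    # Deterministic step: a real system would call an actual API here.
    state["issues"] = ["bug: crash on startup", "feature: dark mode"]
    return state

def llm(prompt, state):
    # Stand-in for a real LLM call; returns a canned summary.
    return f"summary of {len(state['issues'])} issues"

PROGRAM = [
    ("exec", fetch_issues),                    # runs as plain code
    ("llm", "Summarize the fetched issues."),  # handed to the model
]

def run(program):
    state = {}
    for kind, step in program:
        if kind == "exec":
            state = step(state)
        else:
            state["last_llm_output"] = llm(step, state)
    return state

result = run(PROGRAM)
print(result["last_llm_output"])  # → "summary of 2 issues"
```

The point of the split is that the deterministic steps are fully reproducible, while only the genuinely "vague" steps pay the reliability cost of a model call.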


Nice, I do like the direction. A Prolog dialect does seem like a natural choice if we must pick only one kind of intermediate representation, but ideally there could be multiple. For example, I saw your "legal reasoning" example... did you know about https://catala-lang.org/ ? I think I'd like to see an LLM experiment that only outputs formal specifications but still supports multiple targets (say Prolog, Z3, Storm, PRISM, Alloy, and what have you). Once you can output these things you can use them in chain-of-code.

Anyway, the basic point being: it is no wonder LLM reasoning abilities suck when we have no decent intermediate representation for "thinking" in terms of set/probability primitives. And it is no wonder LLMs suck at larger code-gen tasks when we have no decent intermediate representation for "thinking" in terms of abstract specifications. The obsession with natural-language inputs/intermediates has been a surprise to me. LLMs are compilers, and we need to walk with various spec -> spec compilers first so that we can run with spec -> code compilers.


Thank you, https://catala-lang.org/ looks very interesting. I've experimented a lot with LLMs producing formal representations of facts and rules. What I've observed is that the resulting systems usually lose a lot of the generalization capabilities offered by the current generation of LLMs (fine-tuning may help here, but is often impractical due to missing training data). Together with the usual closed-world assumption in e.g. Prolog, this leads to imho overly restrictive applications. So the approach I am taking is to allow the LLM to generate Prolog code that may contain predicates which are themselves interpreted by an LLM.

So one could e.g. have

    is_a(dog, animal).

    is_a(Item, Category) :-
        @("This predicate should be true if 'Item' is in the category 'Category'").

In this example, evaluation of the is_a predicate first tries the first clause and, if that fails, falls back to the second clause, which calls into the LLM. That way the system as a whole does not necessarily fail if the formal knowledge representation is incomplete.
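A toy Python sketch of that fallback behavior (the fact base, the oracle heuristic, and all names are illustrative stand-ins, not the actual DSL semantics):

```python
# Try the formal knowledge base first; only if no fact matches,
# ask an LLM oracle.

FACTS = {("dog", "animal")}  # formal knowledge: is_a(dog, animal).

def llm_oracle(item, category):
    # Stand-in for @("This predicate should be true if ...").
    # A real system would prompt a model; here a toy lookup.
    return (item, category) in {("cat", "animal"), ("oak", "plant")}

def is_a(item, category):
    if (item, category) in FACTS:      # first clause: formal lookup
        return True
    return llm_oracle(item, category)  # second clause: LLM fallback

print(is_a("dog", "animal"))  # True via the fact base
print(is_a("cat", "animal"))  # True via the LLM fallback
print(is_a("cat", "plant"))   # False: neither branch succeeds
```

Note that the fallback only fires when the formal branch fails, so adding facts over time monotonically shrinks the set of queries that depend on the model.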

I've also been thinking about the Spec->Spec compilation use case. So the original Spec could be turned into something like:

    spec :- setup_env, create_scaffold, add_datamodel, ...

I am honestly not sure where such an approach might ultimately be most valuable. "Anything-tools" like LLMs make it surprisingly hard to focus on an individual use case.


This is my own recent attempt at this:

https://news.ycombinator.com/item?id=45937480

The core idea of DeepClause is to use a custom Prolog-based DSL together with a meta-interpreter, implemented in Prolog, that keeps track of execution state and implicitly manages conversational memory for an LLM. The DSL itself comes with special predicates that are interpreted by an LLM, so the "vague" parts of the reasoning chain can be handed off to a (reasonably) advanced model.
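For illustration, a stripped-down Python sketch of such a meta-interpreter loop, with an LLM stand-in and made-up rule and goal names (the real implementation is in Prolog and works differently):

```python
# Depth-first execution of a goal list; "@"-prefixed goals are routed
# to an LLM stand-in, everything else is expanded via its rule body.

def llm(prompt, memory):
    # Stand-in for a model call that also sees conversational memory.
    return f"LLM({prompt})"

def solve(goals, rules, memory):
    trace = []
    for goal in goals:
        if goal.startswith("@"):           # vague goal -> hand to LLM
            memory.append(llm(goal[1:], memory))
            trace.append(("llm", goal))
        elif goal in rules:                # defined goal -> expand body
            trace.extend(solve(rules[goal], rules, memory))
        else:
            raise ValueError(f"unknown goal: {goal}")
    return trace

rules = {"spec": ["setup_env", "@draft a data model"],
         "setup_env": []}                  # empty body: trivially succeeds
trace = solve(["spec"], rules, memory := [])
```

Because the interpreter owns the goal stack, the trace and the memory list come for free, which is where the auditability and partial reproducibility come from.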

Would love to collect some feedback and interesting ideas for possible applications.

