
If you come up with a new system, you're going to want to integrate AI into it, presuming AI gets a bit better.

If AI can only learn the system after people have used it for a year, then your system will just get ignored: it lacks AI, and hence it will never get enough training data to gain AI integration.

Learning needs to get faster. Otherwise, we will be stuck with the tools that already exist. New tools won't just need to be possible to train humans on, but also to train AIs on.

Edit: a great example here is the Tamarin protocol prover. It would be great, and feasible, to get AI assistance to write these proofs. But there aren't enough proofs out there to train on.



That seems to already be happening with o1 and Orion.

Instead of rewarding the network directly for finding a correct answer, reasoning chains that end up with the correct answer are fed back into the training set.

That way you're training it to develop reasoning processes that end up with correct answers.

And for math problems, you're training it to find ways of generating "proofs" that happen to produce the right result.
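
Concretely, the loop being described is something like the sketch below (in the spirit of STaR-style / rejection-sampling bootstrapping). This is purely illustrative, not how o1 actually works internally: sample_reasoning_chain stands in for a real model call, and the problems are toy ones.

    # Self-training sketch: sample chains, keep the ones that reach the known
    # answer, and feed those back in as new training examples.
    import random

    def sample_reasoning_chain(problem: str) -> tuple[str, str]:
        """Stand-in for sampling a chain-of-thought plus final answer from a model."""
        answer = str(random.randint(0, 10))  # pretend the model produced this
        chain = f"Let's think step by step... so the answer is {answer}."
        return chain, answer

    def build_training_set(problems: dict[str, str], samples_per_problem: int = 8):
        """Keep only the chains whose final answer matches the known correct one."""
        kept = []
        for problem, correct_answer in problems.items():
            for _ in range(samples_per_problem):
                chain, answer = sample_reasoning_chain(problem)
                if answer == correct_answer:  # reward signal: outcome only
                    kept.append({"prompt": problem, "completion": chain})
        return kept  # fed back into the next round of fine-tuning

    data = build_training_set({"What is 2 + 3?": "5", "What is 4 * 2?": "8"})
    print(f"{len(data)} verified chains collected")

Note that the filter only checks the final answer, not each step of the chain, which is exactly why not-quite-rigorous reasoning can slip through.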

While this means it can learn reasoning patterns that are not, strictly speaking, 100% consistent, that's not necessarily a disadvantage: it lets the model find arguments that are "good enough" to produce the correct output, even where a fully watertight proof may be beyond it.

Kind of like how physicists took shortcuts such as the Dirac delta function even before mathematicians could make the underlying math rigorous.

Anyway, by allowing AIs to generate their own proofs, the number of proofs/reasoning chains for all sorts of problems can be massively expanded, and AI may even invent new ways of reasoning that humans aren't aware of. (For instance because they require combining more factors in one logical step than can fit into human working memory.)


If the user manual fits into the context window, existing LLMs can already do an OK-but-not-great job. I hadn't previously heard of Tamarin; a quick Google suggests that's a domain where the standard is theoretically "you need to make zero errors" but in practice is "be better than your opponent, because neither of you is close to perfect"? In either case, have you tried giving the entire manual to the LLM in its context window?
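
For what it's worth, "give the entire manual to the LLM" is just one long prompt. A minimal sketch, assuming the OpenAI Python client; the model name, file path, and prompt text are placeholders:

    # Stuff the whole manual into the system prompt of a long-context model.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("tamarin-manual.txt") as f:  # hypothetical local copy of the manual
        manual = f.read()

    response = client.chat.completions.create(
        model="gpt-4o",  # any long-context chat model
        messages=[
            {"role": "system",
             "content": "You are an assistant for the Tamarin protocol prover. "
                        "Answer using only the manual below.\n\n" + manual},
            {"role": "user",
             "content": "Write a lemma stating secrecy of the session key in my model."},
        ],
    )
    print(response.choices[0].message.content)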

If the new system can be interacted with in a non-destructive manner at low cost and with useful responses, then existing AI can self-generate the training data.
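
As a rough sketch of what "self-generate the training data" could mean in practice: have a model propose inputs, run them against the cheap, side-effect-free system, and log the transcripts. Both helpers below are hypothetical stand-ins.

    # Collect (input, output) transcripts by letting a model poke the new system.
    import json

    def propose_input(seed: int) -> str:
        """Stand-in for an LLM proposing something to try against the tool."""
        return f"command --option {seed}"

    def run_system(command: str) -> str:
        """Stand-in for the real system; must be cheap and non-destructive."""
        return f"ok: processed {command!r}"

    def collect_transcripts(n: int = 1000, path: str = "transcripts.jsonl") -> None:
        with open(path, "w") as out:
            for i in range(n):
                cmd = propose_input(i)
                out.write(json.dumps({"input": cmd, "output": run_system(cmd)}) + "\n")

    collect_transcripts(10)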

If it merely takes a year, businesses will rush to get that training data even if they need to pay humans for a bit. Cars are an example of "real data is expensive or destructive": it's clearly taking a lot more than a year to get there, and there's a lot of investment in just that.

Pay 10,000 people USD 100,000 each for a year, and that billion-dollar investment then gets reduced to about USD 2.4 million/year in ChatGPT Plus subscription fees or whatever. Plenty of investors will take that deal… if you can actually be sure it will work.
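
Back-of-the-envelope version of those numbers (assuming roughly USD 20/month per ChatGPT Plus seat):

    people, salary = 10_000, 100_000
    human_cost = people * salary          # 1,000,000,000 USD per year
    subscription_cost = people * 20 * 12  # 2,400,000 USD per year
    print(human_cost, subscription_cost)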


1. In-context learning is a thing.

2. You might need only several hundred examples for fine-tuning. (OpenAI's minimum is 10 examples.)

3. I don't think research into fine-tuning efficiency has exhausted its possibilities. Fine-tuning is just not a very hot topic, given that general models work so well. In image generation, where it matters, they quickly got to a point where 1-2 examples are enough. So I won't be surprised if doc-to-model becomes a thing; a rough sketch of what such fine-tuning data could look like is below.
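
For points 2 and 3, here is a rough sketch of what a few-hundred-example fine-tune's data could look like, using the chat-style JSONL layout that OpenAI's fine-tuning endpoint accepts (details vary by provider, and the Tamarin-flavoured content is purely illustrative):

    # Write fine-tuning examples as chat-format JSONL.
    import json

    examples = [
        {"messages": [
            {"role": "system", "content": "You are a Tamarin proof assistant."},
            {"role": "user", "content": "State a secrecy lemma for the key k."},
            {"role": "assistant", "content":
             'lemma key_secrecy: "All k #i. Secret(k) @ i ==> not (Ex #j. K(k) @ j)"'},
        ]},
        # ... a few hundred more of these, per point 2 above
    ]

    with open("finetune.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

The upload and training-job calls themselves are provider-specific, so they're omitted here.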



