Some reported benchmarks do seem to be rather dubious, but the 70B model itself is quite real. Sample output:
$ ollama run reflection:70b-q4_0
>>> hello
<thinking>
To respond to "hello", I'll need to consider several factors:
1. The user's intent: They're likely saying hello as a greeting.
2. Appropriate responses: Common ways to respond to "hello" are:
- Hello back
- Hi
- Hey
3. Formality level: Since this is an AI response, I'll aim for a friendly but professional tone.
<reflection>
The approach of responding with a standard greeting seems appropriate in this context. It acknowledges the user's hello and provides a polite response.
</reflection>
Given these considerations, the most suitable response would be to echo "hello" back to the user.
</thinking>
<output>
Hello!
</output>
There was a Hacker News post a few days ago pointing to a Reddit thread where people showed that the founder(s) of Reflection AI were faking their hosted model by passing the input to Claude (Sonnet 3.5) and stripping the word "Claude" from the output, among other things. Then, when they got caught, they switched it to GPT-4o.
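For what it's worth, the alleged trick is trivially easy to implement and just as easy to catch. A hypothetical sketch (the real proxy code was never published; the function name and probe prompt here are purely illustrative):

```python
# Hypothetical sketch of the alleged proxy trick: forward the prompt to
# another model's API, then delete the word "Claude" from the reply.
def strip_model_name(reply: str) -> str:
    # Naive removal of every occurrence of "Claude" from the output.
    return reply.replace("Claude", "")

# Why it's detectable: ask for a reply that MUST contain "Claude".
# If an upstream model honestly answers 'Claude', the filter erases it.
upstream_reply = "Claude"            # e.g. to: 'Say "Claude" and nothing else.'
print(repr(strip_model_name(upstream_reply)))   # -> ''  (suspiciously empty)

# The word also vanishes mid-sentence, leaving telltale gaps.
print(strip_model_name("I am Claude, an AI made by Anthropic."))
```

That second probe is roughly how people caught it: the filtered output had conspicuous holes exactly where the upstream model would have named itself.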
After this, I will be very skeptical of anything they claim to achieve.