> This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier large language models (LLMs) to perform research independently and communicate their findings. We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation.

> The AI Scientist can produce papers that exceed the acceptance threshold at a top machine learning conference as judged by our automated reviewer.
This is a tired trope that should have died in the early 2010s.
Those who don't like their direction should just switch to a different OS.
Cloud integration brings features that customers want. Customers still can't be trusted to back up their own data. Customers don't want to upgrade their OS because they don't fully value the huge number of advancements that come with major revisions.
I got something like 7-10 licenses for Windows 7 via a student program when I was in undergrad. Literally haven't paid for a single Windows license in over a decade because the upgrades are free and, in many cases, transferable to new hardware.
This is not the outcome one would expect if Microsoft were anti-user. Think about it.
> I got something like 7-10 licenses for Windows 7 via a student program when I was in undergrad. Literally haven't paid for a single Windows license in over a decade because the upgrades are free and, in many cases, transferable to new hardware.
Not anymore. Now that you will have to pay for the OS, maybe you will think about your consumer rights more. It wasn't a charity.
This is a fair criticism. However, I don't think it matters much to the overall argument he makes.
Why?
We don't need AGI to see massive geopolitical disruption. We are already seeing this. The US has put up a GPU wall around China. That is evidence enough.
The capability we have already is enough to greatly upset the balance of power in a variety of spheres... Both sides lag in implementation. The capability is already there. Capability will grow. AGI is not relevant to the concern for security. We're past that point already. The labs are a key national security asset. They're just privately owned/operated... For now. This is the way of things. History shows this.
While a year ago China's willingness to come to the table would have been surprising to me... Now... Less surprising.
The US looks poised for some very serious capability growth over the next few years. Limits make a lot of sense for the Chinese in this environment. Some of these advancements could greatly erode what strengths they have managed to consolidate.
Used these "toys" to write a working machine vision project over the last 2 days.
Key word: working
The bubble is real on both sides. Models have limitations... However, they are not toys. They are powerful tools. I used 3 different SotA models for that project. The time saved is hard to even measure. It's big.
My daughter berated me for using AI (the sentiment among youth is pretty negative, and it is easy to understand why), but I simply responded, "if I don't, my peers still will, and then we'll be living on the street." And it's true: I've 10x'd my real productivity as a scientist (for example, using LLMs to help me code one-off scripts for data munging, automating our new preprocessing pipelines, and quickly generating bullet points for slides).
The trick, though, is learning how to prompt, and developing the sense that the LLM is stuck with the current prompt and needs another perspective. Funnily enough, the least luck I've had is in getting the LLM to write precisely enough for science (yay, I still have a job); even setting aside the confabulation, the nuance is lacking enough that it's almost always faster for me to write it myself.
> My daughter berated me for using AI (the sentiment among youth is pretty negative, and it is easy to understand why)
I can’t relate. I'm currently in university, and everyone is thanking God that ChatGPT exists. I'd think it must be a joke, or your daughter somehow managed to live in a social circle that hasn't yet adopted chatbots for school purposes.
I just don't understand why AI is so polarising on a technology website.
OpenAI has even added a feature to make completions from GPT near-deterministic (by specifying a seed). It seems that no matter what AI companies do, there will be a vocal minority shouting that it's worthless.
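For the curious, here's a minimal sketch of what that looks like with the OpenAI Python SDK (the model name and prompt are my own placeholders, not anything from this thread):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Fixing the seed (and zeroing the temperature) makes repeated calls
    # return near-identical completions.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; recent chat models accept seed
        messages=[{"role": "user", "content": "Name three sorting algorithms."}],
        seed=42,
        temperature=0,
    )
    print(response.choices[0].message.content)

    # Determinism is only "near": if the backend configuration changes,
    # system_fingerprint changes and outputs may differ across calls.
    print(response.system_fingerprint)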
Without details that's a meaningless stat. I remember some PyTorch machine vision tutorials promising they'd take only about an hour, training included, and still give you a working project at the end.
Perhaps someone knowledgeable could add context as to why they wouldn't include the EU. I know they've passed more regulation, and earlier... But in technical/legal terms...
Not enough time to ensure compliance?
Capability or privacy issues that exceed the regulatory framework?
I haven't been following the regulatory side too closely as of late.
A 1-million-token context window. Multimodal inputs. This must be costing Google a fortune to offer free inference with a window of this size. ~200M USD in training costs alone, by recent estimates from Stanford (if memory serves).
I'd recommend throwing some thick documentation at it. Images must be uploaded separately. If you use the full window, expect lengthy inference times. I've been highly impressed so far. It greatly expands capability for my daily use cases. They say they've stretched it to 10M tokens in research.
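A rough sketch of the "throw documentation at it" workflow, assuming the google.generativeai Python package (the file name and prompt are placeholders):

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder key
    model = genai.GenerativeModel("gemini-1.5-pro")  # long-context model

    # 1M tokens is roughly a few MB of plain text, so whole manuals fit.
    with open("big_manual.txt") as f:
        manual = f.read()

    # Text and images are passed as separate parts of one request.
    response = model.generate_content(
        [manual, "List every configuration flag this manual documents."]
    )
    print(response.text)

With a near-full window, expect the generate_content call itself to be where the lengthy inference time shows up.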
I would agree that the choice of the phrase 'hidden reasoning' is a poor one.
This paper demonstrates a novel training approach that could yield narrow capability growth on a certain class of tasks.
The narrow, test-tube environment in which we see better performance hints at the unknown, which, when better understood, could promise further yields down the road.
To my mind, the idea that filler tokens might promote immergent capability leading to broader task complexity is more promising than the backdoor risk you lay out. The possible scale in each direction just doesn't seem comparable to me (assuming each scenario plays out in a meaningful way).
Re the article...
A single fundamental breakthrough could make his entire article obsolete within a month. We've found a lot of limits to LLMs, sure... but this is always how it goes in the history of AI, right? The pace of fundamental breakthroughs seems like the more relevant conversation with respect to the prospects for AGI as framed by his article.
The paper also proves that this capability, one unlikely to occur naturally, does not help for tasks where one must create sequentially dependent chains of reasoning, a limiting constraint. At least not without overturning what we believe about TCS.
> A single fundamental breakthrough
Then we'd no longer be talking about transformers. That something unpredicted could happen is trivially true.
> immergent capability
It's specifically trained in, requires heavy supervision, and is hard to learn. It's surprising that Transformers can achieve this at all, but it's not emergent.
You are taking literally 2-4-token phrases from my comment and attacking them without context. I'll spend time on the latter quote, 'immergent capability'.
A) appreciate you correcting my spelling
B) 'The narrow, test-tube environment in which we see better performance hints at the unknown, which, when better understood, could promise further yields down the road.
To my mind, the idea that filler tokens might promote immergent capability leading to broader task complexity'
C) Now that we have actual context... I'll leave the rest to the thoughtful reader. I said the following key words: 'hints', 'could', 'might'.
D) Who asserted this behavior was emergent?
Recommend slowing down next time. You might get a clearer picture before you attack a straw man. Expect no further exchange. Best of luck.