Yesterday, in ‘part 1’ of ‘AI Stands to Reason’, I described how both OpenAI and Google had updates on the path from AI ‘Reasoning’ to ‘Agents’ and beyond to ‘AGI’ (Artificial General Intelligence), the holy grails of this AI Tech Wave. Today, I’d like to delve into the complexity of just the two near-term priorities, Level 2 (‘Reasoners’) and Level 3 (‘Agents’), and offer a glimpse into the underlying cutting-edge technology innovations. OpenAI reportedly calls its effort here the ‘Strawberry’ project.
As Axios reports in “OpenAI nears ‘reasoning’-capable AI”:
“Reuters reported details of a long-rumored project at OpenAI aimed at developing AI that can reason at a human level, plan ahead, work out problems with multiple steps and independently perform "deep research."”
“The project, named "Strawberry," is the current version of an effort Reuters reported last year under a different name, "Q*."”
I’ve discussed Q* before, as one of at least seven AI products and services in OpenAI’s pipeline this year into next. This of course includes GPT-5, the next iteration of their current state-of-the-art GPT-4o (‘Omni’).
Axios continues:
“OpenAI researchers believe the company is closing in on building AI that can perform human-level "reasoning," per reports out of Bloomberg and Reuters.”
“Why it matters: AI experts disagree over whether today's large language models, which excel at generating text and images, will ever be capable of broadly understanding the world and flexibly adapting to novel information and circumstances.”
“Driving the news: OpenAI has internally shared definitions for five levels of artificial general intelligence (AGI), according to Bloomberg. An OpenAI document Bloomberg reproduced defines the levels:”
1. Chatbots: AI with conversational language
2. Reasoners: human-level problem-solving
3. Agents: systems that can take actions
4. Innovators: AI that can aid in invention
5. Organizations: AI that can do the work of an organization
“State of play: At a company meeting last week, per Bloomberg, OpenAI leaders told staff their systems currently worked at level 1 but were "on the cusp" of achieving level 2.”
“Between the lines: Rumors of a major breakthrough at the company had spread widely in the days leading up to the OpenAI board's failed effort to oust CEO Sam Altman in November 2023.”
“A day before the board announced his firing, Altman told an audience at an Asia-Pacific Economic Cooperation event that during the "last couple of weeks" he'd been "in the room" as the company "push[ed] the veil of ignorance back and the frontier of discovery forward."”
“Yes, but: Eight months later, OpenAI's latest product is a new version of ChatGPT, GPT-4o, that combines text and visual modes in new, advanced ways. Human reasoning remains somewhere on the horizon.”
But it all gets gnarly in a hurry, as we start to see in Reuters’ piece, “OpenAI working on new reasoning technology under code name Strawberry”:
“How Strawberry works is a tightly kept secret even within OpenAI, the person said.”
“The document describes a project that uses Strawberry models with the aim of enabling the company’s AI to not just generate answers to queries but to plan ahead enough to navigate the internet autonomously and reliably to perform what OpenAI terms “deep research,” according to the source.”
“This is something that has eluded AI models to date, according to interviews with more than a dozen AI researchers.”
The whole piece is worth reading in detail, but the longer-term OpenAI objectives are clear:
“Among the capabilities OpenAI is aiming Strawberry at is performing long-horizon tasks (LHT), the document says, referring to complex tasks that require a model to plan ahead and perform a series of actions over an extended period of time, the first source explained.”
“To do so, OpenAI is creating, training and evaluating the models on what the company calls a “deep-research” dataset, according to the OpenAI internal documentation. Reuters was unable to determine what is in that dataset or how long an extended period would mean.”
“OpenAI specifically wants its models to use these capabilities to conduct research by browsing the web autonomously with the assistance of a “CUA,” or a computer-using agent, that can take actions based on its findings, according to the document and one of the sources. OpenAI also plans to test its capabilities on doing the work of software and machine learning engineers.”
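To make ‘long-horizon tasks’ a bit more concrete, here is a minimal toy sketch in Python of a plan, act, observe loop. To be clear, this is not how Strawberry or OpenAI’s CUA works (that remains undisclosed); it is a generic illustration under my own assumptions. The `FAKE_WEB` pages, the `browse` tool, and the `next_action` policy are all hypothetical stand-ins, and nothing here touches the real internet.

```python
# Hypothetical offline "web": URL -> page text, so the sketch runs without network access.
FAKE_WEB = {
    "example.test/start": "See example.test/details for the answer.",
    "example.test/details": "The answer is 42.",
}


def browse(url: str) -> str:
    """Hypothetical browsing tool; a real agent would drive a browser here."""
    return FAKE_WEB.get(url, "")


def next_action(goal: str, notes: list[str]) -> tuple[str, str]:
    """Hypothetical policy; a real system would ask a model to plan the next step."""
    if not notes:
        return ("browse", "example.test/start")
    last = notes[-1]
    if "answer is" in last:
        # The latest page states the answer, so stop and report it.
        return ("finish", last.split("answer is ")[1].rstrip("."))
    for word in last.split():
        if word in FAKE_WEB:
            # The latest page points to another known page, so follow it.
            return ("browse", word)
    return ("finish", "unknown")


def run_agent(goal: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Repeat plan -> act -> observe until the policy finishes or the budget runs out."""
    notes: list[str] = []
    for _ in range(max_steps):
        kind, arg = next_action(goal, notes)
        if kind == "finish":
            return arg, notes
        notes.append(browse(arg))  # act, then record the observation
    return "gave up", notes


result, notes = run_agent("find the answer")
print(result, len(notes))
```

Note the `max_steps` budget: even in a toy, an autonomous loop needs a hard stop, which hints at why guardrails matter so much for the real thing.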
Letting an AI go out to the internet ‘to browse the web autonomously’ is a kind of automated agency that the industry has long worried about. It of course needs to be done with safety, privacy and trust priorities and guardrails in mind. That in itself could add months if not years to the timetable of getting it done right.
These approaches are not just in OpenAI’s area of breakneck research and development. As Reuters explains, earlier academic research has been pointing in this direction:
“Strawberry has similarities to a method developed at Stanford in 2022 called "Self-Taught Reasoner" or "STaR", one of the sources with knowledge of the matter said. STaR enables AI models to “bootstrap” themselves into higher intelligence levels via iteratively creating their own training data, and in theory could be used to get language models to transcend human-level intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.”
“I think that is both exciting and terrifying…if things keep going in that direction we have some serious things to think about as humans,” Goodman said. Goodman is not affiliated with OpenAI and is not familiar with Strawberry.”
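For readers who want a feel for what ‘iteratively creating their own training data’ can look like, here is a minimal toy sketch in Python of a STaR-style loop: generate a rationale and answer, keep only the examples whose answers check out, retry failures with the correct answer as a hint, then ‘fine-tune’ on what was kept and repeat. It says nothing about how Strawberry itself works. `ToyModel`, its memorize-only `fine_tune`, and the tiny addition dataset are hypothetical stand-ins I made up; a real system would sample from and update a language model.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a language model: it only 'knows' memorized sums."""
    memory: dict = field(default_factory=dict)

    def generate(self, question, hint=None):
        a, b = question
        if question in self.memory:
            return self.memory[question]
        if hint is not None:
            # Rationalization: work backwards from the known correct answer.
            return (f"{a} + {b} = {hint}", hint)
        return ("not sure", None)

    def fine_tune(self, examples):
        # Real fine-tuning updates model weights; this toy simply memorizes.
        new_memory = dict(self.memory)
        for question, rationale, answer in examples:
            new_memory[question] = (rationale, answer)
        return ToyModel(new_memory)


def star_loop(model, dataset, iterations=2):
    """Return the final model and how many problems were solved unaided per round."""
    history = []
    for _ in range(iterations):
        kept, solved = [], 0
        for question, gold in dataset:
            rationale, answer = model.generate(question)
            if answer == gold:
                solved += 1
            else:
                # Retry with the correct answer supplied as a hint.
                rationale, answer = model.generate(question, hint=gold)
            if answer == gold:
                # Filter: only rationales that reach the right answer become training data.
                kept.append((question, rationale, answer))
        history.append(solved)
        model = model.fine_tune(kept)
    return model, history


dataset = [((1, 2), 3), ((4, 5), 9), ((10, 7), 17)]
seed = ToyModel({(1, 2): ("1 + 2 = 3", 3)})
model, history = star_loop(seed, dataset, iterations=2)
print(history)  # [1, 3]
```

The model starts out solving one problem unaided and solves all three after one round of training on its own filtered output. That is the ‘bootstrap’ in miniature, with the important caveat that a toy that memorizes is far easier to improve than a model that has to generalize.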
The overall point is one I’ve made before: ‘The AI toothpaste is out of the tube’.
The core broad ideas around LLM AIs are now being worked on by a wide array of companies large and small in this AI Tech Wave.
And companies like Nvidia, Microsoft, Apple, Amazon, Google, Meta, and many others are spending hundreds of billions of dollars, all to make sure there are sufficient AI GPU chips, infrastructure, data centers, power and talent to make these and newer AI Reasoning and Agent technologies possible for mainstream users.
We’re in for a rapid phase of AI industry work towards AI Reasoning and Agent-driven workflows, while Scaling AI Trust as well. It’s a whole new Level on the AI Tech Wave, however long it takes. Stay tuned.
(NOTE: The discussions here are for information purposes only, and not meant as investment advice at any time. Thanks for joining us here.)