OpenAI has released its long-awaited AI model, previously code-named “Strawberry.”
As expected, the new model dubbed “OpenAI o1-preview” — an entirely new naming convention for the company — is “designed to spend more time thinking” before responding, pushing the boundaries on the kind of “complex tasks” and “harder problems” it can tackle, according to an update from the company.
The model has long been rumored to be a breakthrough in the company’s aim to realize artificial general intelligence, the theoretical point at which an AI could outperform a human. The focus is to give the model a sense of “reasoning,” enabling it to solve more complex math problems, for instance.
And if the company is to be believed, it already has some serious academic chops.
OpenAI claims the model “performs similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.”
But as its name suggests, the o1-preview is still in a pretty early state and plenty of future updates are to be expected.
“As an early model, it doesn’t yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images,” the company wrote. “For many common cases GPT-4o will be more capable in the near term.”
As of right now, OpenAI o1 will be available to ChatGPT Plus and Team users. The company is also planning to bring a more lightweight version, dubbed o1-mini, to all free users of ChatGPT, but it has yet to reveal when that will happen.
OpenAI says that it designed its latest AI model with safety top of mind. In one of its “hardest jailbreaking tests,” the new model scored 84 out of 100, compared to just 22 for its predecessor GPT-4o.
The new model “has been trained using a completely new optimization algorithm and a new training dataset specifically tailored for it,” OpenAI’s research lead Jerry Tworek told The Verge.
The company claims o1 could be used “by healthcare researchers to annotate cell sequencing data, by physicists to generate complicated mathematical formulas needed for quantum optics, and by developers in all fields to build and execute multi-step workflows.”
Thanks to its new “chain of thought” process, it evaluates a number of answers to a query before choosing the best one. And that can take a while, especially when compared to the almost instantaneous answers we get from ChatGPT.
In a demo seen by The Verge, the model took 30 seconds to solve a reasoning puzzle involving the ages of a princess and a prince.
As for its propensity to “hallucinate” facts, a glaring problem that has historically plagued AI chatbots, OpenAI appeared to be more realistic.
“We have noticed that this model hallucinates less,” Tworek told The Verge. But “we can’t say we solved hallucinations.”
The company’s CEO Sam Altman weighed in on the new model, saying it’s “still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”
Does it amount to AGI, one questioner asked the exec.
“No,” he replied.