Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More
OpenAI surprised the world yesterday afternoon by announcing not “Strawberry” as rumored, nor GPT-5, but a new family of “reasoning” large language models (LLMs) called o1 that aims to offer high performance and accuracy on tasks related to science, technology, engineering and math (STEM) fields.
OpenAI’s two new models are o1-preview and the lower-parameter (less advanced) o1-mini, available now to ChatGPT Plus users as well as developers who use OpenAI’s paid application programming interface (API). This way, developers can test them as the backend of existing third-party apps and services, or build new apps and services atop them.
The new o1 models use a form of “reasoning,” according to OpenAI, and they “try different strategies, recognize mistakes, and are doing the full thinking process,” according to Michelle Pokrass, OpenAI’s API Tech Lead, who shared some of the thinking behind the development of the models in a video call interview with VentureBeat.
“In our tests, these models perform pretty similarly to PhD students on kind of some of the most challenging benchmarks,” Pokrass noted.
Specifically, the o1 models “perform much better” than the GPT series on “reasoning-related problems,” said Nikunj Handa, who works on Product at OpenAI, and also took time to share thoughts about the o1 model family for VentureBeat.
Here’s what third-party developers should know about the new o1-preview and o1-mini models.
Limited to text — no image or file analysis — and slower…for now
The o1-preview and o1-min models are limited to text inputs and outputs for now, and are therefore unlikely at this time to supplant third-party developers’ usage of GPT-4o, OpenAI’s last most advanced model, which offers multimodal inputs and outputs including analyzing file attachments and generating imagery.
The o1 series models aren’t multimodal, according to Pokrass and Handa.
The o1 models further aren’t yet able to connect to web browsing, meaning no outside knowledge past their training cutoff date (October 2023), although users can of course provide their own knowledge in the form of text inputs for the model to reference and analyze.
They’re also slower to respond with outputs, taking over a minute — sometimes even several — to respond in some cases.
However, some developers who received early alpha access over the last weeks and months have reported increased performance on tasks such as coding and drafting legal documents, so using one of them could still be a good option for developers looking to experiment and pay more for increased performance.
As OpenAI writes in its API documentation for its new o1-preview and o1-mini reasoning models: “For applications that need image inputs, function calling, or consistently fast response times, the GPT-4o and GPT-4o mini models will continue to be the right choice. However, if you’re aiming to develop applications that demand deep reasoning and can accommodate longer response times, the o1 models could be an excellent choice.”
o1 costs a lot more than other OpenAI models, but o1-mini is a bargain
First up, you need to be a heavy user of OpenAI’s APIs in order to qualify. The o1-preview and o1-mini models are being made available initially to “Tier 5” users — that is, those who have spent $1,000 through the API and made payments to the company at least 30 (or more) days ago.
OpenAI warns that the new o1 models are previews and limited to 20 requests per minute — or 20 calls per minute — compared to other OpenAI models that have higher limits, or are limited by tokens per minute/day.
The company also currently doesn’t accept “batched” requests as it does for other models at a lower price — essentially bunching inputs to the API that don’t require immediate responses, and are instead analyzed and corresponded responses outputted in 24 hours (or less).
The main o1-preview model, which Pokrass says offers much more “world knowledge” of subjects outside of STEM, is the most expensive OpenAI AI model currently offered by a wide margin — costing $15 per 1 million tokens inputted and $60 per 1 million tokens out ($15/$60) versus $5/$15 for GPT-4o, or a 200%-300% more expensive price for the new full o1-preview model.
Yet the o1-mini model is a steal at $3 per 1 million input tokens and $12 per 1 million output tokens, or an 80% cheaper price.
“Of course, we will be retreating the pricing over the coming weeks and months to get this to the right spot,” said Pokrass.
Here’s a breakdown of the pricing of OpenAI’s various leading models through its API — data taken from this page.
When it comes to the context — or how many tokens a given LLM can handle in one interaction, input and output — the o1 series has a limit of 128,000, comparable to GPT-4o and OpenAI’s other top models.
The o1-preview model can produce a maximum of 32,768 tokens in a single output, or response, while the o1-mini can produce double that number at 65,536.
What developers are using OpenAI o1-preview and o1-mini for so far…
It’s been less than 24 hours since OpenAI released o1-previews and o1-mini, but already some developers are thinking up uses for it and testing it out to see what it does well and doesn’t.
And, as previously mentioned, OpenAI did “seed” it amongst a select group of early alpha users and testers over the last few weeks and month.
Based on that work, here are some of the most interesting uses of the o1-preview and o1-mini models so far:
Generating plans and white papers
Several users have reported that the o1 model family generates well developed action plans and even full documents such as white papers with citations based on simple prompts.
Planning, infrastructure, and risk assessment
AI influencer and enterprise consultant Allie K. Miller posted a thread on X of various impressive outputs from OpenAI’s o1-preview model, including automatically (and much more rapidly than a human) optimizing a human staff’s schedules for an organization, assessing merger risks, designing warehouses for efficiency, even balancing a city’s power grid.
Creating apps and games quickly
OpenAI o1-preview seems to be a direct shot across the bow at Anthropic’s Claude family and specifically the Artifacts feature, as it is also a capable and quick way for users to generate their own interactive apps and games, as Ammaar Reshi, Head of Design at AI voice and audio startup ElevenLabs, pointed out on X. Note that he used another software tool, Cursor Composer, to run the model.
However, as Anand Sukumaran, CTO of web notification startup Engagespot posted on his X account, GPT-4o still achieves much faster speeds when coding simple programs such as one to display “Hello, World!”
Completing requests-for-proposal (RFPs) on its own
Contractors, particularly those offering products for government agencies, are all-too familiar with the request-for-proposal (RFP) — a call out by an agency soliciting contract bids in a standardized format that can be tedious and time consuming to fill out.
While specialized and AI-driven software has arisen to help contractors fill out these documents more efficiently, University of Pennsylvania Wharton School of Business Professor Ethan Mollick, a leading AI influencer and early adopter who had access to o1 as part of its alpha testing phase, posted on X that o1 can fill out RFPs on its own — though of course, it is limited to text and doesn’t accept file uploads, so the user would need to copy and paste the text version of the RFP into o1’s context window in ChatGPT or through another app.
Strategizing engagement and growth hacking
Ruben Hassid, founder of EasyGen, a Chrome app for automatically generating LinkedIn posts, posted a demo video on X showing how o1-preview was able to generate a comprehensive and well-reasoned plan for using Reddit to help grow his company.
“I can’t believe the length of the answers. There is no way an LLM is capable of this much strategizing,” he wrote.
Where to get access to OpenAI o1-preview and o1-mini?
Developers can of course access the new OpenAI o1 models through the company’s public API, as well as through Microsoft Azure OpenAI Service, Azure AI Studio, and GitHub Models.
While clearly not right for all (or potentially even most) developers, the o1 family’s debut makes for an exciting time for those with room to experiment and looking to build new apps and services.
OpenAI has also committed to continuing to develop both the capabilities of the o1 family and its GPT series, so there is no shortage of options for those looking to build atop the leading AI company’s platforms.
Source link lol