OpenAI has introduced its latest generative model, GPT-4o mini, a streamlined and cost-effective version of GPT-4o. This new model is designed to be less resource-intensive and more affordable to operate, broadening its integration potential across a wider range of products.
Key Updates with GPT-4o Mini
- Broader Availability: GPT-4o mini is now accessible to users on Free, Plus, and Team tiers via the ChatGPT web and app. It will replace the older GPT-3.5 Turbo for end users starting today. ChatGPT Enterprise subscribers will gain access to GPT-4o mini next week. The previous model remains available to developers through the API until a future retirement date is announced.
- Enhanced Performance: GPT-4o mini features an 82% score on the MMLU reasoning benchmark, outperforming competitors like Gemini 1.5 Flash and Claude 3 Haiku by 3% and 7%, respectively. However, it falls short of the highest MMLU benchmark, held by Google’s Gemini Ultra with a score of 90%.
- Cost Efficiency: The new model is reported to be 60% cheaper to operate compared to GPT-3.5 Turbo. Developers will pay 15 cents per million input tokens and 60 cents per million output tokens. OpenAI positions GPT-4o mini as “the most capable and cost-efficient small model available today.”
- Use Case and Capabilities: GPT-4o mini is ideal for tasks that do not require the full capabilities of a larger model. It excels in performing simpler, high-volume tasks more efficiently. It maintains the same context window size of 128,000 tokens as the full-size model and has the same knowledge cutoff date of October 2023. While the model API currently supports text and vision capabilities, audio and video features are expected to be added in the future.
- Model Transition: For users of the free version, reaching usage limits on GPT-4o will now result in being downgraded to GPT-4o mini rather than GPT-3.5 Turbo, providing an improved experience for non-subscribers.
The release of GPT-4o mini follows OpenAI’s recent update on its anticipated Voice Mode, with a smaller alpha release expected in late July and a broader rollout scheduled for fall.