Are you looking for smarter insights delivered directly to your inbox? Subscribe to our weekly newsletters for the latest updates that matter to leaders in enterprise AI, data, and security.
Recent Developments in AI Models
On August 8, 2025, shortly after the announcement of the new model, OpenAI’s co-founder and CEO, Sam Altman, revealed that the company would restore access to GPT-4o and other older models for select users. He acknowledged that the launch of GPT-5 had not gone as smoothly as anticipated. The much-anticipated release of OpenAI’s latest model, GPT-5, has encountered several issues. Despite some errors in charts and voice demonstrations during the livestreamed presentation, user reports have surfaced indicating that GPT-5 struggles with relatively simple problems that previous OpenAI models and competitors’ models handle effectively.
For instance, data scientist Colin Fraser shared screenshots showing GPT-5 incorrectly solving a math proof regarding whether 8.888 repeating equals 9, which it does not. Additionally, it faltered on basic algebra, failing to solve the equation 5.9 = x + 5.11, a problem that elementary school students would likely solve correctly.
Issues with Performance
The performance of GPT-5 has raised concerns. It also struggled to analyze OpenAI’s own erroneous presentation charts, failing to provide correct responses when asked to evaluate them. Kangwook Lee illustrated this with a question about the model’s inability to judge its own errors.
Moreover, GPT-5 encountered difficulties with a more complex math word problem, which even confused some humans initially. However, competitors like Elon Musk’s Grok 4 AI managed to solve it correctly.
Interestingly, the older GPT-4o model outperformed GPT-5 on at least one of these math problems. Unfortunately, OpenAI is gradually phasing out older models, including GPT-4o and the robust reasoning model o3, for ChatGPT users, although they will remain accessible via the application programming interface (API) for developers in the foreseeable future.
Comparison with Competitors
While OpenAI’s internal benchmarks and some third-party evaluations indicate that GPT-5 excels in coding tasks, real-world usage suggests that Anthropic’s updated Claude Opus 4.1 performs better in completing specific tasks according to user specifications. Developer Justin Sun shared an example of Opus 4.1 successfully creating a 3D capybara petting zoo in just eight minutes, showcasing its advanced capabilities.
Additionally, a report from security firm SPLX highlighted significant gaps in OpenAI’s internal safety measures, particularly regarding business alignment and vulnerabilities to prompt injection and obfuscated logic attacks.
User Reception
The initial reception of GPT-5 among early adopters appears to be lukewarm at best. AI influencer and former Google employee Bilawal Sidhu conducted a poll on X, asking for feedback on GPT-5. With 172 votes cast, the overwhelming sentiment was “Kinda mid.”
Furthermore, the AI Leaks and News account noted that the general consensus about GPT-5 on platforms like X and Reddit is predominantly negative. Many users expressed dissatisfaction with the model picker and the lack of access to legacy models for non-pro users.
In summary, while GPT-5 is positioned as a leading AI model, its performance issues and user feedback suggest that it may not yet meet the expectations set by its predecessors or competitors.