OpenAI GPT4.5: It’s bad

Don’t pay for OpenAI GPT-4.5

In a surprise move, OpenAI launched GPT-4.5 last night. Sam Altman has been hyping it on Twitter, calling it the most thoughtful model he has ever talked to.

Unfortunately, I am not impressed at all, and the benchmarks and user reviews point the same way. On top of that, it is once again available only to paying users.

Why should you not pay for GPT-4.5?

Below-Par Benchmark Performance

  • Marginal Gains in Some Areas: On the MMLU benchmark, a common test for comparing large language models, GPT-4.5 shows only marginal improvements over OpenAI’s previous models. This suggests that the significant increase in scale and resources used for its training may not yield proportional performance gains across all types of tasks.
  • Lagging in Specific Domains: On standard science and math benchmarks, GPT-4.5 scores worse than some of OpenAI’s own reasoning models, such as o3-mini. This indicates that it may be less effective for tasks that require structured, step-by-step reasoning, which is crucial in fields like academia and scientific research.
  • Not a Frontier Model: Despite its size and the resources poured into its development, OpenAI itself does not consider GPT-4.5 to be a frontier AI model. This implies that it may not represent the cutting-edge capabilities that users might expect from the latest release in the field of AI.

It looks like a desperate attempt to stay relevant.

User Reviews (Reddit)

As an active Redditor, I can say the community’s reception of GPT-4.5 has been plainly negative. Here is a summary of a few recurring complaints:

  • Perceived Lack of Innovation: Some critics argue that GPT-4.5 feels like a “shiny new coat of paint on the same old car”. Users may be disappointed that the model, despite its increased size and training, does not bring revolutionary changes or new features that significantly enhance their experience compared to previous versions.
  • High Costs for Limited Gains: GPT-4.5’s API pricing is 2900% higher for input and 1300% higher for output than that of its predecessor, GPT-4o, which has raised concerns among users. Many developers and startups may find it prohibitively expensive to integrate GPT-4.5 into their projects, especially since the perceived performance improvements may not justify the added cost.
  • Focus on Non-Essential Aspects: While GPT-4.5 is said to excel in areas like “vibes,” EQ, and conversational tone, some users may feel that these qualitative improvements are not as valuable as advancements in core capabilities such as accuracy, reasoning, and efficiency. The emphasis on making the model feel more human and warm might be seen as a diversion from addressing more fundamental limitations of the technology.
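
To make the pricing complaint concrete, here is a rough Python sketch that converts the "percent more expensive" figures quoted above into plain multipliers and applies them to a sample request. The percentages come from the bullet above; the baseline prices and token counts are hypothetical placeholders chosen only for illustration, not actual OpenAI list prices.

```python
def percent_more_to_multiplier(percent_more: float) -> float:
    """Convert 'N% more expensive' into a price multiplier (100% more = 2x)."""
    return 1 + percent_more / 100


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request, given prices per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Figures quoted in the text above.
INPUT_PERCENT_MORE = 2900
OUTPUT_PERCENT_MORE = 1300

# ASSUMPTION: hypothetical baseline prices in dollars per million tokens.
BASE_INPUT_PRICE = 1.0
BASE_OUTPUT_PRICE = 1.0

# ASSUMPTION: a hypothetical request size.
INPUT_TOKENS = 1_000
OUTPUT_TOKENS = 500

input_multiplier = percent_more_to_multiplier(INPUT_PERCENT_MORE)    # 30x
output_multiplier = percent_more_to_multiplier(OUTPUT_PERCENT_MORE)  # 14x

baseline_cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS,
                             BASE_INPUT_PRICE, BASE_OUTPUT_PRICE)
new_cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS,
                        BASE_INPUT_PRICE * input_multiplier,
                        BASE_OUTPUT_PRICE * output_multiplier)

# Overall multiplier for this particular mix of input and output tokens.
blended_multiplier = new_cost / baseline_cost

print(f"Input price multiplier:  {input_multiplier:.0f}x")
print(f"Output price multiplier: {output_multiplier:.0f}x")
print(f"Blended multiplier for this request: {blended_multiplier:.1f}x")
```

Note that "2900% more" means 30 times the price, not 29 times. The blended figure depends on the ratio of input to output tokens and on the baseline prices, so it will differ for real workloads.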

Claude 3.7 Sonnet vs GPT-4.5

Anthropic also recently released Claude 3.7 Sonnet, and the model looks quite good. Let us compare this SOTA model with GPT-4.5 in terms of benchmarks, cost, and other metrics.

  • Performance: Claude 3.7 Sonnet outperforms GPT-4.5 in coding and reasoning tasks, while GPT-4.5 excels in emotional intelligence and natural conversation.
  • Cost: GPT-4.5 is significantly more expensive than Claude 3.7 Sonnet, making it less accessible for many users.
  • User Experience: GPT-4.5 is praised for its humanlike interactions, but its high cost and limited access are major concerns. Claude 3.7 Sonnet is seen as a more cost-effective option with strong performance in technical tasks.

Conclusion

GPT-4.5, despite the hype, offers only marginal performance improvements over previous models, particularly in critical reasoning tasks. Its high cost and lack of innovation make it a less appealing option. In contrast, Claude 3.7 Sonnet provides stronger performance in coding and reasoning at a fraction of the cost, making it a more practical choice for users seeking robust technical capabilities.

My suggestion: don’t pay for it!


OpenAI GPT4.5: It’s bad was originally published in Data Science in your pocket on Medium, where people are continuing the conversation by highlighting and responding to this story.
