AI News — July 05, 2026: GPT-5.5 Limits Reasoning at 516 Tokens, Opus 4.8 Disrupts Tool Functionality
Source: Mastodon
AI models GPT-5.5 and Opus 4.8 exhibit flaws, including early reasoning termination and invalid tool calls.
GPT-5.5 Codex, a large language model released by OpenAI, has been found to cut its reasoning short in 44% of responses, stopping at exactly 516 tokens. The issue, known as the "516 Bug," is linked to reasoning-token clustering and can lead to incorrect answers. As we reported on July 5, the model's performance degradation has been a subject of concern, and analysis suggests that the clustering mainly affects complex tasks.
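For readers who want to check their own logs for this kind of clustering, here is a minimal sketch. It assumes you already have a list of per-response reasoning-token counts; the function name `reasoning_token_spike`, the 20% threshold, and the sample numbers are all hypothetical and are not taken from the original analysis.

```python
from collections import Counter


def reasoning_token_spike(counts, min_share=0.2):
    """Return (token_count, share) for the most frequent reasoning-token
    count if it accounts for at least min_share of responses, else None.

    counts: list of integers, one reasoning-token count per response.
    min_share: hypothetical threshold; tune it for your own data.
    """
    if not counts:
        return None
    value, hits = Counter(counts).most_common(1)[0]
    share = hits / len(counts)
    return (value, share) if share >= min_share else None


# Made-up sample: 4 of the 9 responses stop at 516 reasoning tokens.
sample = [516, 1204, 516, 873, 516, 2310, 516, 998, 640]
print(reasoning_token_spike(sample))
```

In a healthy distribution, exact reasoning lengths rarely repeat, so a single value covering a large share of responses is a signal worth investigating. It does not prove on its own that reasoning was terminated early.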
This matters because it shows that even the most advanced AI models have limitations and flaws. GPT-5.5 Codex is touted as OpenAI's smartest model yet, so the fact that it disproportionately stops reasoning at one specific token count raises questions about its reliability on complex tasks.
As the AI community continues to monitor the situation, the key thing to watch is how OpenAI and other developers plan to address these issues. Two other reported problems also need further investigation to ensure the security and integrity of AI systems: the suspected cross-session data leakage into a stranger's Minecraft project, and the hallucination of invalid tool calls by newer Claude models.