Apple claims that AI reasoning models have limited capabilities and cannot produce accurate results beyond a certain complexity.
Apple Inc. released a research paper over the weekend claiming that AI reasoning models have limited capabilities and fail to produce accurate results beyond a certain level of complexity.
In a paper titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models through the Lens of Problem Complexity," Apple (NASDAQ:AAPL) researchers said that large reasoning models (LRMs) have significant gaps in reasoning quality and fail to develop general problem-solving capabilities.
The researchers tested LRMs including OpenAI's o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking, evaluating them on problems of increasing complexity that deviate from standard AI benchmarks.
Using a "controlled puzzle environment" to test the models, Apple researchers found that LRM performance deteriorated as complexity increased, with accuracy eventually collapsing to zero at high complexity.
"We show that state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still fail to develop generalizable problem-solving capabilities, with accuracy eventually dropping to zero across environments beyond a certain level of complexity," Apple researchers wrote in the paper.
The researchers said testing revealed that LRMs suffer from "fundamental inefficiencies" and have clear limitations in their ability to scale. They also questioned current evaluation methods for LRMs, which rely on established mathematical benchmarks, and said they designed a more controlled experimental approach using an algorithmic puzzle environment.
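The paper's puzzle suite includes classics such as Tower of Hanoi, where complexity can be dialed up simply by adding disks. Below is a minimal sketch of what scoring a model against the game's rules (rather than a memorized benchmark answer) could look like; `query_model` is a hypothetical stand-in for any reasoning-model API call, and none of this code comes from the paper itself.

```python
# Sketch of a controlled puzzle evaluation in the spirit of the paper's
# setup, using Tower of Hanoi. Complexity is the disk count; a model's
# proposed move list is replayed and checked against the game rules.

from typing import Callable, Dict, List, Tuple

Move = Tuple[int, int]  # (from_peg, to_peg); pegs are numbered 0-2

def is_valid_solution(n_disks: int, moves: List[Move]) -> bool:
    """Replay the moves and check that the puzzle ends solved."""
    pegs = [list(range(n_disks, 0, -1)), [], []]  # peg 0 starts full, disk 1 on top
    for src, dst in moves:
        if not pegs[src]:
            return False  # moving from an empty peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # larger disk placed on a smaller one
        pegs[dst].append(disk)
    return pegs[2] == list(range(n_disks, 0, -1))  # all disks moved to peg 2

def accuracy_vs_complexity(query_model: Callable[[int], List[Move]],
                           max_disks: int = 10,
                           trials: int = 25) -> Dict[int, float]:
    """Score a model at each complexity level; the paper reports decay to zero."""
    return {
        n: sum(is_valid_solution(n, query_model(n)) for _ in range(trials)) / trials
        for n in range(1, max_disks + 1)
    }
```

Because correctness is checked by replaying moves, this style of evaluation cannot be gamed by pattern-matching on familiar benchmark answers, which is the methodological point the researchers emphasize.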
The Apple researchers questioned the claim that LRMs are an important step toward general AI, a theoretical form of AI that can simulate the broad cognitive abilities and problem-solving skills exhibited by humans.
General AI has long been seen as an ultimate goal by major developers, although it remains highly theoretical. Current AI models, especially large language models, use pattern recognition to predict the next word in a sequence and generate new text, which leaves them prone to high error rates and limits their reasoning capabilities.
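To illustrate that mechanism, here is a toy next-word predictor built from bigram counts; the counts are a crude stand-in for a real model's learned distribution, and the example is purely illustrative rather than taken from the paper.

```python
# Toy autoregressive generation: each step picks a likely next word given
# only the previous one, so fluent-looking output requires no reasoning.

from collections import Counter, defaultdict

corpus = "the model predicts the next word the model repeats".split()

# Count which word follows which (a crude stand-in for learned weights).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def generate(start: str, length: int = 5) -> str:
    words = [start]
    for _ in range(length):
        followers = bigrams[words[-1]]
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])  # greedy next word
    return " ".join(words)

print(generate("the"))  # greedy continuation from the toy counts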
Apple’s paper comes just days before the company’s June 9 Worldwide Developers Conference, amid low expectations as the company’s AI efforts lag significantly behind those of its competitors.
Despite working with OpenAI to enable AI features in its flagship devices, Apple has struggled to deliver the promised capabilities of its AI product, Apple Intelligence.