
Artificial Intelligence Struggles with Time Telling Basics, Humans Exhibit Superior Skills in Reading Clocks

Artificial intelligence systems struggle to comprehend analog clocks, a comprehensive study pitting 11 top AI models against humans has revealed. In contrast to human accuracy of 89.1%, Google's top-performing AI model scored a dismal 13.3% on the same time-telling test.

In a recent study titled "ClockBench," 11 leading artificial intelligence (AI) models were tested on their ability to read analog clocks. The research, conducted by Alek Safar, evaluated the models' performance in interpreting various clock designs, including those with Roman numerals, mirrored or backwards faces, and complex backgrounds.

The study aimed to measure the AI models' visual reasoning capabilities, as reading analog clocks requires identifying clock hands, understanding their relationships, and translating visual positioning into numerical time.
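
The translation step itself is simple arithmetic once the hand positions are known; a minimal sketch, assuming hand angles are already measured in degrees clockwise from 12 o'clock (an illustration, not code from the study):

```python
def angles_to_time(hour_angle: float, minute_angle: float) -> tuple[int, int]:
    """Convert clock-hand angles (degrees clockwise from 12) into hours and minutes.

    Illustrative sketch only: the study suggests the hard part for AI models is
    recovering the hand positions from the image, not this arithmetic.
    """
    minutes = round(minute_angle / 6) % 60    # the minute hand moves 6 degrees per minute
    hours = int(hour_angle // 30) % 12 or 12  # the hour hand moves 30 degrees per hour
    return hours, minutes

print(angles_to_time(95.0, 180.0))  # hour hand just past the 3, minute hand on the 6 -> (3, 30)
```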

The results were striking. Human accuracy in telling time was a commendable 89.1%, while the best-performing AI model, Google's Gemini 2.5 Pro, managed only 13.3% accuracy. Other models fared even worse, with xAI's Grok 4 posting surprisingly poor results at 0.7% accuracy and incorrectly flagging 63% of all clocks as showing impossible times.

OpenAI's GPT-5 scored 8.4% in the test, and Anthropic's Claude 4.1 Opus and Claude 4 Sonnet achieved 5.6% and 4.2% accuracy, respectively. Google's Gemini 2.5 Flash reached 10.5% accuracy.

The study builds on the "easy for humans, hard for AI" benchmark approach seen in tests like ARC-AGI and SimpleBench. The results suggest that the core challenge lies in the initial visual recognition rather than in mathematical reasoning. AI systems particularly struggled with clocks that presented complex visual cues, such as colourful backgrounds or intricate designs.

The benchmark tested systems from Google, OpenAI, Anthropic, and other major AI labs using 180 custom-designed analog clocks. Interestingly, only 20.6% of the clocks actually showed impossible times in the study.
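
An "impossible" time here means a hand configuration that no real clock can display: on a working clock the hour hand creeps forward 0.5 degrees for every minute that passes, so its position is pinned down by the minute hand. A rough consistency check along those lines (my own sketch, not the benchmark's code) could look like this:

```python
def is_possible_time(hour_angle: float, minute_angle: float, tol: float = 3.0) -> bool:
    """Check whether hour/minute hand angles (degrees clockwise from 12) are consistent.

    On a real clock the hour hand advances 0.5 degrees per minute, so its position
    within each 30-degree hour sector is determined by the minute hand. `tol` allows
    for drawing imprecision; this is an illustrative sketch, not ClockBench code.
    """
    minutes = (minute_angle % 360) / 6.0     # minutes implied by the minute hand
    expected_offset = minutes * 0.5          # where the hour hand should sit past the hour mark
    actual_offset = (hour_angle % 360) % 30  # where the hour hand actually sits past the hour mark
    diff = abs(actual_offset - expected_offset)
    return min(diff, 30 - diff) <= tol       # allow wrap-around at the hour boundary

# Both hands pointing straight at the 6 (180 degrees) is impossible: at 6:30 the
# hour hand must sit halfway between the 6 and the 7, not on the 6.
print(is_possible_time(180.0, 180.0))  # -> False
print(is_possible_time(195.0, 180.0))  # -> True (a valid 6:30)
```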

The findings highlight fundamental limitations in how AI systems process and reason about visual information. The research also suggests that current scaling approaches may not solve visual reasoning challenges in AI systems. As the development of AI continues, addressing these visual recognition issues will be crucial for AI models to function effectively in real-world scenarios.
