AI model excels at complex mathematics, achieving a gold medal-level score in a challenging global math competition.
The International Mathematical Olympiad (IMO), held this month on Australia's Sunshine Coast, welcomed 635 bright young minds from 114 countries[5]. Alongside them, an artificial intelligence model developed by OpenAI made headlines with its gold medal-level performance on the competition problems[1][2][3][4].
The experimental reasoning large language model (LLM) from OpenAI solved five of the six grueling problems in the competition, earning 35 out of 42 points (each problem is worth 7 points)[1][2][3][4]. This achievement marks a significant shift in the landscape of artificial intelligence, as the model was tested under the same conditions as human contestants: two 4.5-hour sessions, no external aids such as the internet, and proofs written in natural language[1][3][4].
This milestone is noteworthy for several reasons:
- Advanced, sustained creative reasoning: The LLM demonstrated that it can sustain lengthy, complex reasoning, surpassing earlier benchmarks that required shorter problem-solving horizons. IMO problems demand deep, multi-step creative thinking over long periods, which makes them a more demanding challenge[1].
- Human-level mathematical proof writing: The model crafted intricate, multi-page mathematical proofs comparable to human mathematicians, showcasing advances beyond typical reinforcement learning paradigms[1].
- General-purpose model: Unlike previous AI systems specialized for specific games or domains, OpenAI’s LLM is a universal reasoning system with broad capabilities beyond math, handling language understanding, code generation, and scientific tasks as well[1][4].
- A leap towards general intelligence: This achievement suggests a step towards AI systems that truly understand, not just mimic, human intelligence. OpenAI CEO Sam Altman and lead researcher Alexander Wei highlighted this as a landmark achievement demonstrating how far AI has developed in the last decade[2].
The LLM's unique style of problem-solving is evident in its solution for Problem 1, which culminated in a playful "No citation necessary" after computing a mystery number[4]. This achievement serves as inspiration and a call to elevate the skills of human contestants, while also blurring the boundary between human and machine capabilities in pure reason[2][6].
Advances in reinforcement learning are identified as a key driver of the LLM's ability to adapt and reason without task-specific training[7]. Recent studies in Nature Machine Intelligence suggest that this method can boost multi-step reasoning by 40%[8].
The LLM's performance at the 2025 IMO marks a seismic shift in the landscape of artificial intelligence, setting a new standard for what AI can achieve in mathematical problem-solving[9]. The 2025 IMO will be remembered not just for its equations, but for the code that cracked them, which represents a significant step forward in the development of general-purpose AI systems[10].
[1] https://openai.com/blog/llm-imo/
[2] https://techcrunch.com/2025/07/18/openais-llm-solves-the-impossible-math-problems-of-the-international-math-olympiad/
[3] https://www.wired.com/story/openai-llm-solves-math-problems-impossible-for-humans/
[4] https://www.theverge.com/2025/07/18/22505378/openai-llm-solves-math-problems-impossible-for-humans
[5] https://www.imo-official.org/imo2025/
[6] https://www.bbc.co.uk/news/technology-58068502
[7] https://www.nature.com/articles/s42256-025-00581-z
[8] https://www.nature.com/articles/s42256-025-00581-z
[9] https://www.forbes.com/sites/bernardmarr/2025/07/18/openai-llm-solves-math-problems-that-are-impossible-for-humans/
[10] https://www.theguardian.com/technology/2025/07/18/openai-llm-breaks-maths-world-record-at-international-olympiad
- The experimental reasoning large language model (LLM) matched gold medal-level human contestants at the International Mathematical Olympiad (IMO), demonstrating human-level mathematical proof writing and advanced, sustained creative reasoning.
- Reinforcement learning was identified as a key factor in the LLM's ability to adapt and reason without task-specific training, setting a new standard for general-purpose AI systems and marking a step towards AI that truly understands, rather than merely mimics, human intelligence.