OpenAI introduces FrontierScience to test AI’s expert-level scientific reasoning

Posted By: Ramesh Sharma, Posted On: Dec 17, 2025
FrontierScience aims to test how deeply AI systems can reason in science. (File Photo: AP)

OpenAI introduces FrontierScience to test AI’s expert-level scientific reasoning across physics, chemistry, biology

OpenAI on December 16 announced FrontierScience, a new benchmark designed to evaluate artificial intelligence systems on expert-level scientific reasoning across physics, chemistry and biology, as AI models increasingly demonstrate their ability to support real scientific research.

The company said reasoning lies at the heart of scientific work, going beyond factual recall to include hypothesis generation, testing, refinement and cross-disciplinary synthesis. As AI systems grow more capable, OpenAI said the key question is how deeply they can reason to meaningfully contribute to scientific discovery.

AI models increasingly used in real research

Over the past year, OpenAI's models have reached major milestones, including gold-medal-level performance at the International Mathematical Olympiad and the International Olympiad in Informatics. At the same time, advanced systems such as GPT-5 are already being used by researchers to accelerate scientific workflows.

According to OpenAI, scientists are deploying these models for tasks such as cross-disciplinary literature searches, multilingual research reviews and complex mathematical proofs. In many cases, work that once took days or weeks can now be completed in hours.

This progress was detailed in OpenAI's November 2025 paper, Early science acceleration experiments with GPT-5, which presented early evidence that GPT-5 can measurably speed up scientific workflows.

Why FrontierScience was created

OpenAI said that as models' reasoning and knowledge capabilities scale, existing scientific benchmarks are no longer sufficient. Many prior benchmarks focus on multiple-choice questions, have become saturated, or are not centered on real scientific reasoning.

For example, when the GPQA “Google-Proof” benchmark was released in November 2023, GPT-4 scored 39%, well below the expert baseline of 70%. Two years later, GPT-5.2 scored 92%, highlighting the need for more challenging evaluations.

FrontierScience was created to fill this gap by measuring expert-level scientific capabilities using difficult, original and meaningful questions written and verified by domain experts.

What FrontierScience measures

The full FrontierScience benchmark includes more than 700 textual questions, with 160 in a gold-standard set, spanning subfields across physics, chemistry and biology.

It is divided into two tracks:

FrontierScience-Olympiad:

-100 short-answer questions

-Designed by international science olympiad medalists

-Focused on constrained, theoretical scientific reasoning

-Difficulty at least comparable to international olympiad competitions

FrontierScience-Research:

-60 original research subtasks

-Written by PhD-level scientists

-Designed to reflect real-world, multi-step research challenges

-Graded using a detailed 10-point rubric

Each task was authored and verified by subject-matter experts. Olympiad contributors were medalists in at least one international competition, while Research contributors all held relevant PhD degrees.

How model performance is graded

Olympiad questions have short answers, such as a numerical value, an expression or a string checked by fuzzy matching, so responses can be verified unambiguously.
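To make the idea of short-answer verification concrete, here is a minimal sketch of how such a check might look. It is not OpenAI's grader: the function names, the 1% numeric tolerance and the 0.9 similarity cutoff are all assumptions chosen for illustration, and checking whether two algebraic expressions are equivalent is left out.

```python
import math
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def is_correct(answer: str, reference: str,
               rel_tol: float = 1e-2, min_ratio: float = 0.9) -> bool:
    """Return True if a short answer matches the reference answer.

    Numeric answers are compared within a relative tolerance; anything else
    falls back to a fuzzy string comparison. Both thresholds are hypothetical,
    not values published by OpenAI.
    """
    try:
        # Numeric path: both strings must parse as numbers.
        return math.isclose(float(answer), float(reference), rel_tol=rel_tol)
    except ValueError:
        # Text path: similarity ratio between the normalized strings.
        ratio = SequenceMatcher(None, normalize(answer), normalize(reference)).ratio()
        return ratio >= min_ratio
```

Under these assumptions, `is_correct("3.14", "3.1416")` passes because the values agree to within 1%, while `is_correct("benzene", "toluene")` fails the similarity cutoff.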

For Research tasks, OpenAI introduced a rubric-based grading system. Each question includes multiple objectively assessable criteria totaling 10 points, evaluating both final answers and intermediate reasoning steps. A score of 7 out of 10 or higher is considered correct.

Responses are evaluated using a model-based grader (GPT-5). While human expert grading would be ideal, OpenAI said it is not scalable at this level, so rubrics were designed to be reliably checked by a model-based system, supported by a verification pipeline.
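The rubric arithmetic described above can be sketched in a few lines. This is an illustrative example only: the criteria, their point values and the names `Criterion` and `score_response` are hypothetical, and the per-criterion judgments, which in OpenAI's setup come from a model-based grader, are simply passed in as booleans here. Only the 10-point total and the 7-point pass mark come from the article.

```python
import math
from dataclasses import dataclass

RUBRIC_TOTAL = 10.0    # from the article: criteria total 10 points
PASS_THRESHOLD = 7.0   # from the article: 7 out of 10 or higher counts as correct


@dataclass(frozen=True)
class Criterion:
    description: str
    points: float


# Hypothetical rubric for one research subtask (not taken from the benchmark).
EXAMPLE_RUBRIC = [
    Criterion("Identifies the governing mechanism", 3.0),
    Criterion("Sets up the intermediate calculation correctly", 3.0),
    Criterion("States the key approximation and justifies it", 2.0),
    Criterion("Reports the final result with correct units", 2.0),
]


def score_response(rubric: list[Criterion], met: list[bool]) -> tuple[float, bool]:
    """Sum the points of the criteria judged as met and apply the pass mark."""
    if len(rubric) != len(met):
        raise ValueError("one judgment is required per criterion")
    if not math.isclose(sum(c.points for c in rubric), RUBRIC_TOTAL):
        raise ValueError("rubric points must total 10")
    earned = sum(c.points for c, ok in zip(rubric, met) if ok)
    return earned, earned >= PASS_THRESHOLD
```

With this example rubric, a response that meets every criterion except the third earns 8 points and counts as correct, while one that meets only the first and third earns 5 points and does not.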

How leading AI models performed

OpenAI evaluated several frontier AI models on FrontierScience, including GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, GPT-4o, OpenAI o4-mini and OpenAI o3.

In the initial results:

-GPT-5.2 scored 77% on FrontierScience-Olympiad

-GPT-5.2 scored 25% on FrontierScience-Research

-Gemini 3 Pro closely matched GPT-5.2 on the Olympiad track with a 76% score

OpenAI said the results show substantial progress in expert-level reasoning, while leaving significant headroom for improvement, particularly on open-ended research tasks.

Strengths, limits and next steps

While FrontierScience represents a step forward in evaluating scientific reasoning, OpenAI acknowledged key limitations. The benchmark focuses on constrained, expert-written problems and does not fully capture how science is conducted in practice.

In particular, it does not assess how models generate genuinely novel hypotheses, or how they interact with multimodal data such as video or with experimental systems in the physical world.

Looking ahead, OpenAI said progress in scientific reasoning will come from both stronger general-purpose reasoning systems and targeted improvements in scientific capabilities. FrontierScience is one tool among many, and the company plans to expand the benchmark to new domains and pair it with real-world evaluations.

Ultimately, OpenAI said, the most important measure of AI's scientific value will be the new discoveries it helps generate—and FrontierScience is designed to serve as an early indicator of that potential.

Key takeaways:

-OpenAI launched FrontierScience to test AI on expert-level scientific reasoning across physics, chemistry and biology.

-The focus is on reasoning, not recall, including hypothesis generation, testing and cross-disciplinary thinking.

-AI models like GPT-5 are already accelerating research, in many cases cutting work that took days or weeks down to hours.

-Existing science benchmarks are no longer sufficient, prompting the need for harder, expert-written evaluations.

-FrontierScience has two tracks: Olympiad (theoretical reasoning) and Research (real-world, multi-step tasks).

-GPT-5.2 leads performance, scoring 77% on Olympiad tasks and 25% on Research tasks.


