OpenAI introduces FrontierScience to test AI’s expert-level scientific reasoning

Posted By: Ramesh Sharma, Posted On: Dec 17, 2025
FrontierScience aims to test how deeply AI systems can reason in science. (File Photo: AP)

OpenAI introduces FrontierScience to test AI’s expert-level scientific reasoning across physics, chemistry, biology

OpenAI on December 16 announced FrontierScience, a new benchmark designed to evaluate artificial intelligence systems on expert-level scientific reasoning across physics, chemistry and biology, as AI models increasingly demonstrate their ability to support real scientific research.

The company said reasoning lies at the heart of scientific work, going beyond factual recall to include hypothesis generation, testing, refinement and cross-disciplinary synthesis. As AI systems grow more capable, OpenAI said the key question is how deeply they can reason to meaningfully contribute to scientific discovery.

AI models increasingly used in real research

Over the past year, OpenAI's models have reached major milestones, including gold-medal-level performance at the International Mathematical Olympiad and the International Olympiad in Informatics. At the same time, advanced systems such as GPT-5 are already being used by researchers to accelerate scientific workflows.

According to OpenAI, scientists are deploying these models for tasks such as cross-disciplinary literature searches, multilingual research reviews and complex mathematical proofs. In many cases, work that once took days or weeks can now be completed in hours.

This progress was detailed in OpenAI's November 2025 paper, "Early science acceleration experiments with GPT-5", which presented early evidence that GPT-5 can measurably speed up scientific workflows.

Why FrontierScience was created

OpenAI said that as models' reasoning and knowledge capabilities scale, existing scientific benchmarks are no longer sufficient. Many prior benchmarks focus on multiple-choice questions, have become saturated, or are not centered on real scientific reasoning.

For example, when the GPQA “Google-Proof” benchmark was released in November 2023, GPT-4 scored 39%, well below the expert baseline of 70%. Two years later, GPT-5.2 scored 92%, highlighting the need for more challenging evaluations.

FrontierScience was created to fill this gap by measuring expert-level scientific capabilities using difficult, original and meaningful questions written and verified by domain experts.

What FrontierScience measures

The full FrontierScience benchmark includes more than 700 textual questions, with 160 in a gold-standard set, spanning subfields across physics, chemistry and biology.

It is divided into two tracks:

FrontierScience-Olympiad:

-100 short-answer questions

-Designed by international science olympiad medalists

-Focused on constrained, theoretical scientific reasoning

-Difficulty at least comparable to international olympiad competitions

FrontierScience-Research:

-60 original research subtasks

-Written by PhD-level scientists

-Designed to reflect real-world, multi-step research challenges

-Graded using a detailed 10-point rubric

Each task was authored and verified by subject-matter experts. Olympiad contributors were medalists in at least one international competition, while Research contributors all held relevant PhD degrees.

How model performance is graded

Olympiad questions are graded using short answers, such as numerical values, expressions or fuzzy string matches, allowing for clear verification.
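To illustrate what short-answer checking of this kind can look like, here is a minimal Python sketch. It is not OpenAI's grader: the function names, the 1% numeric tolerance and the 0.9 similarity threshold are all assumptions chosen for illustration, and expression matching is left out.

```python
import math
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def matches_numeric(expected: float, answer: str, rel_tol: float = 1e-2) -> bool:
    """Return True if the answer parses as a number within rel_tol of the expected value.

    The 1% default tolerance is an assumption, not a value from the benchmark.
    """
    try:
        value = float(answer)
    except ValueError:
        return False
    return math.isclose(value, expected, rel_tol=rel_tol)


def matches_fuzzy(expected: str, answer: str, threshold: float = 0.9) -> bool:
    """Return True if the normalized strings are at least `threshold` similar.

    The 0.9 threshold is an assumption, not a value from the benchmark.
    """
    ratio = SequenceMatcher(None, normalize(expected), normalize(answer)).ratio()
    return ratio >= threshold
```

For example, `matches_numeric(9.81, "9.80")` accepts a value within the tolerance, while `matches_fuzzy("Sodium chloride", "sodium  chloride")` accepts an answer that differs only in case and spacing.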

For Research tasks, OpenAI introduced a rubric-based grading system. Each question includes multiple objectively assessable criteria totaling 10 points, evaluating both final answers and intermediate reasoning steps. A score of 7 out of 10 or higher is considered correct.
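The scoring rule described above can be sketched in a few lines of Python. Only the 10-point total and the 7-point pass mark come from the article. The `Criterion` structure, its field names and the example rubric are hypothetical, and in the real benchmark the per-criterion verdicts come from a model-based grader rather than hand-set flags.

```python
import math
from dataclasses import dataclass

MAX_POINTS = 10.0      # from the article: rubric criteria total 10 points
PASS_THRESHOLD = 7.0   # from the article: 7 out of 10 or higher counts as correct


@dataclass
class Criterion:
    description: str  # hypothetical rubric text
    points: float     # points this criterion is worth
    satisfied: bool   # grader's verdict for this criterion


def rubric_score(criteria: list[Criterion]) -> float:
    """Sum the points of satisfied criteria, after checking the rubric totals 10."""
    total = sum(c.points for c in criteria)
    if not math.isclose(total, MAX_POINTS):
        raise ValueError(f"rubric must total {MAX_POINTS} points, got {total}")
    return sum(c.points for c in criteria if c.satisfied)


def is_correct(criteria: list[Criterion]) -> bool:
    """A response is counted as correct when its rubric score reaches the pass mark."""
    return rubric_score(criteria) >= PASS_THRESHOLD


# Hypothetical rubric for one research subtask.
example = [
    Criterion("Identifies the governing mechanism", 3.0, True),
    Criterion("Derives the intermediate expression", 4.0, True),
    Criterion("States the final numerical result", 2.0, False),
    Criterion("Notes the key limiting assumption", 1.0, False),
]
```

In this example the response earns 7 of 10 points from its intermediate reasoning alone, so it would be counted as correct even though the final result is missing.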

Responses are evaluated using a model-based grader (GPT-5). While human expert grading would be ideal, OpenAI said it is not scalable at this level, so rubrics were designed to be reliably checked by a model-based system, supported by a verification pipeline.

How leading AI models performed

OpenAI evaluated several frontier AI models on FrontierScience, including GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, GPT-4o, OpenAI o4-mini and OpenAI o3.

In the initial results:

-GPT-5.2 scored 77% on FrontierScience-Olympiad

-GPT-5.2 scored 25% on FrontierScience-Research

-Gemini 3 Pro closely matched GPT-5.2 on the Olympiad track with a 76% score

OpenAI said the results show substantial progress in expert-level reasoning, while leaving significant headroom for improvement, particularly on open-ended research tasks.

Strengths, limits and next steps

While FrontierScience represents a step forward in evaluating scientific reasoning, OpenAI acknowledged key limitations. The benchmark focuses on constrained, expert-written problems and does not fully capture how science is conducted in practice.

In particular, it does not assess how models generate genuinely novel hypotheses, work with experimental systems, or interact with multimodal data such as video and physical-world experiments.

Looking ahead, OpenAI said progress in scientific reasoning will come from both stronger general-purpose reasoning systems and targeted improvements in scientific capabilities. FrontierScience is one tool among many, and the company plans to expand the benchmark to new domains and pair it with real-world evaluations.

Ultimately, OpenAI said, the most important measure of AI's scientific value will be the new discoveries it helps generate, and FrontierScience is designed to serve as an early indicator of that potential.

Key takeaways:

-OpenAI launched FrontierScience to test AI on expert-level scientific reasoning across physics, chemistry and biology.

-Focus is on reasoning, not recall, including hypothesis generation, testing and cross-disciplinary thinking.

-AI models like GPT-5 are already accelerating research, cutting tasks from weeks to hours.

-Existing science benchmarks are no longer sufficient, prompting the need for harder, expert-written evaluations.

-FrontierScience has two tracks: Olympiad (theoretical reasoning) and Research (real-world, multi-step tasks).

-GPT-5.2 leads performance, scoring 77% on Olympiad tasks and 25% on Research tasks.
