OpenAI introduces FrontierScience to test AI’s expert-level scientific reasoning across physics, chemistry, biology
OpenAI on December 16 announced FrontierScience, a new benchmark designed to evaluate artificial intelligence systems on expert-level scientific reasoning across physics, chemistry and biology, as AI models increasingly demonstrate their ability to support real scientific research.
The company said reasoning lies at the heart of scientific work, going beyond factual recall to include hypothesis generation, testing, refinement and cross-disciplinary synthesis. As AI systems grow more capable, OpenAI said the key question is how deeply they can reason to meaningfully contribute to scientific discovery.
AI models increasingly used in real research
Over the past year, OpenAI's models have reached major milestones, including gold-medal-level performance at the International Math Olympiad and the International Olympiad in Informatics. At the same time, advanced systems such as GPT-5 are already being used by researchers to accelerate scientific workflows.
According to OpenAI, scientists are deploying these models for tasks such as cross-disciplinary literature searches, multilingual research reviews and complex mathematical proofs. In many cases, work that once took days or weeks can now be completed in hours.
This progress was detailed in OpenAI's November 2025 paper, Early science acceleration experiments with GPT-5, which presented early evidence that GPT-5 can measurably speed up scientific workflows.
Why FrontierScience was created
OpenAI said that as models' reasoning and knowledge capabilities scale, existing scientific benchmarks are no longer sufficient. Many prior benchmarks focus on multiple-choice questions, have become saturated, or are not centered on real scientific reasoning.
For example, when the GPQA “Google-Proof” benchmark was released in November 2023, GPT-4 scored 39%, well below the expert baseline of 70%. Two years later, GPT-5.2 scored 92%, highlighting the need for more challenging evaluations.
FrontierScience was created to fill this gap by measuring expert-level scientific capabilities using difficult, original and meaningful questions written and verified by domain experts.
What FrontierScience measures
The full FrontierScience benchmark includes more than 700 textual questions, with 160 in a gold-standard set, spanning subfields across physics, chemistry and biology.
It is divided into two tracks:
FrontierScience-Olympiad:
-100 short-answer questions
-Designed by international science olympiad medalists
-Focused on constrained, theoretical scientific reasoning
-Difficulty at least comparable to international olympiad competitions
FrontierScience-Research:
-60 original research subtasks
-Written by PhD-level scientists
-Designed to reflect real-world, multi-step research challenges
-Graded using a detailed 10-point rubric
Each task was authored and verified by subject-matter experts. Olympiad contributors were medalists in at least one international competition, while Research contributors all held relevant PhD degrees.
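The two tracks described above can be sketched as a small data structure. This is only an illustration: the `Track` class, its field names and the `gold_set_size` helper are hypothetical and not part of any OpenAI release. It also assumes that the 100 Olympiad questions and 60 Research subtasks together make up the 160-question gold-standard set, which the figures in the article are consistent with but do not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One FrontierScience track, as summarised in the article (hypothetical schema)."""
    name: str
    questions: int       # number of questions listed for the track
    written_by: str      # who authored the questions
    graded_by: str       # how answers are scored


OLYMPIAD = Track(
    name="FrontierScience-Olympiad",
    questions=100,
    written_by="international science olympiad medalists",
    graded_by="short answer",
)
RESEARCH = Track(
    name="FrontierScience-Research",
    questions=60,
    written_by="PhD-level scientists",
    graded_by="10-point rubric",
)

# Assumption: these two tracks together form the gold-standard set.
GOLD_SET = (OLYMPIAD, RESEARCH)


def gold_set_size(tracks):
    """Total number of questions across the given tracks."""
    return sum(track.questions for track in tracks)


if __name__ == "__main__":
    print(gold_set_size(GOLD_SET))
```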
How model performance is graded
Olympiad questions are graded using short answers, such as numerical values, expressions or fuzzy string matches, allowing for clear verification.
For Research tasks, OpenAI introduced a rubric-based grading system. Each question includes multiple objectively assessable criteria totaling 10 points, evaluating both final answers and intermediate reasoning steps. A score of 7 out of 10 or higher is considered correct.
Responses are evaluated using a model-based grader (GPT-5). While human expert grading would be ideal, OpenAI said it is not scalable at this level, so rubrics were designed to be reliably checked by a model-based system, supported by a verification pipeline.
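The two grading schemes described in this section can be sketched roughly as follows. This is a minimal illustration, not OpenAI's grader: the function names, the 1% numeric tolerance and the 0.9 fuzzy-match cutoff are assumptions chosen for the example. Only the 10-point rubric total and the 7-point pass mark come from the article. In the real benchmark the per-criterion judgments come from a model-based grader, whereas here they are plain booleans supplied by the caller.

```python
import math
from dataclasses import dataclass
from difflib import SequenceMatcher

RUBRIC_TOTAL = 10.0    # from the article: criteria total 10 points
PASS_THRESHOLD = 7.0   # from the article: 7 out of 10 or higher is correct


def grade_short_answer(predicted, reference, rel_tol=1e-2, fuzzy_cutoff=0.9):
    """Olympiad-style check: numeric comparison if possible, else fuzzy match.

    rel_tol and fuzzy_cutoff are illustrative assumptions, not published values.
    """
    try:
        return math.isclose(float(predicted), float(reference), rel_tol=rel_tol)
    except ValueError:
        pass  # at least one side is not a plain number; compare as text
    a = " ".join(predicted.lower().split())
    b = " ".join(reference.lower().split())
    return SequenceMatcher(None, a, b).ratio() >= fuzzy_cutoff


@dataclass(frozen=True)
class Criterion:
    """One rubric item (hypothetical structure)."""
    description: str
    points: float


def grade_rubric(criteria, met):
    """Research-style check: sum points for satisfied criteria, then threshold.

    `met` holds one boolean per criterion, standing in for the grader's verdicts.
    Returns (score, is_correct).
    """
    if len(criteria) != len(met):
        raise ValueError("need exactly one verdict per criterion")
    total = sum(c.points for c in criteria)
    if not math.isclose(total, RUBRIC_TOTAL):
        raise ValueError("rubric points must total 10")
    score = sum(c.points for c, ok in zip(criteria, met) if ok)
    return score, score >= PASS_THRESHOLD


if __name__ == "__main__":
    rubric = [
        Criterion("identifies the governing mechanism", 3.0),
        Criterion("sets up the key equation", 3.0),
        Criterion("justifies the approximation used", 2.0),
        Criterion("states the final result with units", 2.0),
    ]
    print(grade_rubric(rubric, [True, True, True, False]))
    print(grade_short_answer("9.81", "9.8"))
```

Keeping the pass mark as a named constant makes it easy to see how sensitive a reported accuracy is to the choice of threshold.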
How leading AI models performed
OpenAI evaluated several frontier AI models on FrontierScience, including GPT-5.2, Claude Opus 4.5, Gemini 3 Pro, GPT-4o, OpenAI o4-mini and OpenAI o3.
In the initial results:
-GPT-5.2 scored 77% on FrontierScience-Olympiad
-GPT-5.2 scored 25% on FrontierScience-Research
-Gemini 3 Pro closely matched GPT-5.2 on the Olympiad track with a 76% score
OpenAI said the results show substantial progress in expert-level reasoning, while leaving significant headroom for improvement, particularly on open-ended research tasks.
Strengths, limits and next steps
While FrontierScience represents a step forward in evaluating scientific reasoning, OpenAI acknowledged key limitations. The benchmark focuses on constrained, expert-written problems and does not fully capture how science is conducted in practice.
In particular, it does not assess how models generate genuinely novel hypotheses, work with experimental systems, or interact with multimodal data such as video and physical-world experiments.
Looking ahead, OpenAI said progress in scientific reasoning will come from both stronger general-purpose reasoning systems and targeted improvements in scientific capabilities. FrontierScience is one tool among many, and the company plans to expand the benchmark to new domains and pair it with real-world evaluations.
Ultimately, OpenAI said, the most important measure of AI's scientific value will be the new discoveries it helps generate—and FrontierScience is designed to serve as an early indicator of that potential.
Key takeaways:
-OpenAI launched FrontierScience to test AI on expert-level scientific reasoning across physics, chemistry and biology.
-Focus is on reasoning, not recall, including hypothesis generation, testing and cross-disciplinary thinking.
-AI models like GPT-5 are already accelerating research, cutting tasks from weeks to hours.
-Existing science benchmarks are no longer sufficient, prompting the need for harder, expert-written evaluations.
-FrontierScience has two tracks: Olympiad (theoretical reasoning) and Research (real-world, multi-step tasks).
-GPT-5.2 leads performance, scoring 77% on Olympiad tasks and 25% on Research tasks.
Source: LiveMint