GPT-5.5 Instant beats doctors on health answers

OpenAI has unveiled GPT-5.5 Instant, a new free model the company says writes better health answers than physicians. After two months of evaluation by 260 doctors across 60 countries, the model cuts incorrect medical statements by 71% compared to the previous version.

Key Takeaways

GPT-5.5 Instant launched by OpenAI on June 18, free for all ChatGPT users
Over 260 doctors from 60 countries reviewed more than 700,000 responses
71% drop in incorrect health statements over two months of testing

A benchmark calibrated by 260 doctors

On June 18, OpenAI announced GPT-5.5 Instant, a new model rolled out to every free ChatGPT user. The company presents it as specifically strengthened on medical topics, with an evaluation protocol built alongside clinicians.

The setup relies on a network of more than 260 doctors spread across 60 countries. These physicians reviewed over 700,000 responses produced by the model, comparing them with responses written by other doctors and with answers generated by GPT-4o.

The benchmarks used are HealthBench and HealthBench Professional, two evaluation sets developed by OpenAI and validated by clinicians. On these tests, GPT-5.5 Instant reaches 89.9% on complex instruction-following.

Across the five medical evaluation categories used, the model outperforms answers written by human doctors on the full set. This is the first time OpenAI has openly claimed average superiority over practitioners on its own tests.

71% fewer health errors in two months

The core result comes down to a single number. Over two months of measurement, GPT-5.5 Instant produced 71% fewer incorrect health statements than the prior model version. The drop was measured on calibrated clinical prompts, not on open consumer questions.

For OpenAI, this gain carries a direct economic argument. The model follows a strategy of reliable answers at lower cost, in an area where the pricier “Thinking” models had been the reference on technical subjects. With GPT-5.5 Instant free, health accuracy becomes available without a paywall.

The performance gain arrives as OpenAI has to prove product value before going public, at a time when its annual losses now exceed $34 billion. Health is a use case the company has been pushing since Fidji Simo joined.

The model is available immediately to every free ChatGPT user, with usage caps. Paid tiers access it without additional restrictions.

A fast shift for 230 million users

According to figures shared by OpenAI, more than 230 million people use ChatGPT every week for health-related questions. That is the audience that will feel the accuracy gains of GPT-5.5 Instant immediately.

In the short term, this update reshapes the landscape for healthcare professionals. Patients who walk into appointments with a ChatGPT-shaped opinion will now arrive with a statistically more accurate one. The bar rises on the human response expected in return.

In the medium term, the question of scope remains open. OpenAI does not claim a diagnostic use, only an informational response, which keeps the company out of therapeutic-advice territory. The boundary is legal before it is technical, and regulators will be the ones refining it.

Direct competitors (Anthropic, Google, Meta) have not yet announced an equivalent model calibrated on HealthBench. The topic is set to become a central comparison point in the race for consumer-facing models.
