Microsoft claims its AI tool can diagnose complex cases better than doctors

Microsoft claims it has developed an AI tool that, in a recent experiment, diagnosed complex medical cases with four times the accuracy of human doctors.
The technology, called MAI Diagnostic Orchestrator (MAI-DxO), works by orchestrating multiple advanced AI models, including ChatGPT, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, and xAI’s Grok. The system mimics a team of doctors working together, sharing opinions and debating symptoms before reaching a diagnosis.
To test the system, researchers used 304 real-life case studies published in the New England Journal of Medicine. These were turned into a series of patient scenarios in which the AI had to work out the illness just as a doctor would: by analysing symptoms, ordering tests, and narrowing down the possibilities step by step.
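To give a rough sense of the kind of workflow described above, the following is a minimal Python sketch of a panel-of-models diagnosis loop. Microsoft has not published MAI-DxO’s code, so the stub models, the majority-vote rule, and every name and function below are assumptions made purely for illustration, not the actual system.

```python
# Hypothetical sketch of a "panel of doctors" orchestration loop.
# Microsoft has not released MAI-DxO's implementation; the stub models,
# the majority-vote aggregation, and all names here are assumptions.

from dataclasses import dataclass, field

@dataclass
class CaseState:
    """Information revealed so far for one sequential-diagnosis case."""
    presenting_symptoms: str
    tests_ordered: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)  # results of ordered tests

def stub_model_opinion(model_name: str, state: CaseState) -> dict:
    """Stand-in for a call to a model such as ChatGPT, Gemini, Claude, Llama or Grok.
    Returns either a test to order or a proposed diagnosis (placeholder logic only)."""
    if len(state.findings) < 2:
        return {"model": model_name, "action": "order_test", "test": "complete blood count"}
    return {"model": model_name, "action": "diagnose", "diagnosis": "example diagnosis"}

def run_panel(state: CaseState, panel: list[str], max_rounds: int = 5) -> str:
    """Debate loop: each round, every model votes; the majority action wins.
    Ordered tests reveal new (stubbed) findings; a majority diagnosis ends the case."""
    for _ in range(max_rounds):
        opinions = [stub_model_opinion(m, state) for m in panel]
        diagnoses = [o["diagnosis"] for o in opinions if o["action"] == "diagnose"]
        if len(diagnoses) > len(panel) // 2:
            return max(set(diagnoses), key=diagnoses.count)  # consensus diagnosis
        # Otherwise order the most requested test and reveal its result.
        tests = [o["test"] for o in opinions if o["action"] == "order_test"]
        chosen = max(set(tests), key=tests.count)
        state.tests_ordered.append(chosen)
        state.findings.append(f"result of {chosen}")
    return "no consensus reached"

if __name__ == "__main__":
    case = CaseState(presenting_symptoms="fever, weight loss, night sweats")
    print(run_panel(case, ["chatgpt", "gemini", "claude", "llama", "grok"]))
```

In the real benchmark, the cost figures reported below would come from pricing each test the panel chooses to order, which is why a system that avoids unnecessary tests can be both more accurate and cheaper.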
The results were striking: the AI system correctly diagnosed 80% of the cases, compared with just 20% for a group of human doctors. And it wasn’t just about accuracy: the AI also lowered the cost of diagnosis by about 20% by choosing more affordable tests and avoiding unnecessary procedures.
“This is a genuine step toward medical superintelligence,” said Mustafa Suleyman, CEO of Microsoft AI, highlighting the tool’s potential to transform healthcare decision-making.
According to the company, as demand for healthcare continues to grow, costs are rising at an unsustainable pace, and billions of people face multiple barriers to better health, including inaccurate and delayed diagnoses.
While AI has already been used to help doctors interpret medical scans, this latest development suggests it could take on broader diagnostic roles, possibly becoming a first point of contact for patients in the future.
Experts involved in the project say this could help reduce healthcare costs and speed up access to care. “Our model performs incredibly well—both getting to the diagnosis and doing so cost-effectively,” said Dominic King, a Microsoft vice president.
However, like all AI systems, these tools must be carefully monitored. There are concerns about whether they work equally well across different populations, since much of the training data may be skewed toward certain groups.
“This research is just the first step on a long, exciting journey. We’re excited to keep testing and learning with our healthcare partners in pursuit of better, more accessible care for people everywhere,” Suleyman said.
