Nature Medicine, Published online: 17 June 2026; doi:10.1038/s41591-026-04457-9
Specialized clinical AI tools are entering medical practice with little independent testing. In a head-to-head evaluation on two public benchmarks and on real questions from physicians, three general-purpose frontier large language models outperformed two leading clinical AI tools, which in turn performed no better than Google Search's AI Overview.
