The benchmark uses 750 expert-authored tasks to assess scientific reasoning, data interpretation and research decisions across life science workflows.

LifeSciBench tests how AI systems perform across applied life science research tasks, including analysis, experimental design and evidence handling.

OpenAI has introduced LifeSciBench, a benchmark designed to test whether AI systems can handle research tasks used in drug discovery and life sciences, rather than only answer structured biology quest

OpenAI puts AI research capabilities to the test with LifeSciBench
Emma Thompson
