# Evaluating LLMs' divergent thinking capabilities for scientific idea generation with minimal context

_Friday, June 26, 2026 at 8:25 PM EDT · science · Latest · Tier 2 — Notable_

A new benchmark called LiveIdeaBench evaluates how well large language models generate scientific ideas from single-keyword prompts. The benchmark measures divergent thinking and rates generated ideas on five dimensions: originality, feasibility, fluency, flexibility and clarity. It draws on Guilford's creativity theory and was applied to more than 40 models across 1,180 keywords in 22 scientific domains.
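As a rough illustration of what a five-dimension rating could look like in code, the Python sketch below stores one score per dimension for a generated idea and averages scores per model. The dimension names come from the benchmark description; everything else (the `IdeaRating` class, the 0-10 scale, the equal-weight mean, the `model_profile` helper) is an assumption made for illustration and is not the benchmark's published scoring procedure.

```python
from dataclasses import dataclass
from statistics import mean

# The five dimensions named in the benchmark description.
DIMENSIONS = ("originality", "feasibility", "fluency", "flexibility", "clarity")


@dataclass(frozen=True)
class IdeaRating:
    """One rated idea. The 0-10 scale is a hypothetical choice."""
    keyword: str
    model: str
    scores: dict[str, float]

    def overall(self) -> float:
        # Assumption: equal-weight mean over the five dimensions.
        missing = [d for d in DIMENSIONS if d not in self.scores]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")
        return mean(self.scores[d] for d in DIMENSIONS)


def model_profile(ratings: list[IdeaRating], model: str) -> dict[str, float]:
    """Average each dimension over all rated ideas from one model."""
    own = [r for r in ratings if r.model == model]
    if not own:
        raise ValueError(f"no ratings for model {model!r}")
    return {d: mean(r.scores[d] for r in own) for d in DIMENSIONS}
```

Keeping the per-dimension profile alongside the overall mean makes it possible to see, for example, whether a model trades feasibility for originality, which a single aggregate number would hide.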

Scores on standard general-intelligence metrics aligned poorly with performance on the benchmark. For example, QwQ-32B-preview generated ideas comparable to those from claude-3.7-sonnet despite the gap between the two models' general intelligence scores. Existing benchmarks for language models on scientific tasks have relied primarily on rich contextual inputs rather than minimal prompts.

The results point to the need for specialized evaluation methods for scientific idea generation. The authors note that enhancing these capabilities may require training strategies different from those used to improve general problem-solving abilities. The work references prior research on human creativity that separates divergent thinking from convergent thinking.

## Sources

- [Nature Communications](https://www.nature.com/articles/s41467-026-70245-1)

---
Canonical: https://techandbusiness.org/newswire/WMYow9Ig064KslncDONOta
Retrieved: 2026-06-27T04:59:01.826Z
Publisher: Tech & Business (techandbusiness.org)
