
Official sources only. Rumors, leaks, and get-rich schemes are excluded.


AI Briefing / OpenAI / Feature Updates / 17:10

AI summarized from verified sources

AI agents can now handle ambiguous judgments in biological data analysis

Researchers can delegate biological data analysis workflows to AI more easily

SOURCE CHECK

1 source

VERIFIED

Sources

Primary / openai.com

Official Blog

Key Points

1. 129 synthetic problems replicate real ambiguity
2. GPT-5.6 Sol reaches 31.5% (Pro mode)
3. 10 questions open-sourced on Hugging Face

OpenAI introduced GeneBench-Pro, a research-level benchmark of 129 problems across 10 domains, including genomics and clinical genetics. It tests whether AI agents can choose an analysis path and make sound judgments from messy data. GPT-5.6 Sol reached up to 31.5% accuracy in Pro mode, which could help with tasks that take human experts 20-40 hours.

What happened

OpenAI announced GeneBench-Pro on June 30. The benchmark measures high-level judgment in computational biology research across 129 problems that span the workflow from data exploration to final decisions.

Impact

AI support for scientific research is advancing, which could reduce researcher workload. Fully replacing human experts remains difficult, but partial automation can save time and cost.