Official sources only. Rumors, leaks, and get-rich schemes are excluded.
AI Briefing · OpenAI · Feature Updates · 17:10

AI summarized from verified sources

AI agents can now handle ambiguous judgments in biological data analysis

Researchers can delegate biological data analysis workflows to AI more easily

Source check: 1 source, verified

Key Points

  • 129 synthetic problems replicate real ambiguity
  • GPT-5.6 Sol reaches 31.5% (Pro mode)
  • 10 questions open-sourced on Hugging Face

OpenAI introduced GeneBench-Pro, a research-level benchmark of 129 problems across 10 domains, including genomics and clinical genetics. It tests whether AI agents can choose an analysis path and make sound judgments from messy data. GPT-5.6 Sol reached up to 31.5% accuracy (Pro mode) on tasks that take human experts 20-40 hours.

What happened

OpenAI announced GeneBench-Pro on June 30. The benchmark measures high-level judgment in computational biology research: its 129 problems cover the full workflow, from data exploration to final decisions.

Impact

AI support for scientific research is advancing and could reduce researchers' workload. Fully replacing human experts remains difficult, but partial automation can save time and cost.

hayami

Stay on top of OpenAI, Google & Anthropic updates. An AI digest for business professionals.

Source Policy

We use only official sources. Each article links to the original announcement so you can verify it yourself.

© 2026 hayami. All rights reserved.