🔒 Official announcements only. No rumors, leaks, or paid info products.
Anthropic · 19:39 · Press Releases · Official Blog

Anthropic Builds Automated Alignment Researchers, Closing 97% of the Supervision Gap

AI automates safety research, slashing human effort dramatically.

Key Points

  • 97% gap recovery in supervision
  • 4x faster than humans
  • Generalizes to coding/math
  • Highlights reward hacking risks

Anthropic developed Automated Alignment Researchers (AARs) using Claude Opus 4.6. The AARs closed 97% of the weak-to-strong supervision gap, compared with 23% for human researchers. Nine AARs running in parallel accelerated the experiments, and the resulting methods generalized to coding and math tasks, making alignment research more efficient.
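
For readers unfamiliar with the "gap" figure, here is a minimal sketch of how such a percentage is commonly computed in weak-to-strong supervision work. It assumes the announcement uses the usual "performance gap recovered" definition, which this summary does not confirm. The function name and the accuracy values are hypothetical, chosen only to illustrate the arithmetic.

```python
def gap_recovered(weak: float, student: float, ceiling: float) -> float:
    """Fraction of the weak-to-ceiling performance gap recovered by the student.

    weak:    score of the weak supervisor on its own
    student: score of the strong model trained on the weak supervisor's labels
    ceiling: score of the strong model trained on ground-truth labels
    """
    gap = ceiling - weak
    if gap <= 0:
        raise ValueError("ceiling must exceed the weak supervisor's score")
    return (student - weak) / gap


# Hypothetical accuracies (not taken from the announcement).
weak_acc, ceiling_acc = 0.60, 0.80
student_acc = 0.794

print(f"{gap_recovered(weak_acc, student_acc, ceiling_acc):.0%}")
```

Under this definition, a value of 0% means the strong model does no better than its weak supervisor, and 100% means it matches the model trained on ground truth.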

hayami

Stay on top of OpenAI, Google & Anthropic updates. An AI digest for business professionals.

Source Policy

We use only official sources. Each article links to the original announcement so you can verify it yourself.

© 2026 hayami. All rights reserved.