🔒 Official announcements only. No rumors, leaks, or paid info products.

Google · 19:00 · Feature Updates · Official Blog

Cost-Effective LLM Serving with Ollama on GKE GPU Sharing

Share GPUs to cut LLM serving costs and simplify ops.

Key Points

1. Auto-scale with GKE Autopilot.
2. Lightweight serving via Ollama.
3. Tenant isolation with vCluster.
4. Maximize resources with GPU sharing.

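Google Cloud described an approach that combines GKE Autopilot, Ollama, vCluster, and GPU sharing to address GPU bottlenecks and serving costs. Autopilot handles scaling, Ollama serves the models, vCluster isolates tenants, and GPU sharing lets several workloads use the same accelerator. The combination is aimed at efficient multi-tenant LLM serving, so developers can deploy AI models more affordably and at larger scale.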
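For a sense of what a tenant's client might look like against such a setup, here is a minimal sketch that calls Ollama's HTTP `/api/generate` endpoint using only the Python standard library. The announcement does not include client code; the helper names `build_generate_request` and `generate`, the base URL, and the model name are illustrative assumptions.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint.

    base_url is an assumption: in a cluster it would be whatever Service
    address fronts the tenant's Ollama pod (Ollama listens on port 11434
    by default).
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    req = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # With "stream": False, Ollama returns one JSON object whose
        # "response" field holds the full completion.
        return json.loads(resp.read())["response"]
```

A call such as `generate("http://localhost:11434", "llama3", "Hello")` would need a reachable Ollama server with that model already pulled. Both values are placeholders, not details from the source.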

📎 Source: Official Blog