AI Glossary
A glossary of key AI terms to help you understand the latest news and updates.
A
Alignment
Alignment is the effort to adjust a model's outputs to match human intent and safety standards while reducing undesirable behavior. It is foundational for deploying AI systems safely.
Agent
An agent is an AI execution pattern that plans steps toward a goal, uses tools, and iteratively reviews results to complete tasks. It goes beyond single-turn Q&A into multi-step automation.
Agent Memory
Agent memory is a mechanism that stores important information from conversations or work so it can be reused in later tasks. It helps agents stay consistent across sessions and long workflows.
Automatic Speech Recognition (ASR)
Automatic speech recognition is a technology that converts spoken audio into text, used for meeting transcription and subtitle generation. It is a key component for voice-based AI experiences.
Autoregressive Model
An autoregressive model predicts the next token step by step from the preceding context and generates text sequentially. Knowing this helps explain why generation depends strongly on earlier tokens.
Automated Evaluation
Automated evaluation is a method that measures output quality mechanically using rules or other models to speed up iteration cycles. It scales well but must be checked against human judgment to avoid drift.
Attention Mechanism
An attention mechanism is a method that assigns weights to parts of the input to focus on what matters and incorporate the needed information into the output. Understanding it makes model behavior easier to reason about.
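A minimal sketch of scaled dot-product attention, the core computation behind this idea (toy 2-dimensional vectors, not a full multi-head implementation):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: weight each value by how well
    its key matches the query, then take the weighted sum of values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # similarity between queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # mix the values according to the weights

Q = np.array([[1.0, 0.0]])              # one query
K = np.array([[1.0, 0.0], [0.0, 1.0]])  # two keys
V = np.array([[10.0], [20.0]])          # two values
out = attention(Q, K, V)  # closer to 10, since the first key matches the query better
```

The output is a blend of the values, pulled toward whichever key best matches the query; that weighting is what lets the model "focus" on the relevant parts of the input.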
C
Continued Pretraining
Continued pretraining is further pretraining of an already pretrained model on additional data to expand its knowledge and vocabulary. It is often used to strengthen domain or freshness coverage.
Constitutional AI
Constitutional AI is a training approach that uses a predefined set of principles to guide self-critique and self-improvement toward safer, more consistent behavior. It aims to make safety goals more explicit and repeatable.
Context Compression
Context compression is the practice of shrinking long inputs into key points or necessary information to save tokens while maintaining accuracy. It is important for balancing quality, latency, and cost.
Context Window
A context window is the total amount of input and output context a model can consider at once, measured in tokens. Understanding it helps prevent truncation and manage long documents.
Context Length
Context length refers to the token limit of a model's context window and is directly tied to its long-document handling capability. It also influences latency and cost as prompts grow.
Chain-of-Thought (CoT)
Chain-of-thought is a prompting technique that encourages a model to articulate intermediate reasoning steps, not just the final answer, to improve performance on multi-step problems. It can raise accuracy but may increase output length and risk.
Chunking
Chunking is a preprocessing step that splits long documents into appropriately sized pieces for search and RAG. Good chunking balances context preservation with retrieval precision.
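A minimal character-based chunker with overlap, as a sketch of the idea (real pipelines often split on sentences, paragraphs, or tokens instead):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks. The overlap makes
    content near a boundary appear in two neighboring chunks, so a
    sentence cut in half is still retrievable from at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 500, chunk_size=200, overlap=50)  # 4 chunks
```

Larger chunks preserve more context per piece; smaller chunks make retrieval more precise. The overlap parameter is the usual knob for softening the trade-off at chunk boundaries.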
D
Diffusion Model
A diffusion model is a generative model that learns to reconstruct images by progressively removing noise, enabling high-quality image generation. Understanding it helps explain quality, speed, and controllability in modern generators.
Differential Privacy
Differential privacy is an approach that adds noise so the influence of any single individual's data is hard to infer, protecting privacy in training or aggregation. It provides a mathematical framework for privacy guarantees.
DPO (Direct Preference Optimization)
DPO is a training method that directly optimizes a model on preference comparison data without running a reinforcement learning loop. It simplifies preference-based alignment by removing the separate reward model and RL stage that RLHF requires.
Data Leakage
Data leakage is the risk that confidential or personal information is unintentionally exposed externally or can be reconstructed from training or outputs. Preventing it requires layered controls across input, storage, and output.
F
Function Calling
Function calling is a mechanism where a model outputs a function name and arguments to request an external operation, enabling safer and more reliable execution by the application. It is key for connecting LLMs to tools and systems.
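The general pattern can be sketched as follows; the `get_weather` tool and the exact JSON shape of the model output are hypothetical stand-ins, since formats vary by provider:

```python
import json

# Hypothetical tool registry: the application, not the model, runs the code.
def get_weather(city):
    return {"city": city, "temp_c": 21}  # stub standing in for a real weather API

TOOLS = {"get_weather": get_weather}

# In a real system this JSON would come from the model's function-call output.
model_output = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'

call = json.loads(model_output)
fn = TOOLS[call["name"]]            # look up the tool the model requested
result = fn(**call["arguments"])    # execute it with the model's arguments
# `result` is then sent back to the model so it can compose the final answer.
```

Because the application owns the dispatch table and the execution, it can validate arguments and refuse unknown tool names before anything runs, which is what makes this pattern safer than letting a model emit arbitrary code.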
Few-shot Prompting
Few-shot prompting is a technique that includes a small number of input–output examples to teach the model the desired format and decision criteria. It often improves consistency without fine-tuning.
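A sketch of how a few-shot prompt might be assembled for a sentiment-labeling task (the examples and wording are illustrative, not a fixed format):

```python
# Few-shot prompt: a couple of worked examples teach the model the
# expected format and labels before the real query.
examples = [
    ("great product, works perfectly", "positive"),
    ("broke after one day", "negative"),
]

def build_prompt(examples, query):
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The prompt ends mid-pattern so the model completes the missing label.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(examples, "arrived late but works fine")
```

The trailing `Sentiment:` is the key trick: the model continues the established pattern, so its next tokens are very likely to be one of the demonstrated labels.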
Fine-tuning
Fine-tuning is a method that further trains a pretrained model on task- or domain-specific data to optimize its behavior for a particular use. It helps tailor outputs beyond what prompts alone can achieve.
G
Guardrails
Guardrails are a set of safety measures that detect prohibited content and control outputs to prevent dangerous or inappropriate responses. They are a core concept for safe AI deployment.
Grounding
Grounding is the practice of tying an answer to specific evidence so it is based on references rather than guesses. It is commonly achieved via retrieval, citations, and tool results.
H
Hallucination
A hallucination is a phenomenon where a model generates plausible-sounding content that is factually incorrect. Understanding it helps you design mitigations like retrieval, citations, and refusals.
Human Evaluation
Human evaluation is an assessment method where people read model outputs and judge quality using criteria such as accuracy and usefulness. It captures aspects that automated metrics often miss.
I
Image Generation
Image generation is a technique that creates new images from text instructions or other conditions, often used for design drafts and rapid prototyping. It is important for building creative and visual user experiences.
Image Understanding
Image understanding is an AI capability that analyzes images to recognize objects, text, and relationships, then uses that understanding for description or classification. It is essential for automating workflows involving visual data.
Instruction Tuning
Instruction tuning is additional training on pairs of instructions and desired answers to make a model follow instructions more reliably. It improves usability for general users and real tasks.
Input Token
Input tokens are the tokens contained in the prompt and context you send to a model, and they determine limits and part of the cost. They are a basic unit for estimating API usage fees.
L
Large Language Model (LLM)
A large language model is an AI model trained on massive text to learn relationships between words and generate outputs such as dialogue and summaries. Understanding how it works helps you use it appropriately.
Latency
Latency is the delay from sending a request to receiving a response, directly affecting perceived speed and operating cost. It is a critical metric for production user experience.
LoRA (Low-Rank Adaptation)
LoRA is a parameter-efficient adaptation method that learns only small low-rank matrices instead of updating all model weights. It enables fine-tuning with much less compute and memory.
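The core idea can be sketched in a few lines: the pretrained weight stays frozen and only two small matrices are trained (toy sizes, NumPy only; not a training loop):

```python
import numpy as np

# LoRA: freeze the pretrained weight W and learn only a low-rank update B @ A.
d, r = 8, 2                         # toy hidden size 8, adapter rank 2
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))         # frozen pretrained weight (d*d values)
A = rng.normal(size=(r, d)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                # trainable, zero init: update starts at zero
alpha = 16                          # common scaling hyperparameter

def adapted_forward(x):
    # Only A and B (2 * d * r values) would be trained, not all d * d of W.
    return x @ (W + (alpha / r) * (B @ A)).T

x = np.ones(d)
base_out = x @ W.T
adapted_out = adapted_forward(x)  # identical to base_out while B is still zero
```

At realistic sizes the savings are dramatic: for d = 4096 and r = 8, the adapter trains about 65 thousand values per matrix pair instead of roughly 16.8 million.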
M
Mixture of Experts (MoE)
A mixture of experts is a model design that prepares multiple specialized sub-models and computes only a selected subset depending on the input. It can scale capacity while controlling inference cost.
Multimodal
Multimodal refers to an AI capability to handle multiple types of information—such as text, images, and audio—within a single system. It enables richer inputs and outputs than text-only models.
Model Distillation
Model distillation is a compression technique that trains a smaller student model using outputs from a larger teacher model to transfer knowledge. It helps reduce latency and cost while keeping quality.
P
Pretraining
Pretraining is a large-scale training stage where a model learns general language patterns and knowledge from massive unlabeled text. It forms the foundation for later adaptation methods.
Pay-as-you-go Pricing
Pay-as-you-go pricing is a billing model where costs increase or decrease with usage, often based on tokens or processing volume for AI services. It is fundamental for estimating AI operating expenses.
Planning
Planning is an agent step that decomposes a goal into actionable sub-steps and organizes execution order and required information. It helps reduce omissions and wasted work.
Prompt
A prompt is the input text that communicates your goal and constraints to the model and strongly influences output quality. Understanding prompts helps you control results more reliably.
Prompt Injection
Prompt injection is an attack that uses malicious instructions to override system rules and trigger unintended actions or information leakage. It is a major risk for tool-using or retrieval-augmented systems.
Prompt Template
A prompt template is a structured prompt pattern with placeholders that can be filled with variables to produce stable inputs and outputs. It is commonly used in production systems to reduce variability.
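A minimal sketch using Python string formatting; the template text, product name, and variables below are illustrative:

```python
# A prompt template keeps production prompts uniform: the structure is fixed
# and only the variable parts change between requests.
TEMPLATE = (
    "You are a support assistant for {product}.\n"
    "Answer the question below in at most {max_sentences} sentences.\n\n"
    "Question: {question}"
)

prompt = TEMPLATE.format(
    product="ExampleApp",  # hypothetical product name
    max_sentences=3,
    question="How do I reset my password?",
)
```

Keeping the template in one place also makes prompts reviewable and versionable, so a wording change can be tested once instead of hunting for ad-hoc strings across a codebase.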
Pricing Tier
A pricing tier is a plan category with predefined limits, unit prices, and features, used to choose the right plan for your needs. It is a core concept for cost planning.
R
RLAIF (Reinforcement Learning from AI Feedback)
RLAIF is a technique that uses AI-based judgments to generate preference data and then applies reinforcement learning to tune a model. It reduces reliance on human raters but requires careful oversight of evaluator bias.
RLHF (Reinforcement Learning from Human Feedback)
RLHF is a technique that builds a reward model from human preference judgments and then uses reinforcement learning to align a model's behavior. It is widely used to improve helpfulness and safety.
Reinforcement Learning (RL)
Reinforcement learning is a training method that learns a policy to maximize rewards obtained from the outcomes of actions. In LLMs, it can be used to steer behavior toward preferred responses.
Refusal
A refusal is a model behavior that declines requests it cannot comply with for safety or policy reasons, typically with an explanation. It is important for safe and predictable AI operation.
Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation is a technique that retrieves supporting passages via external search and uses them to generate answers for higher accuracy. It helps reduce hallucinations and incorporate up-to-date or private knowledge.
Reward Model
A reward model is a model that scores the quality of responses numerically and is used as an objective signal in methods like RLHF. It helps translate preferences into an optimization target.
Role Prompting
Role prompting is a technique that assigns a role or persona (e.g., "You are a reviewer") to encourage answers in a desired tone and perspective. It can improve consistency and relevance.
Retriever
A retriever is a component that searches for documents or chunks relevant to a question and passes them to an LLM. It is a core part of RAG pipelines.
Reflection
Reflection is a review step where the model or system checks generated results, identifies errors or gaps, and revises the output. It improves reliability by adding a quality-control loop.
Re-ranking
Re-ranking is a step that reorders retrieved candidate documents using another model or scoring function to keep the most relevant ones. It improves retrieval quality and controls context size.
Rubric Evaluation
Rubric evaluation is a method that defines evaluation dimensions and scoring criteria in a rubric table so outputs are judged consistently. It improves alignment between evaluators and supports repeatable measurement.
Rate Limit
A rate limit is a cap on how many requests or tokens you can send within a time window, set for stability and fairness. It affects throughput planning and practical cost estimation.
Red Teaming
Red teaming is adversarial testing that simulates misuse and failure modes to uncover weaknesses in a model or system before deployment. It is a key practice for improving AI safety.
S
Structured Output
Structured output is an output style that forces responses into a predefined structure so downstream systems can process them reliably. It is important for integrating LLMs into applications and workflows.
System Prompt
A system prompt is a high-priority instruction that sets overall policies and rules for a conversation and typically takes precedence over user messages. It helps maintain consistent behavior and safety constraints.
Source Attribution (Citations)
Source attribution is a mechanism that explicitly shows the documents or passages used as evidence so readers can verify the answer. It is important for trust, auditing, and safer RAG experiences.
Self-Consistency
Self-consistency is an inference method that runs multiple generations for the same question and aggregates them (e.g., by majority vote) to reduce random errors. It can improve reliability at the cost of extra compute.
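A sketch of the aggregation step, assuming the sampled answers have already been generated by independent runs of the model:

```python
from collections import Counter

def self_consistency(samples):
    """Aggregate several independent generations by majority vote,
    returning the winning answer and the fraction of runs that agreed."""
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)

# Five sampled answers to the same question; one run made an arithmetic slip.
samples = ["42", "42", "41", "42", "42"]
answer, agreement = self_consistency(samples)  # ("42", 0.8)
```

The agreement ratio is a useful side product: low agreement across samples often signals a question the model finds genuinely hard, which can trigger a fallback such as human review.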
Stop Sequence
A stop sequence is a termination condition that stops generation when a specified string appears in the output. It is useful for controlling boundaries in formatted responses.
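A client-side sketch of the behavior (providers typically apply this server-side when a stop parameter is set, cutting generation at that point and saving the extra tokens):

```python
def apply_stop(text, stop_sequences):
    """Truncate generated text at the first occurrence of any stop sequence."""
    cut = len(text)
    for s in stop_sequences:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)  # keep only text before the earliest stop match
    return text[:cut]

out = apply_stop("Answer: 42\nQuestion: next one...", ["\nQuestion:"])
# out == "Answer: 42" — generation is cut before the next section begins
```

This is how formatted outputs are kept to a single section: the stop sequence marks where the next field would begin, so the model never runs on past the part you asked for.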
Streaming Generation
Streaming generation is an output mode that returns tokens progressively instead of sending the full response at once. It improves perceived responsiveness and user experience for long outputs.
Safety Classifier
A safety classifier is a detection model that determines whether inputs or outputs violate safety criteria and serves as a core part of guardrails. It helps reduce incidents by filtering risky content.
T
Text-to-Speech (TTS)
Text-to-speech is a technology that generates natural-sounding speech from text, used for read-aloud features and conversational assistants. It enables voice output experiences beyond plain text.
Temperature
Temperature is a setting that adjusts the randomness of the next-token probability distribution, changing the balance between creativity and stability. Lower values are more deterministic, higher values more diverse.
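The effect can be sketched by applying temperature to a toy softmax over three candidate tokens:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature before softmax.
    Low temperature sharpens the distribution; high temperature flattens it."""
    z = np.asarray(logits) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform: more diverse sampling
```

At temperature 0.2 the top token captures more than 99% of the probability mass, while at 2.0 the three tokens are much closer to equally likely, which is exactly the creativity-versus-stability trade-off described above.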
Tool Use
Tool use is a setup where a model calls external tools (e.g., search or calculators) and incorporates the results to improve answer accuracy. It is common in practical implementations to reduce hallucinations and mistakes.
Transfer Learning
Transfer learning is the idea of reusing knowledge learned from one dataset or task to perform well on another task with less data. It explains why pretrained models can adapt quickly.
Token
A token is a small unit of text used by models for processing, and it is also used to estimate limits and billing. Understanding tokens is essential for controlling cost and context length.
Token-Based Pricing
Token-based pricing is a pay-as-you-go billing model where fees are determined by the number of input and output tokens. Managing both input and output lengths helps reduce cost.
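A back-of-the-envelope cost estimate; the per-million-token prices below are hypothetical and only illustrate the arithmetic, so check your provider's actual rate card:

```python
# Hypothetical prices in USD per 1M tokens, for illustration only.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_cost(input_tokens, output_tokens):
    """Total cost = (input tokens * input rate) + (output tokens * output rate)."""
    return (input_tokens / 1_000_000 * PRICE_PER_MTOK["input"]
            + output_tokens / 1_000_000 * PRICE_PER_MTOK["output"])

# 2,000 input tokens and 500 output tokens per request, over 10,000 requests:
cost = estimate_cost(2_000 * 10_000, 500 * 10_000)  # 20M input, 5M output tokens
```

Note that output tokens often cost several times more than input tokens, which is why capping response length (e.g., "answer in 3 sentences") is one of the cheapest cost optimizations available.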
Top-p Sampling (Nucleus Sampling)
Top-p sampling is a decoding method that keeps the smallest set of high-probability tokens whose cumulative probability reaches p, then samples the next token from that set. It helps control diversity without relying only on temperature.
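A sketch of the filtering step on a toy four-token distribution:

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches p, zero out the rest, and renormalize."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]          # token indices, most likely first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # how many tokens reach mass p
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = [0.5, 0.3, 0.15, 0.05]
filtered = top_p_filter(probs, p=0.9)  # drops the 0.05 tail token
```

Because the cutoff adapts to the shape of the distribution, a confident prediction keeps very few tokens while an uncertain one keeps many, which is the practical difference from a fixed top-k cutoff.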
Transformer
A Transformer is a neural network architecture that captures important relationships in text using attention and can be trained efficiently with parallel computation. Knowing its mechanism helps you understand modern LLMs.
Toxicity
Toxicity is the degree to which content is harmful—such as discrimination, violence, or harassment—and is targeted by safety evaluation and filtering. It is a key metric for content moderation and guardrails.
V
Vision-Language Model
A vision-language model is a model that jointly understands images and text to describe image content and answer questions about it. Knowing how it works helps you choose the right model for visual tasks.
Vector Search
Vector search is a retrieval method that uses distances between embeddings to find semantically similar documents or data. It enables meaning-based search beyond keyword matching.
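A minimal sketch using cosine similarity over toy 3-dimensional vectors; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_top_k(query, doc_vectors, k=2):
    """Rank documents by cosine similarity between embedding vectors."""
    q = query / np.linalg.norm(query)
    D = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    scores = D @ q                      # cosine similarity of each doc to the query
    return np.argsort(scores)[::-1][:k]  # indices of the best matches

# Toy "embeddings": direction encodes topic, so nearby vectors mean similar content.
docs = np.array([
    [1.0, 0.0, 0.0],   # about topic A
    [0.9, 0.1, 0.0],   # also about topic A, slightly different
    [0.0, 0.0, 1.0],   # unrelated topic
])
query = np.array([1.0, 0.05, 0.0])
top = cosine_top_k(query, docs, k=2)  # both topic-A documents rank ahead of the unrelated one
```

Because similarity is measured in embedding space rather than by shared words, a query and a document can match even when they use entirely different vocabulary for the same idea.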
Vector Database
A vector database is a data store optimized to save embeddings and quickly retrieve nearby vectors for similarity search. It is a common building block for RAG and semantic search systems.