AI Glossary
A glossary of key AI terms to help you understand the latest news and updates.
A
Alignment
Alignment is the effort to adjust a model's outputs to match human intent and safety standards while reducing undesirable behavior. It is foundational for deploying AI systems safely.
Agent
An agent is an AI execution pattern that plans steps toward a goal, uses tools, and iteratively reviews results to complete tasks. It goes beyond single-turn Q&A into multi-step automation.
Agent Memory
Agent memory is a mechanism that stores important information from conversations or work so it can be reused in later tasks. It helps agents stay consistent across sessions and long workflows.
Automatic Speech Recognition (ASR)
Automatic speech recognition is a technology that converts spoken audio into text, used for meeting transcription and subtitle generation. It is a key component for voice-based AI experiences.
Autoregressive Model
An autoregressive model predicts the next token step by step from the preceding context and generates text sequentially. Knowing this helps explain why generation depends strongly on earlier tokens.
Automated Evaluation
Automated evaluation is a method that measures output quality mechanically using rules or other models to speed up iteration cycles. It scales well but must be checked against human judgment to avoid drift.
Attention Mechanism
An attention mechanism is a method that assigns weights to parts of the input to focus on what matters and incorporate the needed information into the output. Understanding it makes model behavior easier to reason about.
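A minimal sketch of scaled dot-product attention, the core computation behind this idea (toy 2-dimensional vectors, not a full multi-head implementation):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: weight each value by how well
    its key matches the query, then take the weighted sum of values."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # similarity between queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # mix the values according to the weights

Q = np.array([[1.0, 0.0]])              # one query
K = np.array([[1.0, 0.0], [0.0, 1.0]])  # two keys
V = np.array([[10.0], [20.0]])          # two values
out = attention(Q, K, V)  # closer to 10, since the first key matches the query better
```

The output is a blend of the values, pulled toward whichever key best matches the query; that weighting is what lets the model "focus" on the relevant parts of the input.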
C
Continued Pretraining
Continued pretraining is further pretraining of an already pretrained model on additional data to expand its knowledge and vocabulary. It is often used to strengthen domain or freshness coverage.
Constitutional AI
Constitutional AI is a training approach that uses a predefined set of principles to guide self-critique and self-improvement toward safer, more consistent behavior. It aims to make safety goals more explicit and repeatable.
Context Compression
Context compression is the practice of shrinking long inputs into key points or necessary information to save tokens while maintaining accuracy. It is important for balancing quality, latency, and cost.
Context Window
A context window is the total amount of input and output context a model can consider at once, measured in tokens. Understanding it helps prevent truncation and manage long documents.
Context Length
Context length refers to the token limit of a model's context window and is directly tied to its long-document handling capability. It also influences latency and cost as prompts grow.
Chain-of-Thought (CoT)
Chain-of-thought is a prompting technique that encourages a model to articulate intermediate reasoning steps, not just the final answer, to improve performance on multi-step problems. It can raise accuracy but may increase output length and risk.
Chunking
Chunking is a preprocessing step that splits long documents into appropriately sized pieces for search and RAG. Good chunking balances context preservation with retrieval precision.
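A minimal character-based chunker with overlap, as a sketch of the idea (real pipelines often split on sentences, paragraphs, or tokens instead):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks. The overlap makes
    content near a boundary appear in two neighboring chunks, so a
    sentence cut in half is still retrievable from at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 500, chunk_size=200, overlap=50)  # 4 chunks
```

Larger chunks preserve more context per piece; smaller chunks make retrieval more precise. The overlap parameter is the usual knob for softening the trade-off at chunk boundaries.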
D
Diffusion Model
A diffusion model is a generative model that learns to reconstruct images by progressively removing noise, enabling high-quality image generation. Understanding it helps explain quality, speed, and controllability in modern generators.
Differential Privacy
Differential privacy is an approach that adds noise so the influence of any single individual's data is hard to infer, protecting privacy in training or aggregation. It provides a mathematical framework for privacy guarantees.
DPO (Direct Preference Optimization)
DPO is a training method that directly optimizes a model on preference comparison data without running a reinforcement learning loop. It simplifies preference-based alignment by removing the separate reward model and RL stage that RLHF requires.
Data Leakage
Data leakage is the risk that confidential or personal information is unintentionally exposed externally or can be reconstructed from training or outputs. Preventing it requires layered controls across input, storage, and output.
F
Function Calling
Function calling is a mechanism where a model outputs a function name and arguments to request an external operation, enabling safer and more reliable execution by the application. It is key for connecting LLMs to tools and systems.
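The general pattern can be sketched as follows; the `get_weather` tool and the exact JSON shape of the model output are hypothetical stand-ins, since formats vary by provider:

```python
import json

# Hypothetical tool registry: the application, not the model, runs the code.
def get_weather(city):
    return {"city": city, "temp_c": 21}  # stub standing in for a real weather API

TOOLS = {"get_weather": get_weather}

# In a real system this JSON would come from the model's function-call output.
model_output = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'

call = json.loads(model_output)
fn = TOOLS[call["name"]]            # look up the tool the model requested
result = fn(**call["arguments"])    # execute it with the model's arguments
# `result` is then sent back to the model so it can compose the final answer.
```

Because the application owns the dispatch table and the execution, it can validate arguments and refuse unknown tool names before anything runs, which is what makes this pattern safer than letting a model emit arbitrary code.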
Few-shot Prompting
Few-shot prompting is a technique that includes a small number of input–output examples to teach the model the desired format and decision criteria. It often improves consistency without fine-tuning.
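A sketch of how a few-shot prompt might be assembled for a sentiment-labeling task (the examples and wording are illustrative, not a fixed format):

```python
# Few-shot prompt: a couple of worked examples teach the model the
# expected format and labels before the real query.
examples = [
    ("great product, works perfectly", "positive"),
    ("broke after one day", "negative"),
]

def build_prompt(examples, query):
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The prompt ends mid-pattern so the model completes the missing label.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(examples, "arrived late but works fine")
```

The trailing `Sentiment:` is the key trick: the model continues the established pattern, so its next tokens are very likely to be one of the demonstrated labels.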
Fine-tuning
Fine-tuning is a method that further trains a pretrained model on task- or domain-specific data to optimize its behavior for a particular use. It helps tailor outputs beyond what prompts alone can achieve.
G
Guardrails
Guardrails are a set of safety measures that detect prohibited content and control outputs to prevent dangerous or inappropriate responses. They are a core concept for safe AI deployment.
Grounding
Grounding is the practice of tying an answer to specific evidence so it is based on references rather than guesses. It is commonly achieved via retrieval, citations, and tool results.
H
Hallucination
A hallucination is a phenomenon where a model generates plausible-sounding content that is factually incorrect. Understanding it helps you design mitigations like retrieval, citations, and refusals.
Human Evaluation
Human evaluation is an assessment method where people read model outputs and judge quality using criteria such as accuracy and usefulness. It captures aspects that automated metrics often miss.
I
Image Generation
Image generation is a technique that creates new images from text instructions or other conditions, often used for design drafts and rapid prototyping. It is important for building creative and visual user experiences.
Image Understanding
Image understanding is an AI capability that analyzes images to recognize objects, text, and relationships, then uses that understanding for description or classification. It is essential for automating workflows involving visual data.
Instruction Tuning
Instruction tuning is additional training on pairs of instructions and desired answers to make a model follow instructions more reliably. It improves usability for general users and real tasks.
Input Token
Input tokens are the tokens contained in the prompt and context you send to a model, and they determine limits and part of the cost. They are a basic unit for estimating API usage fees.
L
Large Language Model (LLM)
A large language model is an AI model trained on massive text to learn relationships between words and generate outputs such as dialogue and summaries. Understanding how it works helps you use it appropriately.
Latency
Latency is the delay from sending a request to receiving a response, directly affecting perceived speed and operating cost. It is a critical metric for production user experience.
LoRA (Low-Rank Adaptation)
LoRA is a parameter-efficient adaptation method that learns only small low-rank matrices instead of updating all model weights. It enables fine-tuning with much less compute and memory.
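The core idea can be sketched in a few lines: the pretrained weight stays frozen and only two small matrices are trained (toy sizes, NumPy only; not a training loop):

```python
import numpy as np

# LoRA: freeze the pretrained weight W and learn only a low-rank update B @ A.
d, r = 8, 2                         # toy hidden size 8, adapter rank 2
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))         # frozen pretrained weight (d*d values)
A = rng.normal(size=(r, d)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                # trainable, zero init: update starts at zero
alpha = 16                          # common scaling hyperparameter

def adapted_forward(x):
    # Only A and B (2 * d * r values) would be trained, not all d * d of W.
    return x @ (W + (alpha / r) * (B @ A)).T

x = np.ones(d)
base_out = x @ W.T
adapted_out = adapted_forward(x)  # identical to base_out while B is still zero
```

At realistic sizes the savings are dramatic: for d = 4096 and r = 8, the adapter trains about 65 thousand values per matrix pair instead of roughly 16.8 million.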
M
Mixture of Experts (MoE)
A mixture of experts is a model design that prepares multiple specialized sub-models and computes only a selected subset depending on the input. It can scale capacity while controlling inference cost.
Multimodal
Multimodal refers to an AI capability to handle multiple types of information—such as text, images, and audio—within a single system. It enables richer inputs and outputs than text-only models.
Model Distillation
Model distillation is a compression technique that trains a smaller student model using outputs from a larger teacher model to transfer knowledge. It helps reduce latency and cost while keeping quality.
P
Pretraining
Pretraining is a large-scale training stage where a model learns general language patterns and knowledge from massive unlabeled text. It forms the foundation for later adaptation methods.
Pay-as-you-go Pricing
Pay-as-you-go pricing is a billing model where costs increase or decrease with usage, often based on tokens or processing volume for AI services. It is fundamental for estimating AI operating expenses.
Planning
Planning is an agent step that decomposes a goal into actionable sub-steps and organizes execution order and required information. It helps reduce omissions and wasted work.
Prompt
A prompt is the input text that communicates your goal and constraints to the model and strongly influences output quality. Understanding prompts helps you control results more reliably.
Prompt Injection
Prompt injection is an attack that uses malicious instructions to override system rules and trigger unintended actions or information leakage. It is a major risk for tool-using or retrieval-augmented systems.
Prompt Template
A prompt template is a structured prompt pattern with placeholders that can be filled with variables to produce stable inputs and outputs. It is commonly used in production systems to reduce variability.
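A minimal sketch using Python string formatting; the template text, product name, and variables below are illustrative:

```python
# A prompt template keeps production prompts uniform: the structure is fixed
# and only the variable parts change between requests.
TEMPLATE = (
    "You are a support assistant for {product}.\n"
    "Answer the question below in at most {max_sentences} sentences.\n\n"
    "Question: {question}"
)

prompt = TEMPLATE.format(
    product="ExampleApp",  # hypothetical product name
    max_sentences=3,
    question="How do I reset my password?",
)
```

Keeping the template in one place also makes prompts reviewable and versionable, so a wording change can be tested once instead of hunting for ad-hoc strings across a codebase.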
Pricing Tier
A pricing tier is a plan category with predefined limits, unit prices, and features, used to choose the right plan for your needs. It is a core concept for cost planning.
R
RLAIF (Reinforcement Learning from AI Feedback)
RLAIF is a technique that uses AI-based judgments to generate preference data and then applies reinforcement learning to tune a model. It reduces reliance on human raters but requires careful oversight of evaluator bias.
RLHF (Reinforcement Learning from Human Feedback)
RLHF is a technique that builds a reward model from human preference judgments and then uses reinforcement learning to align a model's behavior. It is widely used to improve helpfulness and safety.
Reinforcement Learning (RL)
Reinforcement learning is a training method that learns a policy to maximize rewards obtained from the outcomes of actions. In LLMs, it can be used to steer behavior toward preferred responses.
Refusal
A refusal is a model behavior that declines requests it cannot comply with for safety or policy reasons, typically with an explanation. It is important for safe and predictable AI operation.
Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation is a technique that retrieves supporting passages via external search and uses them to generate answers for higher accuracy. It helps reduce hallucinations and incorporate up-to-date or private knowledge.
Reward Model
A reward model is a model that scores the quality of responses numerically and is used as an objective signal in methods like RLHF. It helps translate preferences into an optimization target.
Role Prompting
Role prompting is a technique that assigns a role or persona (e.g., "You are a reviewer") to encourage answers in a desired tone and perspective. It can improve consistency and relevance.
Retriever
A retriever is a component that searches for documents or chunks relevant to a question and passes them to an LLM. It is a core part of RAG pipelines.
Reflection
Reflection is a review step where the model or system checks generated results, identifies errors or gaps, and revises the output. It improves reliability by adding a quality-control loop.
Re-ranking
Re-ranking is a step that reorders retrieved candidate documents using another model or scoring function to keep the most relevant ones. It improves retrieval quality and controls context size.
Rubric Evaluation
Rubric evaluation is a method that defines evaluation dimensions and scoring criteria in a rubric table so outputs are judged consistently. It improves alignment between evaluators and supports repeatable measurement.
Rate Limit
A rate limit is a cap on how many requests or tokens you can send within a time window, set for stability and fairness. It affects throughput planning and practical cost estimation.
Red Teaming
Red teaming is adversarial testing that simulates misuse and failure modes to uncover weaknesses in a model or system before deployment. It is a key practice for improving AI safety.
S
Structured Output
Structured output is an output style that forces responses into a predefined structure so downstream systems can process them reliably. It is important for integrating LLMs into applications and workflows.
System Prompt
A system prompt is a high-priority instruction that sets overall policies and rules for a conversation and typically takes precedence over user messages. It helps maintain consistent behavior and safety constraints.
Source Attribution (Citations)
Source attribution is a mechanism that explicitly shows the documents or passages used as evidence so readers can verify the answer. It is important for trust, auditing, and safer RAG experiences.
Self-Consistency
Self-consistency is an inference method that runs multiple generations for the same question and aggregates them (e.g., by majority vote) to reduce random errors. It can improve reliability at the cost of extra compute.
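A sketch of the aggregation step, assuming the sampled answers have already been generated by independent runs of the model:

```python
from collections import Counter

def self_consistency(samples):
    """Aggregate several independent generations by majority vote,
    returning the winning answer and the fraction of runs that agreed."""
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)

# Five sampled answers to the same question; one run made an arithmetic slip.
samples = ["42", "42", "41", "42", "42"]
answer, agreement = self_consistency(samples)  # ("42", 0.8)
```

The agreement ratio is a useful side product: low agreement across samples often signals a question the model finds genuinely hard, which can trigger a fallback such as human review.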
Stop Sequence
A stop sequence is a termination condition that stops generation when a specified string appears in the output. It is useful for controlling boundaries in formatted responses.
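A client-side sketch of the behavior (providers typically apply this server-side when a stop parameter is set, cutting generation at that point and saving the extra tokens):

```python
def apply_stop(text, stop_sequences):
    """Truncate generated text at the first occurrence of any stop sequence."""
    cut = len(text)
    for s in stop_sequences:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)  # keep only text before the earliest stop match
    return text[:cut]

out = apply_stop("Answer: 42\nQuestion: next one...", ["\nQuestion:"])
# out == "Answer: 42" — generation is cut before the next section begins
```

This is how formatted outputs are kept to a single section: the stop sequence marks where the next field would begin, so the model never runs on past the part you asked for.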
Streaming Generation
Streaming generation is an output mode that returns tokens progressively instead of sending the full response at once. It improves perceived responsiveness and user experience for long outputs.
Safety Classifier
A safety classifier is a detection model that determines whether inputs or outputs violate safety criteria and serves as a core part of guardrails. It helps reduce incidents by filtering risky content.
T
Text-to-Speech (TTS)
Text-to-speech is a technology that generates natural-sounding speech from text, used for read-aloud features and conversational assistants. It enables voice output experiences beyond plain text.
Temperature
Temperature is a setting that adjusts the randomness of the next-token probability distribution, changing the balance between creativity and stability. Lower values are more deterministic, higher values more diverse.
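The effect can be sketched by applying temperature to a toy softmax over three candidate tokens:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature before softmax.
    Low temperature sharpens the distribution; high temperature flattens it."""
    z = np.asarray(logits) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.2)  # near-deterministic: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform: more diverse sampling
```

At temperature 0.2 the top token captures more than 99% of the probability mass, while at 2.0 the three tokens are much closer to equally likely, which is exactly the creativity-versus-stability trade-off described above.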
Tool Use
Tool use is a setup where a model calls external tools (e.g., search or calculators) and incorporates the results to improve answer accuracy. It is common in practical implementations to reduce hallucinations and mistakes.
Transfer Learning
Transfer learning is the idea of reusing knowledge learned from one dataset or task to perform well on another task with less data. It explains why pretrained models can adapt quickly.
Token
A token is a small unit of text used by models for processing, and it is also used to estimate limits and billing. Understanding tokens is essential for controlling cost and context length.
Token-Based Pricing
Token-based pricing is a pay-as-you-go billing model where fees are determined by the number of input and output tokens. Managing both input and output lengths helps reduce cost.
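A back-of-the-envelope cost estimate; the per-million-token prices below are hypothetical and only illustrate the arithmetic, so check your provider's actual rate card:

```python
# Hypothetical prices in USD per 1M tokens, for illustration only.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_cost(input_tokens, output_tokens):
    """Total cost = (input tokens * input rate) + (output tokens * output rate)."""
    return (input_tokens / 1_000_000 * PRICE_PER_MTOK["input"]
            + output_tokens / 1_000_000 * PRICE_PER_MTOK["output"])

# 2,000 input tokens and 500 output tokens per request, over 10,000 requests:
cost = estimate_cost(2_000 * 10_000, 500 * 10_000)  # 20M input, 5M output tokens
```

Note that output tokens often cost several times more than input tokens, which is why capping response length (e.g., "answer in 3 sentences") is one of the cheapest cost optimizations available.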
Top-p Sampling (Nucleus Sampling)
Top-p sampling is a decoding method that keeps the smallest set of high-probability tokens whose cumulative probability reaches p, then samples the next token from that set. It helps control diversity without relying only on temperature.
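A sketch of the filtering step on a toy four-token distribution:

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches p, zero out the rest, and renormalize."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]          # token indices, most likely first
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # how many tokens reach mass p
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = [0.5, 0.3, 0.15, 0.05]
filtered = top_p_filter(probs, p=0.9)  # drops the 0.05 tail token
```

Because the cutoff adapts to the shape of the distribution, a confident prediction keeps very few tokens while an uncertain one keeps many, which is the practical difference from a fixed top-k cutoff.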
Transformer
A Transformer is a neural network architecture that captures important relationships in text using attention and can be trained efficiently with parallel computation. Knowing its mechanism helps you understand modern LLMs.
Toxicity
Toxicity is the degree to which content is harmful—such as discrimination, violence, or harassment—and is targeted by safety evaluation and filtering. It is a key metric for content moderation and guardrails.
V
Vision-Language Model
A vision-language model is a model that jointly understands images and text to describe image content and answer questions about it. Knowing how it works helps you choose the right model for visual tasks.
Vector Search
Vector search is a retrieval method that uses distances between embeddings to find semantically similar documents or data. It enables meaning-based search beyond keyword matching.
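A minimal sketch using cosine similarity over toy 3-dimensional vectors; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_top_k(query, doc_vectors, k=2):
    """Rank documents by cosine similarity between embedding vectors."""
    q = query / np.linalg.norm(query)
    D = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    scores = D @ q                      # cosine similarity of each doc to the query
    return np.argsort(scores)[::-1][:k]  # indices of the best matches

# Toy "embeddings": direction encodes topic, so nearby vectors mean similar content.
docs = np.array([
    [1.0, 0.0, 0.0],   # about topic A
    [0.9, 0.1, 0.0],   # also about topic A, slightly different
    [0.0, 0.0, 1.0],   # unrelated topic
])
query = np.array([1.0, 0.05, 0.0])
top = cosine_top_k(query, docs, k=2)  # both topic-A documents rank ahead of the unrelated one
```

Because similarity is measured in embedding space rather than by shared words, a query and a document can match even when they use entirely different vocabulary for the same idea.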
Vector Database
A vector database is a data store optimized to save embeddings and quickly retrieve nearby vectors for similarity search. It is a common building block for RAG and semantic search systems.