AI Glossary
Master the language of AI with our comprehensive glossary. Each term includes practical examples to accelerate your understanding.
Last Updated: June 2, 2025
Temperature
Controls the randomness of model outputs. Lower values make responses more focused and predictable, while higher values encourage diversity and creativity.
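For instance, here is a minimal sketch of temperature-scaled sampling over raw logits (the logit values are invented for illustration):
```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=np.random.default_rng()):
    # Divide logits by temperature before softmax: T < 1 sharpens the
    # distribution, T > 1 flattens it toward uniform randomness.
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.1]                                 # hypothetical scores for three tokens
print(sample_with_temperature(logits, temperature=0.2))  # almost always token 0
print(sample_with_temperature(logits, temperature=2.0))  # much more varied
```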
Top-k sampling
When generating text, limits next-token choices to the k most likely options, cutting off the long tail of improbable tokens.
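A minimal sketch of the filtering step, again with invented logits:
```python
import numpy as np

def top_k_filter(logits, k):
    # Keep only the k highest-scoring tokens; mask the rest with -inf
    # so they receive zero probability after softmax.
    logits = np.asarray(logits, dtype=float)
    cutoff = np.sort(logits)[-k]
    return np.where(logits < cutoff, -np.inf, logits)

# With k=2, every token scoring below the second-highest logit is masked out.
print(top_k_filter([2.0, 1.0, 0.5, -1.0, -3.0], k=2))
```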
Top-p (nucleus) sampling
Chooses from the smallest set of tokens whose cumulative probability exceeds p, balancing coherence with diversity.
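A rough sketch of nucleus filtering over an already-computed probability distribution (the probabilities are made up):
```python
import numpy as np

def top_p_filter(probs, p=0.9):
    # Sort probabilities descending, keep the smallest prefix whose
    # cumulative mass reaches p, and zero out everything else.
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]                # token indices, most likely first
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.argmax(cumulative >= p)) + 1   # tokens needed to reach mass p
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()               # renormalise to sum to 1

# Keeps the two most likely tokens and renormalises them (roughly [0.625, 0.375, 0, 0, 0]).
print(top_p_filter([0.5, 0.3, 0.1, 0.05, 0.05], p=0.7))
```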
Beam search
A decoding strategy that keeps multiple hypotheses at each step and selects the sequence with the highest overall probability.
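A toy sketch using a stand-in next-token model whose probabilities are invented:
```python
import numpy as np

def beam_search(step_probs, beam_width=2, length=3):
    # step_probs(sequence) returns a probability distribution over the next token.
    # Keep the beam_width highest-scoring partial sequences at every step.
    beams = [((), 0.0)]                      # (token sequence, log probability)
    for _ in range(length):
        candidates = []
        for seq, score in beams:
            for token, p in enumerate(step_probs(seq)):
                candidates.append((seq + (token,), score + np.log(p)))
        # prune back down to the best beam_width hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]                          # highest-probability complete sequence

# Toy model: token 0 is slightly preferred, independent of context.
toy_model = lambda seq: np.array([0.6, 0.3, 0.1])
print(beam_search(toy_model))                # ((0, 0, 0), log probability ~ -1.53)
```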
Chain-of-thought
A prompting technique that encourages a model to reason step by step before producing a final answer.
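For example, a prompt like the following (the word problem is invented) appends a trigger phrase that invites step-by-step reasoning:
```python
prompt = (
    "Q: A shop sells pens in packs of 12. If Maria buys 4 packs and gives away "
    "7 pens, how many pens does she have left?\n"
    "Let's think step by step."    # the chain-of-thought trigger phrase
)
```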
Self-consistency
A decoding method that samples multiple reasoning paths and chooses the answer that appears most frequently among them.
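A minimal sketch, assuming generate_answer is a stand-in for a sampled call to your model:
```python
import random
from collections import Counter

def self_consistency(generate_answer, prompt, samples=5):
    # generate_answer is assumed to sample one reasoning path at a non-zero
    # temperature and return only the final answer it arrives at.
    answers = [generate_answer(prompt) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]   # majority-vote answer

# Stand-in model whose reasoning paths occasionally disagree.
fake_model = lambda _: random.choice(["41", "41", "41", "39"])
print(self_consistency(fake_model, "How many pens are left?"))   # usually "41"
```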
Logit bias
Adding offsets to specific tokens' logits (their pre-softmax scores) to make those tokens more or less likely to appear in the output.
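A rough illustration of applying a bias before sampling (the token ids and bias values are invented):
```python
import numpy as np

def apply_logit_bias(logits, bias):
    # bias maps token ids to additive adjustments: large negative values
    # effectively ban a token, large positive values strongly favour it.
    logits = np.asarray(logits, dtype=float).copy()
    for token_id, value in bias.items():
        logits[token_id] += value
    return logits

logits = [1.2, 0.8, 0.3]                                  # hypothetical scores for three tokens
print(apply_logit_bias(logits, {2: 5.0, 0: -100.0}))      # favour token 2, ban token 0
```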
Token
A basic unit of text processed by language models, which can represent characters, words, or sub‑words.
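As a concrete illustration, assuming the tiktoken library is installed, you can inspect how a sentence is split into token ids:
```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")     # tokenizer used by several OpenAI models
ids = enc.encode("Tokenization splits text into sub-word pieces.")
print(ids)                                     # a list of integer token ids
print([enc.decode([i]) for i in ids])          # the text fragment behind each id
```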
Prompt engineering
Crafting input prompts in a way that elicits the desired behaviour from a model.
RLHF
Reinforcement Learning from Human Feedback; fine-tuning a model with reinforcement learning against a reward model trained on human preference data.
Fine-tuning
Adapting a pretrained model on a smaller dataset so it excels at a specific task.
Context window
The maximum number of tokens a model can attend to at once, covering both the prompt and the output it generates.
Zero-shot
Providing a task with no examples and expecting the model to perform based solely on instructions.
Few-shot
Including a handful of examples in the prompt to show the desired style or format.
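The invented review snippets below contrast a zero-shot prompt with a few-shot one:
```python
zero_shot = (
    "Classify the sentiment of this review as positive or negative:\n"
    '"The battery died after two days."'
)

few_shot = (
    'Review: "Absolutely loved it, works perfectly." -> positive\n'
    'Review: "Broke within a week, very disappointed." -> negative\n'
    'Review: "The battery died after two days." ->'
)
```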
Embeddings
Numeric vector representations of text that capture semantic meaning for tasks like search or clustering.
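For example, cosine similarity between embedding vectors is a common relatedness measure; the tiny 3-dimensional vectors here are made up (real embeddings have hundreds or thousands of dimensions):
```python
import numpy as np

def cosine_similarity(a, b):
    # Embeddings that point in similar directions describe similar meanings.
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cat, kitten, car = [0.9, 0.1, 0.0], [0.85, 0.2, 0.05], [0.0, 0.2, 0.95]
print(cosine_similarity(cat, kitten))   # close to 1.0
print(cosine_similarity(cat, car))      # much lower
```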
Gradient descent
Optimization algorithm that adjusts model weights to minimize training loss.
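A minimal worked example, minimising the toy loss L(w) = (w - 3)^2:
```python
# Gradient descent on a single weight: repeatedly step against the gradient.
w, learning_rate = 0.0, 0.1
for step in range(50):
    gradient = 2 * (w - 3)           # dL/dw
    w -= learning_rate * gradient    # move downhill on the loss surface
print(round(w, 4))                   # approaches 3, the minimum of the loss
```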
LoRA
Low‑Rank Adaptation; a parameter‑efficient fine‑tuning technique that inserts small trainable matrices into a frozen model.
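A minimal NumPy sketch of the idea; the dimensions, rank, and initialisation are chosen for illustration only:
```python
import numpy as np

d, r = 512, 8                          # hidden size and a small LoRA rank
W = np.random.randn(d, d)              # frozen pretrained weight matrix
A = np.random.randn(d, r) * 0.01       # trainable down-projection
B = np.zeros((r, d))                   # trainable up-projection, starts at zero

def lora_forward(x):
    # The frozen weight is untouched; only the low-rank update A @ B is trained,
    # which adds 2*d*r parameters instead of d*d.
    return x @ W + x @ A @ B

x = np.random.randn(1, d)
print(lora_forward(x).shape)           # (1, 512)
```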
Knowledge cutoff
The most recent date of information included in a model's training data.
Parameter
An individual weight value within a neural network; large language models contain billions of them.
Retrieval-Augmented Generation (RAG)
A technique that combines external information retrieval with generation, giving models fresh, task-specific context.
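A rough sketch of the retrieve-then-prompt flow, with a toy bag-of-words embedding standing in for a real embedding model:
```python
import numpy as np

def retrieve(query_embedding, documents, embed, top_n=1):
    # Rank documents by embedding similarity to the query and keep the best.
    scores = [float(np.dot(query_embedding, embed(doc))) for doc in documents]
    ranked = sorted(zip(scores, documents), reverse=True)
    return [doc for _, doc in ranked[:top_n]]

def build_rag_prompt(question, documents, embed):
    # Retrieved text is pasted into the prompt as grounding context.
    context = "\n".join(retrieve(embed(question), documents, embed))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

# Toy embedding over a tiny vocabulary; a real system would call an embedding model.
vocab = ["paris", "france", "capital", "flour", "bread"]
toy_embed = lambda text: np.array([text.lower().count(w) for w in vocab], dtype=float)

docs = ["Paris is the capital of France.", "Bread is made from flour."]
print(build_rag_prompt("What is the capital of France?", docs, toy_embed))
```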
Hallucination
When a model confidently produces information that is plausible-sounding but factually incorrect or nonexistent.
System prompt
A hidden instruction that sets the overarching behaviour, tone, or constraints for a chat model.
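Most chat APIs express this as a leading message with a system role; a generic sketch using the common role/content message shape (not any one vendor's exact API):
```python
# The system message is not shown to end users but shapes every reply.
messages = [
    {"role": "system", "content": "You are a concise assistant. Answer in at most two sentences."},
    {"role": "user", "content": "Explain what a context window is."},
]
```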
Prompt injection
An attack in which crafted user input overrides a model's hidden instructions or tricks the model into revealing them.
Alignment
The extent to which an AI system's behaviour matches human intentions, ethical standards, and societal values.
Inference
The phase where a trained model generates outputs from inputs, as opposed to being trained or fine-tuned.
Attention
A mechanism in transformer models that lets them weigh the relevance of different tokens when generating each output.
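The core computation is scaled dot-product attention; a minimal NumPy version:
```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Each query scores every key; softmax turns the scores into weights
    # that decide how much of each value contributes to the output.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((4, 8))               # 4 tokens, 8-dimensional self-attention
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```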
Transformer
A neural network architecture based on self-attention that enables parallel processing, powering most modern language models.
Reranking
Reordering a list of candidate outputs (e.g., retrieved passages) using a specialised scoring model to surface the best ones.
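A minimal sketch, with a toy word-overlap scorer standing in for a trained reranking model:
```python
def rerank(query, passages, score):
    # score is a stand-in for a cross-encoder or other relevance model that
    # reads the query and a passage together and returns a relevance score.
    return sorted(passages, key=lambda passage: score(query, passage), reverse=True)

# Toy scorer: count shared words (a real reranker would use a trained model).
toy_score = lambda q, p: len(set(q.lower().split()) & set(p.lower().split()))
passages = ["The capital of France is Paris.", "Bread is made from flour."]
print(rerank("What is the capital of France?", passages, toy_score)[0])
```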
Model Context Protocol (MCP)
An open standard that lets AI applications connect to external data sources and tools through a common client-server interface, so models can be supplied with context and invoke capabilities in a consistent, reusable way.
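As a rough illustration, one kind of message a client might send under the protocol's JSON-RPC framing; the tool name and arguments here are invented:
```python
# A client asking a connected MCP server to run one of the tools it exposes.
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",
        "arguments": {"query": "quarterly revenue"},
    },
}
```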