Glossary
A helpful glossary of how we define and refer to these complex ideas in our work
Computer systems that perform tasks that historically required human judgment, such as recognizing patterns, making predictions, or generating content.
A mathematical system trained on data that can produce outputs (predictions or generated content) from new inputs.
A step-by-step computational method. In AI, algorithms are used to train models and to produce outputs from inputs.
Methods that flag unusual patterns in data (e.g., unusual network traffic or shipping patterns) that may merit investigation.
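As a minimal sketch of this idea, the snippet below flags values that fall far from the average using a z-score. The shipment numbers and threshold are hypothetical, chosen only for illustration; real systems use more robust statistical or learned methods.

```python
# Minimal anomaly-detection sketch: flag values far from the mean (z-score).
# Data and threshold are hypothetical.
import statistics

def flag_anomalies(values, threshold=3.0):
    """Return values whose distance from the mean exceeds `threshold` stdevs."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

daily_shipments = [102, 98, 101, 99, 100, 97, 103, 250]  # one unusual day
print(flag_anomalies(daily_shipments, threshold=2.0))  # [250]
```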
Systematic errors or differences in performance across groups or contexts, often reflecting patterns and/or gaps in training data or design choices.
A system whose internal reasoning is difficult to inspect or explain, even if its inputs and outputs are visible.
A prediction task where the model assigns an input to a category (e.g., spam/not spam; high risk/low risk; or dog/cat/sheep/snake).
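To make the spam/not-spam example concrete, here is a toy classifier built from a handcrafted keyword rule. The keyword list and the two-keyword threshold are hypothetical; real classifiers learn their decision rules from labeled data rather than using fixed lists.

```python
# Toy classification sketch: assign an input message to one of two categories.
# Keywords and threshold are hypothetical, for illustration only.

SPAM_KEYWORDS = {"free", "winner", "prize", "urgent"}

def classify(message: str) -> str:
    """Label a message 'spam' if it contains 2+ spam keywords, else 'not spam'."""
    words = set(message.lower().split())
    hits = len(words & SPAM_KEYWORDS)
    return "spam" if hits >= 2 else "not spam"

print(classify("You are a winner claim your free prize"))  # spam
print(classify("Meeting moved to 3pm"))                    # not spam
```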
All information provided to a language model for a specific response. This usually includes the user message plus additional instructions, conversation history, or retrieved documents.
A machine learning approach that learns complex patterns directly from large amounts of raw data by adjusting many internal parameters (“connections”).
Synthetic or manipulated media (image, audio, or video) designed to look authentic, often used for deception or influence.
Rules provided by the organization building the system that shape tone, formatting, and policy constraints (e.g., how to handle sensitive content).
Additional training on labeled examples to make a general model more reliable for specific tasks.
Controls applied before or after model generation to reduce harmful outputs or misuse (e.g., blocking certain inputs, redacting sensitive strings, enforcing logging).
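As one concrete sketch of a post-generation guardrail, the snippet below redacts strings shaped like card numbers before text is logged or displayed. The pattern is hypothetical and intentionally simple; production guardrails combine many such controls.

```python
# Guardrail sketch: redact sensitive strings (here, card-number-shaped digits)
# from text before it is logged or shown. The pattern is a simplified example.
import re

def redact(text: str) -> str:
    """Replace 16-digit card-number-like sequences with a placeholder."""
    return re.sub(r"\b(?:\d{4}[ -]?){3}\d{4}\b", "[REDACTED]", text)

print(redact("card 4111 1111 1111 1111 ok"))  # card [REDACTED] ok
```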
When a language model produces plausible-sounding but factually incorrect or fabricated content.
In this primer’s framing, systems (typically deep learning) that learn patterns directly from data rather than relying on explicit rules or handpicked variables.
A training step where a language model learns to follow instructions by training on many examples of tasks and correct responses.
A deep learning model trained on large text corpora to generate and interpret text, typically by predicting the next token.
A subset of AI where models learn patterns from data rather than being explicitly programmed with fixed rules.
Performance changes over time because real-world data, behavior, or conditions shift away from what the model was trained on.
Methods that enable computers to work with human language (classification, extraction, summarization, translation, generation).
The core training objective for many language models: predict the most likely next token given preceding text.
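The objective can be illustrated with a toy bigram model: count which token follows which in a small corpus, then predict the most frequent successor. The corpus is hypothetical, and real language models use neural networks over far larger contexts, but the training target is the same idea.

```python
# Toy next-token prediction: count token successors in a tiny corpus,
# then predict the most likely next token. Corpus is hypothetical.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the most frequent token seen after `token` in the corpus."""
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # cat
```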
Formatting or policy requirements placed on model outputs (e.g., “respond in bullet points,” “include citations,” “use this template”).
In this primer’s framing, statistical models that learn relationships between selected input variables and outcomes to make predictions.
An estimate produced by a model about an unknown label or future outcome (e.g., risk score, likely delay, likely category).
A training approach where the model learns to produce answers preferred by human evaluators, often using reinforcement learning methods.
The user’s input text to the system (often combined with additional context and instructions before reaching the model).
A learning approach where a system learns strategies by trial and error using feedback (“rewards”) from an environment or evaluator.
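A stripped-down sketch of the trial-and-error loop: an agent tries actions, receives rewards, and nudges its value estimates toward what it observes. The two-action environment and its rewards are hypothetical and deterministic here for clarity; real settings involve exploration and noisy feedback.

```python
# Toy reinforcement-learning sketch: update value estimates from rewards
# and pick the action with the highest learned value. Environment is hypothetical.

rewards = {"left": 0.0, "right": 1.0}   # hidden environment feedback
value = {"left": 0.0, "right": 0.0}     # agent's learned estimates
alpha = 0.5                             # learning rate

for _ in range(5):
    for action in ("left", "right"):    # try both actions each round
        value[action] += alpha * (rewards[action] - value[action])

best = max(value, key=value.get)
print(best)  # right
```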
A system design where the model is provided with retrieved reference material (documents, databases, web sources) to ground its answers.
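The retrieval step of such a design can be sketched as below: pick the reference document that best matches the question (here by crude word overlap) and place it in the prompt. The documents and questions are hypothetical; real systems use vector search over large collections and pass the result to an actual language model.

```python
# Sketch of the retrieval step: choose the document with the most word
# overlap with the question, then build a grounded prompt from it.
# Documents are hypothetical; real systems use vector (semantic) search.

documents = {
    "shipping": "Orders ship within 2 business days from our warehouse.",
    "returns": "Items may be returned within 30 days with a receipt.",
}

def words(text: str) -> set:
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def retrieve(question: str) -> str:
    """Return the document sharing the most words with the question."""
    q = words(question)
    return max(documents.values(), key=lambda doc: len(q & words(doc)))

question = "Can I return an item within 30 days?"
context = retrieve(question)
prompt = f"Answer using this reference:\n{context}\n\nQuestion: {question}"
```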
Software that follows explicit human-written rules (e.g., “if X, then do Y”), often easier to audit but less flexible with ambiguity.
Training a model on labeled examples where the correct answer is known (e.g., approved/rejected; high risk/low risk).
A unit of text used by language models (may be a whole word, part of a word, punctuation, or whitespace).
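To show how a single word can become multiple tokens, here is a greedy longest-match split over a tiny vocabulary. The vocabulary is hypothetical; real tokenizers learn theirs from large corpora and use more sophisticated algorithms.

```python
# Sketch of subword tokenization: greedily match the longest vocabulary
# entry at each position. The vocabulary is hypothetical.

VOCAB = {"un", "break", "able", "the", " ", "."}

def tokenize(text: str) -> list:
    """Split text into vocabulary pieces, longest match first."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i : i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

print(tokenize("unbreakable"))  # ['un', 'break', 'able']
```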
A system setup where a language model can call external tools (search, databases, calculators) and incorporate the results into its response.
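A bare-bones sketch of the dispatch side of tool use: the surrounding system detects a tool request in the model's output, runs the tool, and feeds the result back. The `TOOL:name:argument` format is invented for this example; real systems use structured function-calling interfaces, and `eval` here is for illustration only, never for untrusted input.

```python
# Tool-use sketch: detect a tool request in model output, run the tool,
# and capture the result. The request format is hypothetical.
# NOTE: eval is used only on this fixed example string; never eval untrusted input.

TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def run_tool(name: str, argument: str) -> str:
    return TOOLS[name](argument)

model_output = "TOOL:calculator:12*7"
if model_output.startswith("TOOL:"):
    _, name, arg = model_output.split(":", 2)
    result = run_tool(name, arg)

print(result)  # 84
```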
Operational clarity about what an AI system does, what data it uses, its limitations, and who is accountable for decisions influenced by it.
Training or analysis on unlabeled data to find structure (e.g., clustering themes in documents, detecting unusual patterns).
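As a small illustration of finding structure without labels, the snippet below groups sorted values into clusters wherever a large gap separates neighbors. The data and gap threshold are hypothetical; real clustering methods (e.g., k-means) handle higher-dimensional data.

```python
# Unsupervised sketch: cluster unlabeled values by splitting at large gaps
# between sorted neighbors. Data and threshold are hypothetical.

def cluster(values, max_gap=10):
    """Group sorted values into runs separated by gaps larger than max_gap."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters

print(cluster([3, 95, 7, 99, 5, 101]))  # [[3, 5, 7], [95, 99, 101]]
```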
A workflow where a person checks AI outputs before they are relied upon, especially in high-stakes contexts.