Explainer

Glossary

A glossary of how we define and refer to these complex ideas in our work


Artificial intelligence (AI) 

Computer systems that perform tasks that historically required human judgment, such as recognizing patterns, making predictions, or generating content. 

AI model 

A mathematical system trained on data that can produce outputs (predictions or generated content) from new inputs. 

Algorithm 

A step-by-step computational method. In AI, algorithms are used to train models and to produce outputs from inputs. 

Anomaly detection 

Methods that flag unusual patterns in data (e.g., unusual network traffic or shipping patterns) that may merit investigation. 
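
One simple way to flag unusual values, sketched here as an illustration, is a z-score rule: values far from the mean (measured in standard deviations) are treated as anomalies. The traffic numbers and the threshold below are invented for the example; real systems use far more sophisticated methods.

```python
# Minimal anomaly-detection sketch: flag values whose z-score
# (distance from the mean, in standard deviations) exceeds a threshold.
from statistics import mean, stdev

def flag_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

daily_requests = [120, 115, 130, 125, 118, 122, 9000]  # one obvious spike
print(flag_anomalies(daily_requests))  # → [9000]
```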

Bias 

Systematic errors or differences in performance across groups or contexts, often reflecting patterns and/or gaps in training data or design choices. 

Black box 

A system whose internal reasoning is difficult to inspect or explain, even if its inputs and outputs are visible. 

Classification 

A prediction task where the model assigns an input to a category (e.g., spam/not spam; high risk/low risk; or dog/cat/sheep/snake).   
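
A toy sketch of the idea: a 1-nearest-neighbor classifier that assigns a new input to the category of its closest labeled example. The feature values and labels below are invented for illustration; production classifiers are trained on far larger datasets.

```python
# Toy classifier: copy the label of the nearest labeled example.
import math

# (word_count, link_count) -> "spam" / "not spam"
examples = [
    ((120, 0), "not spam"),
    ((300, 1), "not spam"),
    ((15, 9), "spam"),
    ((40, 12), "spam"),
]

def classify(point):
    """Predict a category by taking the label of the nearest example."""
    _, label = min(examples, key=lambda ex: math.dist(ex[0], point))
    return label

print(classify((20, 10)))   # near the spam examples → "spam"
print(classify((250, 0)))   # near the not-spam examples → "not spam"
```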

Context 

All information provided to a language model for a specific response. This usually includes the user message plus additional instructions, conversation history, or retrieved documents. 

Deep learning 

A machine learning approach that learns complex patterns directly from large amounts of raw data by adjusting many internal parameters (“connections”). 

Deepfake 

Synthetic or manipulated media (image, audio, or video) designed to look authentic, often used for deception or influence. 

Developer instructions 

Rules provided by the organization building the system that shape tone, formatting, and policy constraints (e.g., how to handle sensitive content). 

Fine-tuning (supervised) 

Additional training on labeled examples to make a general model more reliable for specific tasks. 

Guardrails 

Controls applied before or after model generation to reduce harmful outputs or misuse (e.g., blocking certain inputs, redacting sensitive strings, enforcing logging). 

Hallucination 

When a language model produces plausible-sounding but factually incorrect or fabricated content. 

Independent learning 

In this primer’s framing, systems (typically deep learning) that learn patterns directly from data rather than relying on explicit rules or handpicked variables. 

Instruction tuning 

A training step where a language model learns to follow instructions by training on many examples of tasks and correct responses. 

Large language model (LLM) 

A deep learning model trained on large text corpora to generate and interpret text, typically by predicting the next token. 

Machine learning (ML) 

A subset of AI where models learn patterns from data rather than being explicitly programmed with fixed rules. 

Model drift 

Performance changes over time because real-world data, behavior, or conditions shift away from what the model was trained on. 

Natural language processing (NLP) 

Methods that enable computers to work with human language (classification, extraction, summarization, translation, generation). 

Next-token prediction 

The core training objective for many language models: predict the most likely next token given preceding text. 
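
The objective can be illustrated with a bigram model: given the current token, predict the token that most often followed it in the training text. Real LLMs condition on long contexts using neural networks; this sketch only counts adjacent word pairs, and the sample sentence is invented.

```python
# Bigram sketch of next-token prediction: count which token follows which,
# then predict the most frequent follower.
from collections import Counter, defaultdict

text = "the cat sat on the mat and the cat slept"
tokens = text.split()

follows = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def predict_next(token):
    """Return the token most often seen after `token` in the training text."""
    return follows[token].most_common(1)[0][0]

print(predict_next("the"))  # → "cat" ("cat" follows "the" twice, "mat" once)
```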

Output constraints 

Formatting or policy requirements placed on model outputs (e.g., “respond in bullet points,” “include citations,” “use this template”). 

Pattern matching (classical machine learning) 

In this primer’s framing, statistical models that learn relationships between selected input variables and outcomes to make predictions. 

Prediction 

An estimate produced by a model about an unknown label or future outcome (e.g., risk score, likely delay, likely category). 

Preference training/RLHF 
(reinforcement learning from human feedback)

A training approach where the model learns to produce answers preferred by human evaluators, often using reinforcement learning methods. 

Prompt 

The user’s input text to the system (often combined with additional context and instructions before reaching the model). 

Reinforcement learning (RL) 

A learning approach where a system learns strategies by trial and error using feedback (“rewards”) from an environment or evaluator. 

Retrieval-Augmented Generation (RAG) 

A system design where the model is provided with retrieved reference material (documents, databases, web sources) to ground its answers. 
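
The pattern can be sketched as retrieve-then-generate: find the most relevant reference text for a question, then place it in the model's context. The two-document store and the keyword-overlap scoring below are illustrative stand-ins for a real search index, and `call_model` is a hypothetical placeholder for an actual LLM call.

```python
# RAG pattern sketch: retrieve a relevant document, then build a grounded prompt.
documents = [
    "Visa applications must be filed 90 days before travel.",
    "Shipping manifests list the cargo carried by a vessel.",
]

def retrieve(question, docs):
    """Pick the document sharing the most words with the question (toy scoring)."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def answer(question):
    context = retrieve(question, documents)
    prompt = f"Use this reference:\n{context}\n\nQuestion: {question}"
    return prompt  # in a real system: return call_model(prompt)

print(answer("When must visa applications be filed?"))
```

Grounding the model in retrieved text is what distinguishes RAG from asking the model to answer from its training data alone.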

Rule-based / symbolic system 

Software that follows explicit human-written rules (e.g., “if X, then do Y”), often easier to audit but less flexible with ambiguity. 

Supervised learning 

Training a model on labeled examples where the correct answer is known (e.g., approved/rejected; high risk/low risk). 

Token 

A unit of text used by language models (may be a whole word, part of a word, punctuation, or whitespace). 
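
A simplified tokenization sketch, to show that tokens need not be whole words. Real LLM tokenizers use learned subword vocabularies (e.g., byte-pair encoding); this regex merely separates words, punctuation, and whitespace.

```python
# Toy tokenizer: split text into word, punctuation, and whitespace tokens.
import re

def tokenize(text):
    """Split text into word runs, single punctuation marks, and whitespace."""
    return re.findall(r"\w+|[^\w\s]|\s+", text)

print(tokenize("Don't panic!"))  # → ['Don', "'", 't', ' ', 'panic', '!']
```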

Tool use 

A system setup where a language model can call external tools (search, databases, calculators) and incorporate the results into its response. 

Transparency 

Operational clarity about what an AI system does, what data it uses, its limitations, and who is accountable for decisions influenced by it. 

Unsupervised learning 

Training or analysis on unlabeled data to find structure (e.g., clustering themes in documents, detecting unusual patterns). 
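
A clustering sketch in this spirit: one-dimensional k-means with two clusters, which groups unlabeled numbers around moving centers without ever being told a "correct" answer. The data points are invented for illustration.

```python
# 1-D k-means sketch (k=2): repeatedly assign each value to its nearest
# center, then move each center to the mean of its assigned values.
def kmeans_1d(values, iterations=10):
    centers = [min(values), max(values)]  # simple initialization for two clusters
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        centers = [sum(c) / len(c) for c in clusters]
    return clusters

data = [1.0, 1.2, 0.9, 8.8, 9.1, 9.4]
print(kmeans_1d(data))  # → [[1.0, 1.2, 0.9], [8.8, 9.1, 9.4]]
```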

Validation (human review) 

A workflow where a person checks AI outputs before they are relied upon, especially in high-stakes contexts.