Glossary

Context Window

A context window is the amount of text, measured in tokens, that an AI language model can consider at once. It is the working memory the model uses to read input and generate a response.

Reviewed by Daniel Hayes, Revenue Operations

Key takeaways

  • A context window is the maximum span of text (input plus output) a model can consider at once, in tokens.
  • Everything the model reasons over (prompt, history, documents) must fit inside the window.
  • When a conversation exceeds the window, the oldest content is typically dropped, so the model 'forgets' it.
  • It is the mechanism behind a model's context awareness, and it also bounds that awareness.
  • Bigger is not always better: irrelevant content in the window can dilute focus and degrade responses.

A context window is the amount of text an AI language model can consider at once: the working memory it uses to read your input and generate a response. Measured in tokens (roughly, pieces of words), the context window sets a hard limit on how much information (conversation history, documents, instructions) the model can take into account in a single interaction.

The context window is one of the most consequential properties of a language model, because it bounds what the model can "see." Everything the model reasons over (your prompt, the prior conversation, any documents you provide) must fit inside it. When it does not, something gets left out, and the model effectively forgets.

What a context window is

The context window is the maximum span of text (input plus output) a model can process in one go, expressed as a token count. A token is a chunk of text, often a word or part of a word; a few thousand tokens is several pages, and modern models range from thousands to hundreds of thousands (or more) of tokens. Whatever the model is asked to work with must fit within this window; anything beyond it is invisible to the model for that response.

Input must fit the context window for the model to consider it.

Tokens and window size

Concept         Meaning
Token           A chunk of text (≈¾ of a word on average)
Context window  Max tokens the model can consider at once
Input           Prompt, history, documents, instructions
Output          The model's response (also counts toward the limit)
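
As a rough illustration of the ¾-of-a-word average in the table, the sketch below estimates a token count from a word count. It is only a heuristic: the function name estimate_tokens and the constant WORDS_PER_TOKEN are assumptions made for this example, and a real model's tokenizer will give different numbers.

```python
import math

# Assumption: about 0.75 words per token on average, as in the table.
# Real tokenizers vary by model and by language.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text from its whitespace-separated words."""
    word_count = len(text.split())
    # Round up so the estimate errs on the side of using more of the window.
    return math.ceil(word_count / WORDS_PER_TOKEN)


if __name__ == "__main__":
    sample = "The context window is the working memory of a model"
    print(estimate_tokens(sample))
```

For anything where the limit really matters, count tokens with the tokenizer that belongs to the model being used rather than with an estimate like this one.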

Why the context window matters

  • It bounds memory. A model can only consider what fits in the window; beyond it, information is lost.
  • It shapes conversations. In a long chat, early messages can fall out of the window and be "forgotten."
  • It limits documents. How much reference material you can feed the model at once is capped by the window.
  • It underpins context awareness. A model can only be aware of context that sits inside the window.

The context window and "forgetting"

A common experience with AI is a long conversation where the model seems to forget something said earlier. The context window is usually why: once the conversation exceeds the window, the oldest content is typically dropped to make room, and the model genuinely cannot see it anymore. This is not a flaw in reasoning but a structural limit. It is also why techniques such as summarizing earlier context, or retrieving only the relevant parts of a long document, exist: they fit what matters into the available window.
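
The dropping of old content can be sketched as a simple trimming step. This is a minimal illustration, not how any particular product implements it: the function trim_to_window, its parameters, and the word-count stand-in for a tokenizer are all assumptions made for the example.

```python
from typing import Callable, List


def trim_to_window(
    messages: List[str],
    budget_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Keep the most recent messages that fit in budget_tokens.

    Walks from the newest message backwards and stops at the first
    message that would not fit, so everything older is dropped.
    """
    kept: List[str] = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget_tokens:
            break  # this message and all older ones fall out of the window
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    # Assumption: one word counts as one token, purely for illustration.
    def count_words(text: str) -> int:
        return len(text.split())

    history = ["my name is Ana", "I need a quote", "what was my name"]
    print(trim_to_window(history, budget_tokens=8, count_tokens=count_words))
```

In the usage example the first message does not fit in the budget, so the model would no longer see the name it is later asked about. Summarizing the dropped messages into a shorter note is one way to keep that information inside the window.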

Context window in sales AI

For sales applications, the context window determines how much an AI can take into account about a prospect or deal in one interaction: the conversation so far, the account history, the relevant documents. A larger window lets an AI sales assistant reason over more context (a full email thread, an account's history) at once, producing more relevant, context-aware responses. When context exceeds the window, systems use retrieval to pull in only the most relevant pieces, which is a practical workaround for the limit.
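
The retrieval idea can be sketched in a few lines: score each piece of account context against the question, then keep the best-scoring pieces that fit a token budget. Everything here is an assumption made for illustration, including the function name select_relevant, the word-overlap score, and the word count used in place of a tokenizer. Production systems usually rank by semantic similarity rather than shared words.

```python
from typing import List


def select_relevant(query: str, snippets: List[str], budget_tokens: int) -> List[str]:
    """Pick the snippets most relevant to query that fit in budget_tokens."""
    query_words = set(query.lower().split())

    def score(snippet: str) -> int:
        # Relevance stand-in: number of distinct words shared with the query.
        return len(query_words & set(snippet.lower().split()))

    # Highest score first; sorted() is stable, so ties keep their input order.
    ranked = sorted(snippets, key=score, reverse=True)

    chosen: List[str] = []
    used = 0
    for snippet in ranked:
        if score(snippet) == 0:
            continue  # nothing in common with the query, so leave it out
        cost = len(snippet.split())  # assumption: one word is one token
        if used + cost <= budget_tokens:
            chosen.append(snippet)
            used += cost
    return chosen


if __name__ == "__main__":
    notes = [
        "Renewal date is in March",
        "Lunch order for Friday",
        "Pricing and renewal terms were discussed",
    ]
    print(select_relevant("renewal pricing", notes, budget_tokens=8))
```

Leaving out zero-score snippets reflects the point that filling the window with irrelevant material does not help, even when there is room for it.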

Common misconceptions about the context window

  • "Bigger is always better." Larger windows help, but stuffing them with irrelevant content can dilute focus and degrade responses.
  • "The model remembers everything." It only remembers what is in the window; beyond it, it forgets.
  • "Output is free." The response also consumes the window, so long outputs leave less room for input.
  • "Window = knowledge." The window is working memory for one interaction, not the model's overall training knowledge.
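
The "output is free" point comes down to simple arithmetic: tokens reserved for the response are subtracted from what is available for input. A minimal sketch follows, with input_budget and its parameter names as assumptions made for this example.

```python
def input_budget(window_tokens: int, reserved_output_tokens: int) -> int:
    """Return how many tokens remain for input after reserving room for output."""
    if reserved_output_tokens < 0 or window_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if reserved_output_tokens > window_tokens:
        raise ValueError("reserved output cannot exceed the window")
    return window_tokens - reserved_output_tokens


if __name__ == "__main__":
    # Hypothetical numbers: an 8,000-token window with 2,000 tokens kept for the reply.
    print(input_budget(8000, 2000))
```

With those hypothetical numbers, only 6,000 tokens are left for the prompt, history, and documents combined.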

The context window is the working memory of an AI model: the span of text it can consider at once. It quietly governs what the model can take into account and what it forgets. Understanding it explains both the power and the limits of AI in conversation, and why managing what goes into the window is as important as the window's size.

Frequently asked questions

What is a context window?

A context window is the amount of text an AI language model can consider at once: the working memory it uses to read your input and generate a response, measured in tokens (roughly, pieces of words). It sets a hard limit on how much information (conversation history, documents, instructions) the model can take into account in a single interaction. Whatever the model works with must fit inside it; anything beyond is invisible for that response.

What is a token, and how does it relate to the window?

A token is a chunk of text, often a word or part of a word (about three-quarters of a word on average). The context window is the maximum number of tokens (input plus output) the model can process at once. A few thousand tokens is several pages; modern models range from thousands to hundreds of thousands of tokens. The model's response also counts toward the limit.

Why does an AI 'forget' things in a long conversation?

Usually because of the context window. Once a conversation exceeds the window, the oldest content is typically dropped to make room, and the model genuinely cannot see it anymore. This is a structural limit, not a reasoning flaw. It is why techniques such as summarizing earlier context or retrieving only the relevant parts of a long document exist: they fit what matters into the available window.

Why does the context window matter for sales AI?

It determines how much an AI can take into account about a prospect or deal in one interaction: the conversation so far, the account history, the relevant documents. A larger window lets an AI sales assistant reason over more context at once, producing more relevant, context-aware responses. When context exceeds the window, systems use retrieval to pull in only the most relevant pieces.

Is a bigger context window always better?

Not necessarily. Larger windows help, but stuffing them with irrelevant content can dilute focus and degrade responses. The window is working memory for one interaction, not the model's overall training knowledge, and the response consumes the window too, so long outputs leave less room for input. Managing what goes into the window is as important as its size.

AI Agent Handoff

An AI agent handoff is the moment an AI agent transfers a conversation or task to a human (or another agent), passing along full context so the next party can pick up seamlessly. It is the escape hatch that keeps automation helpful rather than a trap.

AI Agent SOP

An AI agent SOP (standard operating procedure) is the documented set of rules, steps, and boundaries that govern how an AI agent should handle a given situation. It is the playbook defining what the agent does, in what order, and when to escalate, translating human SOPs into instructions an agent executes consistently.

AI Chat Agent

An AI chat agent is an AI system that converses with people through text chat (on a website, in an app, or in messaging). It understands what they type, responds helpfully, and increasingly takes actions, rather than following a rigid scripted menu.

AI Concierge

An AI concierge is an AI assistant that provides personalized, white-glove help to customers or prospects, guiding them, answering questions, and handling requests in a high-touch, attentive way, available instantly and at scale.

AI Copilot

An AI copilot is an AI assistant that works alongside a human, suggesting, drafting, and surfacing information in real time while the person stays in control and makes the final call. The human is the pilot; the AI assists, never acting alone.

AI Gateway

An AI gateway is a management layer that sits between an application and the AI models it uses, routing requests, enforcing policy, controlling cost, and adding security and observability, much as an API gateway does for APIs.