Context Window
Definition
The context window is the maximum number of tokens a model can process in a single prompt or conversation, counting both input and output. If the prompt plus the model’s response exceeds this limit, the overflow is truncated or ignored.
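The input-plus-output arithmetic can be sketched in a few lines. This is a minimal, hypothetical illustration: the 4-characters-per-token ratio is only a rough rule of thumb for English text, not a real tokenizer, and the limit value is just an example.

```python
CONTEXT_LIMIT = 32_000  # example limit, e.g. a 32k-token model


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English."""
    return max(1, len(text) // 4)


def remaining_output_budget(prompt: str, limit: int = CONTEXT_LIMIT) -> int:
    """Tokens left for the model's response after the prompt is counted."""
    return max(0, limit - estimate_tokens(prompt))


prompt = "Summarize the following report: ..."
print(remaining_output_budget(prompt))
```

In practice you would swap the heuristic for the model’s actual tokenizer, but the budgeting logic stays the same: the response can only use whatever the prompt leaves behind.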
Example
“GPT-4 with a 32k context window can understand and respond to around 25,000 words of input and output combined.”
How It’s Used in AI
The context window shapes what the model can attend to in a single exchange. With a short window, the model may lose track of earlier parts of a conversation; a longer one enables analysis of full documents, semantic search over large inputs, and long-form content generation.
Brief History
Early GPT models had context windows of just 2,048 tokens. Some GPT-4 variants now support up to 128,000 tokens, dramatically expanding use cases like retrieval-augmented generation (RAG) and document QA.
Key Tools or Models
GPT-4-128k – One of the longest context windows available
Claude 2+ – Known for extended context capabilities
Token counters – Help developers plan prompts within limits
Pro Tip
Context isn’t memory. Once a conversation grows past the window, earlier messages are lost unless they are saved and re-injected into the prompt manually.
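That re-injection step is often implemented as a sliding window over the chat history. A minimal sketch, assuming a crude 4-characters-per-token estimate in place of a real tokenizer (the `trim_history` helper and the limit value are illustrative, not tied to any specific model):

```python
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # ~4 chars/token rule of thumb


def trim_history(messages: list[str], limit: int) -> list[str]:
    """Keep the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > limit:
            break  # older messages no longer fit; drop them
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order


history = ["old question " * 50, "older answer " * 50, "latest question"]
print(trim_history(history, limit=100))
```

Each turn, the trimmed history is sent as the new prompt, so the model only ever “remembers” what still fits in the window. Production systems often go further and summarize the dropped messages rather than discarding them outright.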