Large Language Models

System Prompt

The instructions sent before the user's message that define the model's role, tone and constraints.

In common use since 2022

A system prompt is the block of instructions sent to an LLM before any user input. It establishes the model's role, persona, tone, output format, constraints, available tools and any non-negotiable rules. Most chat APIs treat it as a separate role — "system" rather than "user" or "assistant" — and the model is trained to weight it more heavily than ordinary user messages.

A well-written system prompt is the single highest-leverage piece of an LLM application. Almost every "the model isn't doing what I want" complaint resolves to a vague or contradictory system prompt. A good one does several things:

  • Defines the role clearly — "You are a senior tax accountant for US small businesses" beats "You are a helpful assistant".
  • Sets the output format — JSON schema, headers, length limits, what to do when uncertain.
  • Lists hard constraints — what to refuse, what to escalate, what to never include.
  • Provides reference data — shared style guides, term definitions, anything the model needs every turn.
  • Includes examples — few-shot demonstrations of ideal behaviour.

For a US production app in 2026, the prompt engineering loop looks like this: write a draft, run it against a held-out evaluation set, collect failure cases, refine the prompt, repeat. Treat the system prompt as code: version it, test it on regressions when you change it, and monitor performance over time as the underlying model upgrades.

Prompt caching has reshaped the economics. Frontier APIs now reuse the encoded prefix of long system prompts at 10–25% of the original cost. Engineering your prompts so the static, cacheable content sits at the top — and the dynamic, per-request content sits at the bottom — can cut your bill by 90% on high-volume workloads.

Three failure modes to watch:

  • Instruction overload — fifty bullet points of rules and the model starts ignoring most of them. Prioritise.
  • Conflicting instructions — "Always be concise" and "Always show your reasoning step by step" pull in opposite directions; pick one.
  • Personality leakage — overly forced personas ("You are a snarky pirate") can degrade reasoning quality on serious tasks.

The mature pattern is a short, focused system prompt; few-shot examples for format; and a rigorous eval harness that catches regressions before users do.

Keep exploring

Looking for something else? The full glossary covers 120+ AI terms updated for 2026.

Open the glossary
Chat on WhatsApp