Chain-of-Thought (CoT)

Asking the model to reason step by step out loud before giving its final answer.

In common use since 2022

Chain-of-thought (CoT) prompting asks an LLM to reason step by step before giving its final answer. Adding a phrase such as "Let's think step by step" or "Show your reasoning" to a prompt can substantially improve results on multi-step tasks: math word problems, logical reasoning, coding, and complex instructions. Google's 2022 paper that named the technique, which prompted with worked examples rather than a trigger phrase, reported accuracy gains of roughly 20–50 percentage points on math benchmarks, concentrated in the largest models tested.
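As a rough illustration of the zero-shot form, the sketch below appends a step-by-step instruction to a question and pulls the final answer out of the reply. The suffix wording, the "Answer:" line convention, and the helper names `build_cot_prompt` and `extract_answer` are assumptions made for this example, not part of any provider's API.

```python
import re
from typing import Optional

# Assumed wording; any phrasing that asks for reasoning first would do.
COT_SUFFIX = (
    "Let's think step by step, then give the final answer "
    "on its own line as 'Answer: <value>'."
)


def build_cot_prompt(question: str) -> str:
    """Append the step-by-step instruction to a plain question."""
    return f"{question.strip()}\n\n{COT_SUFFIX}"


def extract_answer(response: str) -> Optional[str]:
    """Return the value on the last 'Answer:' line, or None if there is none."""
    matches = re.findall(r"^Answer:\s*(.+?)\s*$", response, flags=re.MULTILINE)
    return matches[-1] if matches else None


if __name__ == "__main__":
    print(build_cot_prompt("What is 17 * 6?"))
    # A hand-written reply standing in for real model output.
    print(extract_answer("17 * 6 = 102\nAnswer: 102"))
```

Asking for a fixed final line keeps parsing simple, since the reasoning text itself can be of any length and shape.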

The intuition is that LLMs generate one token at a time, conditioning each new token on everything before it. When the model writes its reasoning out, those reasoning tokens become part of the context informing the final answer. Without CoT, the model has to "compute" the answer in a single forward pass; with CoT, it gets to use the entire generated text as a scratchpad.
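The scratchpad idea can be illustrated with a toy generation loop. This is only an analogy under stated assumptions: `toy_adder` is a hypothetical stand-in for a model that can do one addition per step, and `generate` is a simplified loop, not how any real inference stack is written.

```python
from typing import Callable, List

STOP = "<eos>"


def generate(next_token: Callable[[List[str]], str],
             prompt: List[str], max_new: int = 50) -> List[str]:
    """Autoregressive loop: each new token sees the prompt plus all prior output."""
    context = list(prompt)
    for _ in range(max_new):
        token = next_token(context)
        if token == STOP:
            break
        context.append(token)  # the output becomes input for the next step
    return context[len(prompt):]


def toy_adder(context: List[str]) -> str:
    """Hypothetical 'model' that performs a single addition per step.

    It reads the last partial sum it wrote (its scratchpad) and adds the
    next number from the prompt, so a long sum is built up across steps.
    """
    split = context.index("=")
    numbers = [int(t) for t in context[:split]]
    written = context[split + 1:]
    if len(written) == len(numbers):
        return STOP
    previous = int(written[-1]) if written else 0
    return str(previous + numbers[len(written)])


if __name__ == "__main__":
    steps = generate(toy_adder, ["3", "4", "5", "="])
    print(steps)      # intermediate sums, ending with the total
    print(steps[-1])  # the final answer is the last token written
```

No single step here adds three numbers; the total only appears because earlier outputs stay in the context.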

As of 2026, CoT has evolved into several distinct patterns:

  • Standard CoT — "Think step by step" appended to the prompt. Simple, almost always helpful.
  • Few-shot CoT — examples in the prompt that show the model how to lay out reasoning, especially useful for domain-specific formats.
  • Self-consistency — sample many CoT responses with high temperature, take the majority answer; reduces variance significantly.
  • Tree of Thoughts (ToT) — let the model explore multiple reasoning paths in a tree structure; expensive but powerful for puzzles and planning.
  • Reasoning models — GPT-5 reasoning, Claude Sonnet 4 extended thinking, DeepSeek R1, Gemini 2.5 Thinking. These models are trained to do CoT internally and produce hidden reasoning tokens you may or may not see in the response.
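Of the patterns above, self-consistency is the simplest to sketch in code: sample several reasoning chains, extract each final answer, and keep the most common one. In the sketch below, `sample` is a hypothetical callable standing in for a model call at nonzero temperature, and the "Answer:" line format is an assumption of this example.

```python
import re
from collections import Counter
from itertools import cycle
from typing import Callable, List, Optional


def extract_answer(response: str) -> Optional[str]:
    """Return the value on the last 'Answer:' line, or None if there is none."""
    matches = re.findall(r"^Answer:\s*(.+?)\s*$", response, flags=re.MULTILINE)
    return matches[-1] if matches else None


def self_consistent_answer(sample: Callable[[str], str],
                           prompt: str, n: int = 5) -> Optional[str]:
    """Draw n responses and return the majority final answer."""
    votes: Counter = Counter()
    for _ in range(n):
        answer = extract_answer(sample(prompt))
        if answer is not None:  # skip replies with no parsable answer
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None


def make_fake_sampler(responses: List[str]) -> Callable[[str], str]:
    """Canned replies in place of a real model, for demonstration only."""
    pool = cycle(responses)
    return lambda prompt: next(pool)


if __name__ == "__main__":
    sampler = make_fake_sampler([
        "2 + 2 = 4, then 4 * 3 = 12\nAnswer: 12",
        "2 + 2 * 3 = 8, roughly 10\nAnswer: 10",
        "First the brackets: 4. Then 4 * 3.\nAnswer: 12",
    ])
    print(self_consistent_answer(sampler, "What is (2 + 2) * 3?", n=3))
```

Each extra sample costs a full response, so `n` trades accuracy against spend.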

For a US developer the practical takeaway is: when accuracy matters on multi-step tasks, default to chain-of-thought. The latency and cost penalty (more output tokens) is real but usually worth it for anything beyond simple lookups. On genuinely simple tasks, CoT can hurt by introducing reasoning noise, so measure on your own eval set.
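A minimal sketch of that measurement might look like the following. The `model` callable, the two templates, and the one-item dataset are all hypothetical, and `fake_model` returns canned text purely so the script runs without a provider; its numbers say nothing about real models.

```python
from typing import Callable, Dict, List, Tuple

DIRECT = "{question}\nReply with the answer only."
COT = "{question}\nLet's think step by step, then end with 'Answer: <value>'."


def evaluate(model: Callable[[str], str], template: str,
             dataset: List[Tuple[str, str]]) -> Dict[str, float]:
    """Score one prompt template: accuracy plus average reply length in words.

    Word count is a crude proxy for output tokens, which drive cost and latency.
    """
    correct = 0
    words = 0
    for question, expected in dataset:
        reply = model(template.format(question=question))
        parts = reply.split()
        words += len(parts)
        final = parts[-1].rstrip(".") if parts else ""
        correct += int(final == expected)
    return {"accuracy": correct / len(dataset), "avg_words": words / len(dataset)}


def fake_model(prompt: str) -> str:
    """Canned stand-in for a real model call (assumption for this demo)."""
    if "step by step" in prompt:
        return "2 plus 2 is 4, times 3 is 12. Answer: 12"
    return "10"


DATA = [("What is (2 + 2) * 3?", "12")]

if __name__ == "__main__":
    for name, template in [("direct", DIRECT), ("cot", COT)]:
        print(name, evaluate(fake_model, template, DATA))
```

Reporting length next to accuracy makes the trade-off visible: a variant that wins on accuracy by a small margin may not justify a reply many times longer.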

The deeper shift in 2026 is that frontier providers now expose CoT as a parameter. Calling GPT-5 with reasoning enabled, or Claude Sonnet 4 with extended thinking, gets you the technique without manually constructing the prompt. The model decides how much to "think" based on task difficulty, and you pay accordingly. Treating reasoning depth as a first-class API parameter is rapidly becoming the new normal.
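To make "reasoning depth as a parameter" concrete, here is a provider-neutral sketch. Everything in it is hypothetical: the `ReasoningRequest` type, the `reasoning_budget_tokens` field, and the budget tiers are invented for illustration, and real providers use their own field names and limits. It also assumes reasoning tokens are billed at the output-token rate, which should be checked against the provider's pricing page.

```python
from dataclasses import dataclass

# Hypothetical effort tiers; real APIs define their own names and limits.
BUDGETS = {"low": 1_000, "medium": 4_000, "high": 16_000}


@dataclass
class ReasoningRequest:
    prompt: str
    reasoning_budget_tokens: int  # hypothetical field name
    max_answer_tokens: int = 500


def build_request(prompt: str, effort: str = "medium") -> ReasoningRequest:
    """Map a coarse effort level to a reasoning-token budget."""
    if effort not in BUDGETS:
        raise ValueError(f"effort must be one of {sorted(BUDGETS)}")
    return ReasoningRequest(prompt=prompt, reasoning_budget_tokens=BUDGETS[effort])


def worst_case_output_cost(request: ReasoningRequest,
                           usd_per_million_output_tokens: float) -> float:
    """Upper bound on output spend, assuming reasoning is billed as output."""
    total = request.reasoning_budget_tokens + request.max_answer_tokens
    return total * usd_per_million_output_tokens / 1_000_000


if __name__ == "__main__":
    request = build_request("Plan a three-step database migration.", effort="high")
    # The price below is a placeholder, not a quote from any provider.
    print(worst_case_output_cost(request, usd_per_million_output_tokens=10.0))
```

Budgeting for the worst case up front is one way to keep "you pay accordingly" from becoming a surprise on the invoice.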
