Fundamentals

Transfer Learning

Adapting a model trained on one task to a related task with much less data and compute.

In common use since 1995

Transfer learning is the practice of taking a model trained on one task — usually a large, general one — and adapting it to a more specific task with a fraction of the data and compute. It is the reason individual developers can build production-grade AI in 2026: instead of training a model from scratch, you stand on top of one trained by a frontier lab.

The intuition is simple. A model that has seen billions of images has learned generic visual features (edges, textures, object parts) that are useful for almost any vision task. A model that has read trillions of tokens has learned grammar, world knowledge and reasoning patterns useful for almost any language task. Bolt a small task-specific head onto the pretrained body, train on your data, and you inherit most of that prior learning.
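To make the "frozen body, small trainable head" idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a real pretrained model: `frozen_body` is a hypothetical stand-in for a pretrained feature extractor, the data is a made-up toy set, and the head is a simple logistic regression trained by gradient descent.

```python
import math

def frozen_body(x):
    """Hypothetical stand-in for a pretrained feature extractor.
    It has no trainable state: nothing in here is ever updated."""
    return [x[0] + x[1], x[0] - x[1]]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(examples, epochs=200, lr=0.1):
    """Fit a logistic-regression head on top of the frozen features."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            f = frozen_body(x)                      # features from the frozen body
            p = sigmoid(w[0] * f[0] + w[1] * f[1] + b)
            g = p - y                               # gradient of log loss w.r.t. the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g                             # only the head (w, b) is updated
    return w, b

def predict(w, b, x):
    f = frozen_body(x)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z >= 0.0 else 0

# Toy task (assumed for illustration): label is 1 when the first value is larger.
examples = [
    ([1.0, 0.0], 1), ([0.0, 1.0], 0),
    ([2.0, 0.0], 1), ([0.0, 2.0], 0),
    ([3.0, 1.0], 1), ([1.0, 3.0], 0),
]
w, b = train_head(examples)
```

In a real project the body would be a pretrained network and the head a small layer trained with a deep learning framework, but the division of labour is the same: the body supplies features, and only the head learns from your data.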

Transfer learning shows up in several flavours:

  • Feature extraction: freeze the pretrained model entirely and train a small classifier on top of its outputs. Cheapest option, often surprisingly good.
  • Full fine-tuning: unfreeze the whole network and continue training on your data with a small learning rate. Most expensive, but usually the highest quality when you have enough data.
  • Parameter-efficient fine-tuning (PEFT): train only a tiny adapter (LoRA, QLoRA) while leaving the base frozen. Dominant approach for LLMs in 2026 because it is cheap, fast and easy to swap.
  • Prompt-based transfer: no training at all; just describe the task in the prompt and let the model generalise. Works astonishingly well for capable LLMs.
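To see why adapters are so cheap, it helps to count parameters. The sketch below follows the standard LoRA formulation, in which a frozen weight matrix W gets a low-rank update scaled by alpha / rank. The layer size (4096) and rank (8) are assumptions chosen for illustration, not figures from any particular model, and the helper names are hypothetical.

```python
def full_params(d_in, d_out):
    """Trainable weights when the whole matrix is fine-tuned."""
    return d_in * d_out

def lora_trainable_params(d_in, d_out, rank):
    """Trainable weights in a LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)

def matvec(M, v):
    """Plain-Python matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha, rank):
    """Output of a LoRA-adapted layer: W x + (alpha / rank) * B (A x)."""
    base = matvec(W, x)                 # frozen pretrained path
    update = matvec(B, matvec(A, x))    # trainable low-rank path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

d, r = 4096, 8                              # assumed sizes, for illustration only
full = full_params(d, d)                    # 16,777,216 weights
adapter = lora_trainable_params(d, d, r)    # 65,536 weights
fraction = adapter / full                   # 1/256 of the full matrix

# With B initialised to zeros, the adapted layer starts out identical to the base.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
B_zero = [[0.0], [0.0]]
unchanged = lora_forward(W, A, B_zero, [1.0, 2.0], alpha=2, rank=1)
```

Under these assumptions the adapter trains about 0.4% of the weights in that one matrix, which is why adapters are cheap to train and small enough to swap per task.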

For a US business team adopting AI, transfer learning is what makes the unit economics work. Fine-tuning a 7B-parameter Llama on 10,000 of your support tickets costs roughly a few hundred dollars on rented GPUs; training that model from scratch would cost millions. Once you have curated good data, the fine-tuned model is often within a few percentage points of a frontier model on your specific task, at a fraction of the inference cost.

Three risks are worth flagging. Domain shift can be subtle: a model fine-tuned on one customer's data may not transfer to another. Fine-tuning on copyrighted or PII-laden data carries real legal exposure. And a model that was well tuned six months ago may need refreshing as your data drifts and newer base models are released.
