Fine-Tuning vs Prompting, Honestly

At some point, usually after a demo goes well and someone with a budget gets excited, the question lands on your desk: "Should we fine-tune our own model?" It sounds like the serious, grown-up answer. It sounds like what real AI teams do. And sometimes it is exactly the wrong move: one that costs a quarter and ships nothing.

Most teams that reach for fine-tuning didn't need it. They needed a better prompt, or they needed to feed the model the right documents at request time. Fine-tuning is a real tool with a real job, but it's the most expensive way to steer a model, it creates the most lock-in, and it's the one people reach for first for the wrong reasons.

This guide gives you the mental model to tell the three approaches apart, an honest look at what fine-tuning actually costs, and a decision order you can defend in a meeting. It is the capstone of the AI/ML track: it assumes you've met prompting, RAG (retrieval-augmented generation), and calling an LLM API in the sibling guides, and it pulls them together into one decision.

How to read this

Need to decide right now? Jump to Phase 3: Choosing — the Honest Order and use the decision table at the top.
Want it to finally make sense? Read in order — each phase builds on the last. Phase 1 gives you the three-way mental model, Phase 2 shows what fine-tuning really involves, and Phase 3 turns it into a decision.

The phases

Phase 1: Three Ways to Steer a Model. The mental model: prompting changes the instructions, RAG changes the knowledge, fine-tuning changes the behavior. The distinction that the whole decision rests on.
Phase 2: What Fine-Tuning Actually Involves. The dataset (where the real cost lives), the training run, hosting your tuned model, the lighter LoRA approach, and how you'd know if it worked.
Phase 3: Choosing — the Honest Order. Try prompt → RAG → fine-tune, in that order, because each step costs more and locks you in more. A decision table, and the two traps that catch everyone.
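
If it helps to see the order as logic, here is a minimal Python sketch of "try prompt, then RAG, then fine-tune, and stop at the first one that is good enough." It is an illustration, not the decision table from Phase 3. The names STEPS, choose, and passes are hypothetical, and passes stands in for whatever evaluation you would actually run on your own task.

```python
from typing import Callable, Optional

# The three approaches in the order this guide recommends trying them,
# each paired with what it changes about the model's output.
STEPS = [
    ("prompt", "instructions"),
    ("RAG", "knowledge"),
    ("fine-tune", "behavior"),
]


def choose(passes: Callable[[str], bool]) -> Optional[str]:
    """Return the first approach, in order, that the evaluation accepts.

    `passes` is a hypothetical callback: given an approach name, it
    reports whether that approach met your quality bar.
    Returns None if nothing was good enough.
    """
    for name, _lever in STEPS:
        if passes(name):
            return name
    return None


# Made-up example: prompting alone fell short, but RAG was good enough,
# so fine-tuning is never reached.
good_enough = {"RAG", "fine-tune"}
print(choose(lambda name: name in good_enough))  # prints: RAG
```

The loop stops early on purpose: a later step is only considered once an earlier, cheaper one has been tried and has failed.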

Deeper material — building a training pipeline, distillation, RLHF, and serving infrastructure at scale — is deliberately out of scope here. This guide is about the decision, not the implementation. Once you've honestly decided fine-tuning is right, your model provider's tuning docs are your next stop.

Related guides: Prompt Engineering, Honestly · RAG, Explained · Using an LLM API