PyTorch From Zero
PyTorch is the framework most modern AI is actually built in — the research papers, the image models, and the large language models you've heard of were, in huge part, trained with it. It has a reputation for being deep and mathematical, and the math is real, but the framework itself rests on just three ideas: a tensor (a multi-dimensional array you do math on, fast, on a GPU), autograd (PyTorch automatically computes the derivatives needed to learn), and the training loop (a short, repeating ritual that nudges a model toward being right). Understand those three and the rest is detail.
This guide builds those ideas first, in plain language, then assembles them into a real model you train and run. We connect it to what you may already know: a tensor is NumPy's array with superpowers; "learning" is the gradient descent from How a Model Learns, made concrete; a model is a Python class. By the end you'll have trained a working classifier and understand every line of the loop that did it.
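As a first taste of those two connections, here is a minimal sketch. The class name `TinyModel` and the layer sizes are illustrative assumptions, not something the later phases depend on.

```python
import numpy as np
import torch
from torch import nn

# A tensor can be built directly from a NumPy array.
# On the CPU, torch.from_numpy shares memory with the source array.
array = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
tensor = torch.from_numpy(array)
tensor_shape = tuple(tensor.shape)


# A model is an ordinary Python class that subclasses nn.Module.
# "TinyModel" is a hypothetical name chosen for this sketch.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)  # 2 input features -> 1 output

    def forward(self, x):
        return self.linear(x)


model = TinyModel()
output = model(tensor)  # one output value per input row
output_shape = tuple(output.shape)
```

Calling `model(tensor)` runs `forward()` for you, which is why a model can be used like a function once it is built.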
📝 This teaches the framework, not the math from scratch. It assumes Python (Python From Zero) and is far richer if you've met the concepts in What AI & ML Are and especially How a Model Learns (gradients, loss, training). Helpful too: pandas From Zero for data prep.
⚠️ PyTorch needs a native install and runs best with a GPU, so examples here are shown with their output rather than run on the page. Follow along in a local notebook or in Google Colab, which offers free GPU access.
How to read this
Read in order — it builds from a single tensor up to a trained, saved classifier, one idea per phase. Phases carry difficulty badges; the 🔴 ones (autograd, the loop, performance) are the conceptual core.
The phases
Part 1 — The three core ideas (🟢 → 🔴)
1. What PyTorch Is & Tensors 🟢 — the tensor: a GPU-ready, autograd-aware array.
2. Tensor Operations & the GPU 🟢 — math, broadcasting, reshaping, and moving work to the GPU.
3. Autograd: Automatic Differentiation 🔴 — how PyTorch computes the gradients that make learning possible.
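The three phases above can be previewed in a few lines. This is a rough sketch with made-up values, not the examples the phases themselves use; it falls back to the CPU when no GPU is present.

```python
import torch

# Pick a GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Broadcasting: the length-3 row is applied to each row of the (2, 3) matrix.
matrix = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], device=device)
row = torch.tensor([10.0, 20.0, 30.0], device=device)
shifted = matrix + row       # shape (2, 3)
flat = shifted.reshape(-1)   # reshaped to a flat vector of 6 values

# Autograd: record operations on w, then ask for d(loss)/dw.
w = torch.tensor(3.0, requires_grad=True)
loss = (w * 2.0 - 4.0) ** 2  # (2w - 4)^2
loss.backward()
# By hand: d/dw (2w - 4)^2 = 4 * (2w - 4), which is 8 at w = 3.
grad_value = w.grad.item()
```

The gradient stored in `w.grad` matches the derivative worked out by hand in the comment, which is the whole point of autograd: you write the forward math and PyTorch supplies the derivatives.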
Part 2 — Building & training a model (🟡 → 🔴)
4. Building Models with nn.Module 🟡 — layers, forward(), and a model as a Python class.
5. Loss Functions & Optimizers 🟡 — measuring wrongness and the algorithm that fixes it.
6. The Training Loop 🔴 — forward → loss → backward → step: the ritual that trains everything.
7. Data: Dataset & DataLoader 🟡 — batching, shuffling, and feeding data efficiently.
8. Training a Real Classifier 🟡 — putting it all together on a real dataset, with evaluation.
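To make the forward → loss → backward → step ritual from phase 6 concrete, here is a minimal sketch on synthetic data. The dataset (points on the line y = 2x + 1), the learning rate, and the epoch count are assumptions picked for illustration; the real classifier in phase 8 uses a real dataset.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # make the random initialisation repeatable

# Synthetic data (an assumption for this sketch): points on y = 2x + 1.
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)
y = 2.0 * x + 1.0
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

model = nn.Linear(1, 1)                                  # the model
loss_fn = nn.MSELoss()                                   # measures wrongness
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # fixes it


def mean_loss():
    """Loss over the full dataset, computed without tracking gradients."""
    with torch.no_grad():
        return loss_fn(model(x), y).item()


initial_loss = mean_loss()

for epoch in range(50):
    for batch_x, batch_y in loader:
        optimizer.zero_grad()                # clear gradients from the last step
        prediction = model(batch_x)          # forward
        loss = loss_fn(prediction, batch_y)  # loss
        loss.backward()                      # backward: compute gradients
        optimizer.step()                     # step: update the weights

final_loss = mean_loss()
```

Comparing `initial_loss` with `final_loss` is a quick sanity check that the loop is actually learning. Swapping in a bigger model or a different dataset leaves the four inner lines unchanged.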
Part 3 — Using & shipping models (🟡 → 🟢)
9. Saving, Loading & Inference 🟡 — state_dict, eval() mode, no_grad, and running a trained model.
10. GPUs, Performance & Common Pitfalls 🔴 — devices, speed, and the bugs that bite every beginner.
11. Where to Go Next 🟢 — pretrained models, transfer learning, the ecosystem, and LLMs.
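The save, load, and inference steps named in phase 9 look roughly like this. It is a hedged sketch: the file name `model.pt`, the temporary directory, and the untrained `nn.Linear` stand-in are assumptions for illustration.

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for a trained model (untrained here, purely for illustration).
model = nn.Linear(4, 2)

# Save only the learned parameters (the state_dict), not the whole object.
path = os.path.join(tempfile.mkdtemp(), "model.pt")  # hypothetical file name
torch.save(model.state_dict(), path)

# To load, rebuild the same architecture and copy the parameters in.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch layers such as dropout to inference behaviour

sample = torch.ones(1, 4)
with torch.no_grad():  # no gradient tracking needed for inference
    original_out = model(sample)
    restored_out = restored(sample)
    predicted_class = restored_out.argmax(dim=1).item()
```

Because only parameters are stored, the loading code has to construct a model with the same layer shapes before calling `load_state_dict`.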
Tensors, autograd, the loop. Everything in deep learning — from a 3-line model to a giant LLM — is those three ideas at scale. This guide makes them yours.