Updated Jun 22, 2026

PyTorch From Zero

PyTorch is the framework much of modern AI is built in: the models behind many of the research papers, image models, and large language models you've heard of were trained with it. It has a reputation for being deep and mathematical, and the math is real, but the framework itself rests on just three ideas: a tensor (a multi-dimensional array you do math on, fast, on a GPU), autograd (PyTorch automatically computes the derivatives needed to learn), and the training loop (a short, repeating ritual that nudges a model toward being right). Understand those three and the rest is detail.
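To make the three ideas concrete before the phases unpack them, here is a minimal sketch that fits a straight line using nothing but tensors, autograd, and a hand-written loop. The toy data (y = 2x + 1), the learning rate, and the step count are assumptions chosen for illustration, not values from this guide.

```python
import torch

# Idea 1: tensors. Toy data following y = 2x + 1 (invented for this sketch).
x = torch.linspace(-1.0, 1.0, 50).unsqueeze(1)   # shape (50, 1)
y = 2.0 * x + 1.0

# Two learnable parameters; requires_grad=True asks autograd to track them.
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr = 0.1  # learning rate, an arbitrary choice for this toy problem

# Idea 3: the training loop.
for step in range(500):
    pred = x * w + b                   # forward pass
    loss = ((pred - y) ** 2).mean()    # mean squared error
    loss.backward()                    # Idea 2: autograd fills w.grad and b.grad
    with torch.no_grad():              # update the parameters without tracking
        w -= lr * w.grad
        b -= lr * b.grad
    w.grad.zero_()                     # clear gradients before the next step
    b.grad.zero_()

print(w.item(), b.item())              # close to 2 and 1
```

Later phases replace the manual update with an optimizer and the raw parameters with an nn.Module, but the shape of the loop stays the same.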

This guide builds those ideas first, in plain language, then assembles them into a real model you train and run. We connect it to what you may already know: a tensor is NumPy's array with superpowers; "learning" is the gradient descent from How a Model Learns, made concrete; a model is a Python class. By the end you'll have trained a working classifier and understand every line of the loop that did it.
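As a rough preview of two of those connections, the sketch below turns a NumPy array into a tensor and back, then defines a model as an ordinary Python class. The class name TinyNet and its layer sizes are hypothetical, picked only for this example.

```python
import numpy as np
import torch
from torch import nn

# A NumPy array and the tensor built from it hold the same values.
arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
t = torch.from_numpy(arr)      # shares memory with arr on the CPU
back = t.numpy()               # and converts back just as easily


# A model is a Python class: layers in __init__, the computation in forward().
class TinyNet(nn.Module):      # hypothetical name and sizes, for illustration
    def __init__(self, in_features: int = 2, hidden: int = 8, out_features: int = 3):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(torch.relu(self.fc1(x)))


model = TinyNet()
out = model(t)                 # calling the model runs forward()
print(out.shape)               # torch.Size([2, 3])
```

The weights start out random, so the output values are meaningless until the model is trained; only the shapes are worth checking at this stage.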

📝 This teaches the framework, not the math from scratch. It assumes Python (Python From Zero) and is far richer if you've met the concepts in What AI & ML Are and especially How a Model Learns (gradients, loss, training). Helpful too: pandas From Zero for data prep.

⚠️ PyTorch needs a native install (and ideally a GPU), so examples here are shown with their output rather than run on the page — follow along in a notebook or Google Colab (free GPUs).

How to read this

Read in order — it builds from a single tensor up to a trained, saved classifier, one idea per phase. Phases carry difficulty badges; the 🔴 ones (autograd, the loop, performance) are the conceptual core.

The phases

Part 1 — The three core ideas (🟢 → 🔴)

  1. What PyTorch Is & Tensors 🟢 — the tensor: a GPU-ready, autograd-aware array.
  2. Tensor Operations & the GPU 🟢 — math, broadcasting, reshaping, and moving work to the GPU.
  3. Autograd: Automatic Differentiation 🔴 — how PyTorch computes the gradients that make learning possible.

Part 2 — Building & training a model (🟡 → 🔴)

  4. Building Models with nn.Module 🟡 — layers, forward(), and a model as a Python class.
  5. Loss Functions & Optimizers 🟡 — measuring wrongness and the algorithm that fixes it.
  6. The Training Loop 🔴 — forward → loss → backward → step: the ritual that trains everything.
  7. Data: Dataset & DataLoader 🟡 — batching, shuffling, and feeding data efficiently.
  8. Training a Real Classifier 🟡 — putting it all together on a real dataset, with evaluation.

Part 3 — Using & shipping models (🟡 → 🟢)

  9. Saving, Loading & Inference 🟡 — state_dict, eval() mode, no_grad, and running a trained model.
  10. GPUs, Performance & Common Pitfalls 🔴 — devices, speed, and the bugs that bite every beginner.
  11. Where to Go Next 🟢 — pretrained models, transfer learning, the ecosystem, and LLMs.
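For a sense of where Parts 2 and 3 end up, here is a compact sketch of the whole path: a DataLoader feeding batches, the forward → loss → backward → step loop, then a state_dict round trip and inference under eval() and no_grad. The synthetic data, the small architecture, the hyperparameters, and the file name are all assumptions made for this sketch; the guide's own classifier uses a real dataset.

```python
import os
import tempfile

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic two-class data, invented for this sketch: the label is 1
# when the two features sum to a positive number.
X = torch.randn(256, 2)
y = (X.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)


def make_model() -> nn.Module:
    # Hypothetical architecture; any nn.Module would follow the same steps.
    return nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))


model = make_model()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model.train()
for epoch in range(20):
    for xb, yb in loader:
        logits = model(xb)            # forward
        loss = loss_fn(logits, yb)    # loss
        optimizer.zero_grad()         # clear gradients from the previous batch
        loss.backward()               # backward
        optimizer.step()              # step

# Save only the learned weights, then load them into a fresh copy of the model.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "classifier.pt")   # hypothetical file name
    torch.save(model.state_dict(), path)
    restored = make_model()
    restored.load_state_dict(torch.load(path))

restored.eval()                       # inference behaviour for layers such as dropout
with torch.no_grad():                 # no gradient tracking during inference
    predictions = restored(X).argmax(dim=1)
accuracy = (predictions == y).float().mean().item()
print(f"accuracy on the training data: {accuracy:.2f}")
```

Accuracy here is measured on the same data the model was trained on, which is fine for a smoke test but not a real evaluation; a held-out split is the honest measure.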

Tensors, autograd, the loop. Everything in deep learning — from a 3-line model to a giant LLM — is those three ideas at scale. This guide makes them yours.


Phase 1: What PyTorch Is & Tensors →