pandas From Zero

Learn the Python data-analysis workhorse: the DataFrame mental model, loading and inspecting data, selecting and filtering, cleaning messy data, transforming with vectorized operations, the split-apply-combine power of groupby, joining datasets, time series, reshaping and pivoting, and plotting. The tool every Python data person reaches for, explained idea-first.

  1. What pandas Is & the DataFrame: The mental model that makes pandas click: a DataFrame is a spreadsheet or SQL table living in memory that you manipulate with code. Meet the Series, the DataFrame, the index, and the column-first habit.
  2. Loading & Inspecting Data: How real analysis starts: read_csv (and read_excel/json/sql/parquet) to load data, then head/info/describe/value_counts to know exactly what you've got before you trust a single number.
  3. Selecting & Filtering: Pull out the columns and rows you actually want: single vs double brackets, loc vs iloc, boolean masks, combining conditions safely, and query().
  4. Cleaning Data: Real data is messy. Find and handle missing values with isna/dropna/fillna, fix wrong types with astype and to_numeric/to_datetime, kill duplicates, rename columns, and scrub strings with the .str accessor.
  5. Transforming Data: Derive new columns the pandas way: vectorized arithmetic, np.where and pd.cut for conditionals, map and replace for translation, apply as the flexible escape hatch, and why looping over rows is the #1 performance mistake.
  6. GroupBy & Aggregation: The split-apply-combine pattern behind almost every analysis: group rows by a key, apply a function to each group, combine the results. Basic groupby, multiple keys, named agg, and transform vs aggregate.
  7. Joining & Combining: Stitch separate tables together: merge is the SQL join in DataFrame clothing (inner/left/right/outer), the key traps that explode your row count, and concat for stacking more of the same data.
  8. Time Series & Dates: Turn date strings into real datetimes with to_datetime, pull date parts with the .dt accessor, slice by partial dates via a DatetimeIndex, and roll daily data up to any period with resample, which is groupby over time.
  9. Reshaping & Pivoting: The same data wears two shapes: long (one row per observation) and wide (categories as columns). Reshape freely with pivot_table, melt, stack/unstack, and crosstab.
  10. Plotting & Where to Go Next: Make quick charts straight off a DataFrame, write your results back out to any format, learn the vectorization and scaling limits honestly, and pick a real project to carry all the way home.
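
For a taste of what lessons 3, 5, and 6 cover, here is a minimal sketch rather than an excerpt from the lessons. The toy data and its column names (region, units, price) are made up for illustration. It shows a vectorized derived column, a boolean mask with two conditions combined safely, and the difference between aggregate and transform in groupby.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "units": [10, 3, 7, 12, 5],
    "price": [2.0, 5.0, 3.0, 1.5, 4.0],
})

# Vectorized arithmetic: one expression over whole columns, no row loop.
df["revenue"] = df["units"] * df["price"]

# Boolean mask: wrap each condition in parentheses and combine with &,
# not the Python keyword `and`.
big_cheap = df[(df["units"] >= 5) & (df["price"] < 4.0)]

# Aggregate: split by region, reduce each group to one row (named agg).
summary = df.groupby("region").agg(
    total_revenue=("revenue", "sum"),
    orders=("units", "count"),
)

# Transform: same grouping, but the result keeps one value per original row.
df["region_total"] = df.groupby("region")["revenue"].transform("sum")

print(big_cheap)
print(summary)
print(df)
```

The aggregate result has one row per region, while the transform result lines up with the original five rows, so it can be stored as a new column.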