Optimisation & Sampling
This week shifts focus from what the model is to how you compute with it. We start with linear models — a class far broader than “straight lines” — that can be solved exactly with linear algebra. Then we address the general case: optimisation algorithms that find the best-fit parameters, and Markov chain Monte Carlo methods that map out the full posterior distribution.
Linear Models
A “linear model” does not mean a straight line. It means the model is linear in its parameters — and that single condition unlocks exact solutions, error propagation, and fast computation. We cover polynomial, trigonometric, and radial basis function expansions, regularisation as a Bayesian prior, convexity, and what happens when you have more parameters than data.
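The "linear in the parameters" condition means the fit reduces to linear algebra. A minimal sketch in NumPy, using a cubic polynomial basis on made-up data (the data set and the regularisation strength are illustrative choices, not prescriptions):

```python
import numpy as np

# Hypothetical data: noisy samples of a smooth function.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(x.size)

# A cubic polynomial is linear in its coefficients theta:
#   model(x) = theta_0 + theta_1 x + theta_2 x^2 + theta_3 x^3
# so the fit is an exact linear-algebra problem, not an iterative search.
A = np.vander(x, N=4, increasing=True)      # design matrix: columns 1, x, x^2, x^3
theta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Ridge (L2) regularisation acts like a Gaussian prior on theta:
# solve (A^T A + lam I) theta = A^T y instead of the plain normal equations.
lam = 1e-2                                   # illustrative prior strength
theta_ridge = np.linalg.solve(A.T @ A + lam * np.eye(4), A.T @ y)
```

Swapping `np.vander` for a matrix of trigonometric or radial basis function values changes the model while leaving the solve untouched; that is the sense in which the model class is far broader than straight lines.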
Optimisation
Given an objective function — a log-likelihood, a log-posterior, a chi-squared — how do you find the parameter values that extremise it? We survey gradient descent, conjugate gradients, Newton and quasi-Newton methods, and the practical curses (dimensionality, scaling, non-convexity, initialisation) that make optimisation harder than it sounds.
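As a concrete instance of the simplest method in that survey, here is a gradient-descent sketch on a Gaussian negative log-likelihood; the data set, step size, and iteration count are illustrative assumptions:

```python
import numpy as np

# Illustrative data drawn from N(mu=2.0, sigma=0.5).
rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=0.5, size=500)

def gradient(mu, log_sigma):
    # Analytic gradient of the mean negative log-likelihood of N(mu, sigma^2).
    # Parametrising by log(sigma) keeps sigma positive without constraints --
    # one small answer to the scaling problems mentioned above.
    r2 = np.mean((data - mu) ** 2)
    d_mu = -np.mean(data - mu) * np.exp(-2 * log_sigma)
    d_ls = -r2 * np.exp(-2 * log_sigma) + 1.0
    return np.array([d_mu, d_ls])

theta = np.array([0.0, 0.0])   # initial guess (mu, log_sigma); initialisation matters
step = 0.1                     # fixed learning rate, chosen by hand
for _ in range(1000):
    theta -= step * gradient(*theta)

mu_hat, sigma_hat = theta[0], np.exp(theta[1])
```

A fixed step size only works here because this two-parameter problem is well scaled; the methods named above (conjugate gradients, Newton, quasi-Newton) exist precisely because this naive loop fails on badly conditioned or high-dimensional objectives.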
Markov Chain Monte Carlo
A point estimate is not enough: you need the full posterior distribution. MCMC draws samples from the posterior by constructing a random walk that, in the long run, visits each region of parameter space in proportion to its probability. We cover the Metropolis algorithm, affine-invariant ensemble sampling with emcee, Hamiltonian Monte Carlo, and the diagnostics you need to trust your chains.
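The Metropolis step is short enough to write out in full. A minimal sketch, using a two-dimensional standard Gaussian as a stand-in target (the target, proposal scale, and chain length are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def log_post(theta):
    # Unnormalised log-posterior: a 2-D standard Gaussian stand-in.
    # Metropolis only ever needs ratios, so the normalisation can be dropped.
    return -0.5 * np.sum(theta ** 2)

n_steps, scale = 20000, 1.0
chain = np.empty((n_steps, 2))
theta = np.zeros(2)
lp = log_post(theta)
accepted = 0
for i in range(n_steps):
    proposal = theta + scale * rng.standard_normal(2)  # symmetric proposal
    lp_new = log_post(proposal)
    # Accept with probability min(1, posterior ratio); otherwise repeat theta.
    if np.log(rng.uniform()) < lp_new - lp:
        theta, lp = proposal, lp_new
        accepted += 1
    chain[i] = theta

samples = chain[1000:]            # discard burn-in before using the chain
acceptance_rate = accepted / n_steps
```

In the long run the histogram of `samples` traces the posterior, which is exactly the "visits each region in proportion to its probability" property above. Tracking the acceptance rate is the simplest of the diagnostics the lectures cover: near 0 or near 1 usually means the proposal scale is badly tuned.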