Longitudinal causal inference

When treatment unfolds over time

A lot of real questions are not single decisions. A patient is seen and treated, returns, is seen again, and is treated again. By visit two, what we observe is partly the consequence of visit one's treatment, so the same variable is at once a confounder for the next decision and a mediator of the previous one. Cross-sectional tools quietly miss this.

The structural shift

Same outcome, different graph. Conditioning on the time-varying covariate is not optional in the longitudinal world, and adjusting for it the wrong way blocks the very effect you're trying to measure.

Cross-sectional

One decision, one outcome.

[Diagram: nodes W, A, Y]

Longitudinal

L₂ is post-treatment for A₁ and pre-treatment for A₂.

[Diagram: nodes L₁, A₁, L₂, A₂, Y]
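
One way to see what "adjusting the right way" means here is the g-computation formula for a two-visit plan. This is a sketch under assumptions the page does not spell out: consistency, positivity, sequential exchangeability given the observed history, and discrete covariates (replace the sums with integrals otherwise).

```latex
\mathbb{E}\left[Y^{a_1,a_2}\right]
  = \sum_{l_1}\sum_{l_2}
    \mathbb{E}\left[Y \mid L_1 = l_1,\ A_1 = a_1,\ L_2 = l_2,\ A_2 = a_2\right]
    \, P\left(L_2 = l_2 \mid L_1 = l_1,\ A_1 = a_1\right)
    \, P\left(L_1 = l_1\right)
```

L₂ is conditioned on inside the outcome regression, where it acts as a confounder for A₂, and then averaged over its distribution given A₁, so the part of A₁'s effect that flows through L₂ is kept rather than blocked.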

Beyond a single treatment

The estimand grows with the decision sequence.

Cross-sectional ATE

One treatment, one counterfactual contrast.
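
The card's math did not survive in the text, so here is the standard form of this estimand, assuming a binary treatment A and the potential-outcome notation Y^a (the symbol ψ is only a label chosen for this sketch):

```latex
\psi_{\mathrm{ATE}} = \mathbb{E}\left[Y^{a=1}\right] - \mathbb{E}\left[Y^{a=0}\right]
```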

Longitudinal: static regime

Apply the same plan at every visit, then contrast across plans.
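
A common way to write this estimand for two visits, assuming binary treatments and writing a plan as a fixed vector of treatment values (ψ and the bar notation are labels for this sketch, not taken from the page):

```latex
\psi_{\mathrm{static}}
  = \mathbb{E}\left[Y^{\bar a}\right] - \mathbb{E}\left[Y^{\bar a'}\right],
  \qquad \bar a = (a_1, a_2), \quad \bar a' = (a_1', a_2')
```

For example, "always treat" versus "never treat" is the contrast between the plans (1, 1) and (0, 0).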

Longitudinal: dynamic regime

The next treatment depends on what's been seen so far. The history travels with you.
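
In the usual notation, a dynamic regime is a pair of decision rules that map the history so far to a treatment. The sketch below assumes two visits and deterministic rules; d₁, d₂ and ψ are illustrative labels.

```latex
d = (d_1, d_2), \qquad
A_1 = d_1(L_1), \qquad
A_2 = d_2(L_1, A_1, L_2), \qquad
\psi_{d} = \mathbb{E}\left[Y^{d}\right]
```

A static plan is the special case where each rule ignores its arguments and returns a constant.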

A sandbox to poke at

A two-stage longitudinal DGP, intentionally gentle. The propensity barely depends on the covariates, so the naive (regime-matched) mean is close to the truth here. With stronger time-varying confounding the gap widens, sometimes enough to flip the sign of a naive contrast. The point of the sandbox is the workflow, not the bias.
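
To make the "gentle versus strong" point concrete, here is a minimal Python sketch of a two-stage DGP. Everything in it is an assumption made for illustration: the coefficients, the `strength` knob and the function names are hypothetical and are not the kit's DGP. It compares the regime-matched mean in observational data with the counterfactual mean obtained by forcing the regime on every unit.

```python
import math
import random


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def draw_unit(rng, strength, regime=None):
    """Draw (L1, A1, L2, A2, Y) for one unit of a hypothetical DGP.

    strength scales how much each propensity depends on the covariates.
    regime=None lets treatment follow the propensities (observational data);
    regime=(a1, a2) overrides both treatments (counterfactual data).
    """
    l1 = rng.gauss(0.0, 1.0)
    a1 = int(rng.random() < expit(strength * l1))
    if regime is not None:
        a1 = regime[0]
    # L2 is affected by A1 (mediator) and drives A2 (confounder).
    l2 = 0.5 * l1 + 0.5 * a1 + rng.gauss(0.0, 1.0)
    a2 = int(rng.random() < expit(strength * l2))
    if regime is not None:
        a2 = regime[1]
    y = int(rng.random() < expit(-0.5 + 0.3 * a1 + 0.3 * a2 + 0.8 * l2))
    return l1, a1, l2, a2, y


def counterfactual_mean(strength, regime, n, seed):
    """Mean of Y when every unit is forced to follow the regime."""
    rng = random.Random(seed)
    return sum(draw_unit(rng, strength, regime)[4] for _ in range(n)) / n


def naive_mean(strength, regime, n, seed):
    """Mean of Y among units whose observed treatments match the regime."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        _, a1, _, a2, y = draw_unit(rng, strength)
        if (a1, a2) == tuple(regime):
            ys.append(y)
    return sum(ys) / len(ys)


def naive_gap(strength, regime=(1, 1), n=20000, seed=123):
    """Naive regime-matched mean minus a large-sample counterfactual mean."""
    oracle = counterfactual_mean(strength, regime, n=100000, seed=seed + 1)
    return naive_mean(strength, regime, n, seed) - oracle


gentle_gap = naive_gap(strength=0.1)  # propensities barely use the covariates
strong_gap = naive_gap(strength=1.0)  # propensities lean hard on L1 and L2
```

In this toy setup the counterfactual mean does not depend on `strength`, because forcing the regime cuts the link from covariates to treatment. Only the naive mean moves, which is why the gap grows as the propensities lean harder on L1 and L2.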

LTMLE doodle, live

A two-stage longitudinal DGP with time-varying confounding. Pick a regime, redraw the sample, and watch the bars settle around the oracle truth.

Sample size n: (interactive control), seed = 123
Oracle truth: 0.644
Counterfactual sample mean: 0.642
Naive (regime-matched mean): 0.679
LTMLE (snapshot): 0.661 [0.608, 0.715]

Oracle and naive recompute on each redraw. The LTMLE point estimate is a snapshot from the kit's Rmd output (SL.glm, seed = 123, n = 1,000); download the kit to run LTMLE on your own data.
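
A quick read of the displayed numbers, as a small Python check. The values are copied from the readout; the implied standard error rests on the assumption that the bracketed interval is a symmetric 95% Wald interval, which the page does not state.

```python
# Values copied from the sandbox readout.
oracle = 0.644
naive = 0.679
ltmle, lo, hi = 0.661, 0.608, 0.715

naive_gap = round(naive - oracle, 3)  # distance of the naive mean from the truth
ltmle_gap = round(ltmle - oracle, 3)  # distance of the LTMLE snapshot from the truth

covers_oracle = lo <= oracle <= hi
covers_naive = lo <= naive <= hi

# Assumption: symmetric 95% Wald interval, so half-width = 1.96 * SE.
implied_se = (hi - lo) / 2 / 1.96
```

The naive mean sits 0.035 above the oracle and the LTMLE snapshot 0.017 above it, and the interval covers both the oracle and the naive value. That fits the framing of this sandbox: the workflow is the point, and the bias here is small.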

Take the kit

The same code that produced the snapshot above. Drop in your own dataset and re-knit.

Download kit (zip)
  • ltmle-doodle.csv

    Synthetic dataset, 1000 rows, columns L1, A1, L2, A2, Y

  • ltmle-doodle-dictionary.csv

    Six-column data dictionary

  • ltmle-doodle.Rmd

    Runnable Rmd: naive contrasts, LTMLE for three regimes, oracle truth

  • ltmle-doodle.html

    Pre-knitted HTML preview (embedded below)
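
For a sense of what the "naive" number is, here is a minimal Python sketch of a regime-matched mean over a CSV with the columns listed above. It assumes A1 and A2 are coded 0/1 and Y is numeric; the function name and the sample rows are made up for illustration and are not taken from the kit, whose own code is in the Rmd.

```python
import csv
import io


def regime_matched_mean(csv_file, a1, a2):
    """Mean of Y over rows whose observed (A1, A2) equals the regime (a1, a2)."""
    total, count = 0.0, 0
    for row in csv.DictReader(csv_file):
        # int(float(...)) tolerates both "1" and "1.0" encodings.
        if int(float(row["A1"])) == a1 and int(float(row["A2"])) == a2:
            total += float(row["Y"])
            count += 1
    if count == 0:
        raise ValueError("no rows follow this regime")
    return total / count


# Made-up rows in the kit's column layout, only to exercise the function.
toy = io.StringIO(
    "L1,A1,L2,A2,Y\n"
    "0.2,1,0.9,1,1\n"
    "-0.4,1,0.1,1,0\n"
    "1.1,0,0.3,1,1\n"
    "0.0,0,-0.7,0,0\n"
)
always_treat = regime_matched_mean(toy, a1=1, a2=1)
```

On the real file the same call would take an open file handle, for example `open("ltmle-doodle.csv", newline="")`. This mean ignores L1 and L2 entirely, which is exactly why it can drift from the truth when treatment depends on them.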

ltmle-doodle.html (pre-knitted preview)

Coming next

Worked example, on a real question

We're going to swap the toy DGP out for a use case we're actively working on, with real time-varying confounding and a clean naive-vs-LTMLE contrast. This space is held for it.