Model Selection Rationale — Atlanta MSA 10-Year Population Forecast

Why the cohort-component model was chosen over component extrapolation and ARIMA

Population forecasting objective
Atlanta–Sandy Springs–Roswell, GA MSA · Horizon: 2025–2035
Three candidate models evaluated
Select best approach based on accuracy, interpretability, and demographic validity
Component Extrapolation
P_{t+1} = P_t (1 + g)
Method
Convert counts to rates:
b_t = B_t / P_t, d_t = D_t / P_t, m_t = M_t / P_t

g = b − d + m_int + m_dom + r

σ_g = √(σ_b² + σ_d² + σ_m²)
Uncertainty: σ_t = σ_g·√t
Rolling 10-year window
μ_r = rolling mean of each rate
σ_r = rolling std of each rate
Reduces noise, captures recent trend
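The rate conversion and rolling window can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the slide: the function name `rolling_growth`, the input layout, and the toy numbers are assumptions. Migration is collapsed into a single net rate m, and the residual term r is left out.

```python
import math
import statistics

def rolling_growth(pop, births, deaths, net_mig, window=10):
    """Return (g, sigma_g) from the last `window` years of annual counts."""
    # Convert counts to rates: b_t = B_t/P_t, d_t = D_t/P_t, m_t = M_t/P_t
    b = [B / P for B, P in zip(births, pop)][-window:]
    d = [D / P for D, P in zip(deaths, pop)][-window:]
    m = [M / P for M, P in zip(net_mig, pop)][-window:]
    # Rolling mean of each rate gives g = b - d + m (single net migration rate assumed)
    g = statistics.mean(b) - statistics.mean(d) + statistics.mean(m)
    # Rolling std of each rate, combined as sqrt(sum of variances)
    variances = [statistics.stdev(r) ** 2 for r in (b, d, m)]
    sigma_g = math.sqrt(sum(variances))
    return g, sigma_g

# Toy data (hypothetical, not Atlanta figures)
pop = [1000.0 * 1.01 ** t for t in range(12)]
births = [0.013 * p for p in pop]
deaths = [0.007 * p for p in pop]
net_mig = [(0.004 + 0.001 * (-1) ** t) * p for t, p in enumerate(pop)]
g, sigma_g = rolling_growth(pop, births, deaths, net_mig)
```

Summing the variances treats the three components as uncorrelated, which is what the σ formula on the slide implies.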
Result
Stable, smooth projection
CI bands widen as σ_g·√t
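A short sketch of how the scalar growth rate and the widening band could be combined. The slide does not say how the band is applied to the population path, so the relative ±z·σ band used here, the multiplier z = 1.96, the function name `project_with_band`, and all input values are assumptions.

```python
import math

def project_with_band(p0, g, sigma_g, horizon, z=1.96):
    """Return a list of (t, low, mid, high) for t = 1..horizon."""
    rows = []
    for t in range(1, horizon + 1):
        mid = p0 * (1 + g) ** t            # P_{t+1} = P_t (1 + g), applied t times
        sigma_t = sigma_g * math.sqrt(t)   # sigma_t = sigma_g * sqrt(t)
        # Assumption: the band is a relative +/- z * sigma_t around the central path
        rows.append((t, mid * (1 - z * sigma_t), mid, mid * (1 + z * sigma_t)))
    return rows

# Hypothetical inputs, not the fitted Atlanta values
rows = project_with_band(p0=6_000_000, g=0.012, sigma_g=0.004, horizon=10)
```

Every row is driven by the same g, which is the limitation this panel goes on to describe.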
Critical limitation
Single scalar g drives all projections
No age structure — cannot capture:
· Cohort aging dynamics
· Age-specific mortality m_x
· Age-specific fertility f_a
Verdict: Rejected
Mechanically driven; no demographic meaning
ARIMA(1,1,0) — Log Population
log(P_t) = log(P_0) + g·t
Method
Log-transform to linearise exponential growth

ADF test (original): p = 0.994 → non-stationary
First difference: Y'_t = Y_t − Y_{t−1}
ADF test (differenced): p ≈ 3.7×10⁻⁷ → stationary

Fit AR(1) on differenced log-pop:
Y'_t = φ·Y'_{t−1} + ε_t, φ̂ = 0.782
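The log-transform, differencing, and AR(1) fit can be sketched with the standard library alone. This is an illustration under stated assumptions, not the fitting procedure actually used: φ is estimated by least squares without an intercept, matching the equation as written, and the function name and toy series are hypothetical. The ADF tests are not reproduced here; in practice they would come from a statistics package such as statsmodels (`adfuller`).

```python
import math

def fit_ar1_on_log_diff(pop):
    """Estimate phi in Y'_t = phi * Y'_{t-1} + eps_t by least squares."""
    y = [math.log(p) for p in pop]                     # log-transform
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]   # first difference
    num = sum(dy[i] * dy[i - 1] for i in range(1, len(dy)))
    den = sum(dy[i - 1] ** 2 for i in range(1, len(dy)))
    phi = num / den
    resid = [dy[i] - phi * dy[i - 1] for i in range(1, len(dy))]
    # Residual std with one degree of freedom used for phi
    sigma = math.sqrt(sum(e * e for e in resid) / (len(resid) - 1))
    return phi, sigma, dy

# Toy series whose log-growth halves every year (hypothetical data)
growth = [0.04 * 0.5 ** k for k in range(8)]
pop = [1000.0]
for gk in growth:
    pop.append(pop[-1] * math.exp(gk))
phi, sigma, dy = fit_ar1_on_log_diff(pop)
```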
Why ARIMA failed — 4 reasons
1. Sample size: only n = 26 observations
CI = exp(log P̂ ± 1.96·σ·√t) widens rapidly over the 10-year horizon

2. Stationarity assumption violated:
AR structure assumes a stable, linear autocorrelation;
the 2008–12 recession suppressed migration by ~70%, and
the post-pandemic rebound created a structural break

3. Homoskedastic errors assumed:
Migration variance is not constant over time

4. Scalar series — no demographic insight:
Treats population as one time series;
no age, fertility, or mortality information
Verdict: Rejected
Unstable CI; assumptions violated by structural breaks
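The interval formula from reason 1 can be evaluated directly to see how the band grows with the horizon. The values of σ and the point forecast below are hypothetical, chosen only to show the shape of the calculation.

```python
import math

def interval(log_p_hat, sigma, t, z=1.96):
    """Return (lower, upper) = exp(log P_hat -/+ z * sigma * sqrt(t))."""
    half = z * sigma * math.sqrt(t)
    return math.exp(log_p_hat - half), math.exp(log_p_hat + half)

sigma = 0.02                      # hypothetical residual std of differenced log-pop
log_p_hat = math.log(6_300_000)   # hypothetical point forecast
for t in (1, 5, 10):
    lower, upper = interval(log_p_hat, sigma, t)
    print(t, round(upper / lower, 3))   # ratio of upper to lower bound
```

Because the band is symmetric on the log scale, the ratio of upper to lower bound is exp(2·1.96·σ·√t), so any error in σ from the short sample is compounded exponentially at longer horizons.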
Cohort-Component Model
Age-by-age, year-by-year projection
Core identity
For x ≥ 1:
P_{x,t+1} = P_{x−1,t} − D_{x−1,t} + M_{x−1,t}

For newborns:
P_{0,t+1} = B_t

D_{x,t} = m_x·P_{x,t}
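One projection year under the core identity might look like the sketch below. The function name `step` and the toy inputs are hypothetical. The treatment of the last, open-ended age group (its survivors stay in the group) is an assumption, since the identity as written does not cover it.

```python
def step(pop, mx, mig, births):
    """Advance an age vector one year.

    pop[x] : population aged x at time t (last index = open-ended group)
    mx[x]  : age-specific death rate
    mig[x] : net migrants aged x during the year
    births : B_t, total births during the year
    """
    deaths = [m * p for m, p in zip(mx, pop)]             # D_{x,t} = m_x * P_{x,t}
    survivors = [p - d + g for p, d, g in zip(pop, deaths, mig)]
    nxt = [0.0] * len(pop)
    nxt[0] = births                                       # P_{0,t+1} = B_t
    for x in range(1, len(pop)):
        nxt[x] = survivors[x - 1]                         # cohort aged x-1 moves up to x
    nxt[-1] += survivors[-1]   # assumption: open-ended group keeps its own survivors
    return nxt

# Toy example with three age groups (hypothetical numbers)
nxt = step(pop=[100.0, 100.0, 100.0],
           mx=[0.01, 0.02, 0.10],
           mig=[1.0, 2.0, 0.0],
           births=50.0)
```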
Advantages over other models
1. Explicit cohort aging — each age group tracked
2. Age-specific Gompertz mortality: log(m_x) = α + β·x
3. Age-specific fertility: p_a = w_a / Σ w_a
4. Migration uncertainty via σ scenarios: M̄_x ± k·σ_x
5. Demographically valid and interpretable
6. Validated against observed deaths 2020–2024
Verdict: Selected
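Two of the inputs listed above, the Gompertz mortality curve and the fertility shares, reduce to short calculations. The sketch below assumes an ordinary least-squares fit on log death rates, which the slide does not confirm. The function names and toy values are hypothetical. How the shares p_a feed into total births is not shown in the source, so it is not shown here either.

```python
import math

def fit_gompertz(ages, rates):
    """Least-squares fit of log(m_x) = alpha + beta * x; returns (alpha, beta)."""
    y = [math.log(r) for r in rates]
    n = len(ages)
    x_bar = sum(ages) / n
    y_bar = sum(y) / n
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(ages, y))
    sxx = sum((x - x_bar) ** 2 for x in ages)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

def fertility_shares(weights):
    """p_a = w_a / sum(w_a): share attributed to each maternal age a."""
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Toy inputs (hypothetical): rates generated from alpha = -9, beta = 0.09
ages = [40, 50, 60, 70, 80]
rates = [math.exp(-9 + 0.09 * x) for x in ages]
alpha, beta = fit_gompertz(ages, rates)
shares = fertility_shares({20: 2.0, 25: 5.0, 30: 3.0})
```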
Final model — 2025–2035 forecast
Final selection: cohort-component model
Projects 86 age cohorts (0–85+) annually from 2025 to 2035
Uncertainty captured via 5 migration scenarios: M̄_x − 2σ_x, M̄_x − σ_x, M̄_x, M̄_x + σ_x, M̄_x + 2σ_x
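The five scenarios amount to shifting the mean age-specific migration vector by k standard deviations, for k from −2 to 2. A minimal sketch, with a hypothetical function name and toy numbers:

```python
def migration_scenarios(mean_mig, sd_mig, ks=(-2, -1, 0, 1, 2)):
    """Return {k: [M_bar_x + k * sigma_x for each age x]}."""
    return {k: [m + k * s for m, s in zip(mean_mig, sd_mig)] for k in ks}

# Toy vectors for two ages (hypothetical, not Atlanta estimates)
scenarios = migration_scenarios(mean_mig=[100.0, 50.0], sd_mig=[10.0, 5.0])
```

Each scenario vector would then be run through the full 2025–2035 projection, giving five population paths in place of a single statistical interval.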
Why not extrapolation alone?
Extrapolation gives P_{t+1} = P_t (1 + g): stable, but driven purely by one scalar g.
It cannot capture cohort shifts, aging dynamics, or age-varying fertility/mortality.
The cohort model is preferred even though extrapolation looked smooth, because demographic validity matters for planning.