Section 1 — Data loading and filtering
Load Excel workbook
Final_Data_for_Modeling.xlsx · 3 sheets
pop_estimate_components
Annual: YEAR, POP_ESTIMATE, BIRTHS, DEATHS,
INTERNATIONAL_MIG, DOMESTIC_MIG, NET_MIG, RESIDUAL
Coverage: 2000–2025
pop_by_agesex
Age × sex panel: YEAR (code 1–6), AGE (0–85),
TOT_POP, TOT_MALE, TOT_FEMALE
Coverage: 2019–2024 (codes → calendar years)
mortality_rate_2020_census
National 2020 Census: age, total resident population,
births, deaths by single year of age
Used for Gompertz mortality estimation
Filter and align
CBSA = 12060 (Atlanta MSA only) · YEAR codes mapped: code + 2018 → calendar year
Panel indexed by (YEAR, AGE): 86 age groups (0–84 and 85+) × 5 years (2020–2024) = 430 rows
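A minimal sketch of the filter-and-align step, assuming the sheet has already been read into a list of row dictionaries and that it carries a CBSA column. The helper name `filter_and_align` and the sample rows are hypothetical, for illustration only.

```python
ATLANTA_CBSA = 12060
YEAR_CODE_OFFSET = 2018  # code + 2018 -> calendar year, so code 1 -> 2019


def filter_and_align(rows, cbsa=ATLANTA_CBSA):
    """Keep one CBSA and index its rows by (calendar year, age)."""
    panel = {}
    for row in rows:
        if row["CBSA"] != cbsa:
            continue  # drop every other metro area
        year = row["YEAR"] + YEAR_CODE_OFFSET
        panel[(year, row["AGE"])] = {
            "TOT_POP": row["TOT_POP"],
            "TOT_MALE": row["TOT_MALE"],
            "TOT_FEMALE": row["TOT_FEMALE"],
        }
    return panel


# Hypothetical rows, not taken from the workbook.
sample_rows = [
    {"CBSA": 12060, "YEAR": 2, "AGE": 0, "TOT_POP": 100, "TOT_MALE": 51, "TOT_FEMALE": 49},
    {"CBSA": 12060, "YEAR": 6, "AGE": 85, "TOT_POP": 40, "TOT_MALE": 15, "TOT_FEMALE": 25},
    {"CBSA": 99999, "YEAR": 2, "AGE": 0, "TOT_POP": 10, "TOT_MALE": 5, "TOT_FEMALE": 5},
]
panel = filter_and_align(sample_rows)
```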
Section 2 — Three parallel estimation pipelines (run independently, merged before projection)
A — Fertility
Define age-group fertility weights wa
Ages 15–19: w = 0.025 (low, teen fertility)
Ages 20–24: w = 0.200
Ages 25–29: w = 0.300 (peak fertility)
Ages 30–34: w = 0.250
Ages 35–39: w = 0.150
Ages 40–44: w = 0.050
Ages 45–49: w = 0.025 (decline)
Normalise to probabilities pa
pa = wa / Σwa
Property: Σpa = 1
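Expand 5-yr group probabilities to single-year ages within [15, 49]
(each age in a group inherits the group's pa, so the 35 single-year values sum to 5 rather than 1; the constant factor is absorbed into the scaling factor k)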
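The weighting, normalisation and expansion steps above can be sketched as follows. The weights are the ones listed; the function and variable names are illustrative assumptions.

```python
# Age-group fertility weights as listed above: (low age, high age) -> weight.
GROUP_WEIGHTS = {
    (15, 19): 0.025,
    (20, 24): 0.200,
    (25, 29): 0.300,
    (30, 34): 0.250,
    (35, 39): 0.150,
    (40, 44): 0.050,
    (45, 49): 0.025,
}


def normalise(weights):
    """Divide each weight by the total so the group values sum to 1."""
    total = sum(weights.values())
    return {group: w / total for group, w in weights.items()}


def expand_to_single_years(group_probs):
    """Give every single-year age the probability of its 5-year group."""
    by_age = {}
    for (low, high), p in group_probs.items():
        for age in range(low, high + 1):
            by_age[age] = p
    return by_age


group_probs = normalise(GROUP_WEIGHTS)
fertility_prop = expand_to_single_years(group_probs)
```

Because the listed weights already sum to 1, normalisation leaves them unchanged here; it matters only if the weights are edited later.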
Merge with female population Fa,t
Use Fa,2020 from pop_by_agesex (YEAR=2020)
Restrict to ages a ∈ [15, 49]
Weighted fertile population:
W = Σa (Fa · pa)
Higher-fertility ages (25–34) contribute more to W
Calibrate scaling factor k
k = Bobs / W
Bobs = observed births from pop_estimate_components
Metro estimate: k ≈ 0.325
National proxy check: k ≈ 0.334
Close agreement (difference ≈ 3%) supports the assumption metro ≈ national fertility
Birth formula (used each projection year)
Bt = k · Σ_{a∈[15,49]} (F_{a,t−1} · pa)
Bt assigned to P0,t (age 0 entry)
Fa,t−1 from previous year — births lag female pop by 1 yr
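A minimal sketch of the calibration and the birth formula, under stated assumptions: the input numbers are hypothetical placeholders, not workbook values, and the function names are illustrative.

```python
def weighted_fertile_population(females_by_age, fertility_prop):
    """W = sum over ages 15-49 of F_a * p_a."""
    return sum(females_by_age[a] * p for a, p in fertility_prop.items())


def calibrate_k(observed_births, females_by_age, fertility_prop):
    """k = B_obs / W."""
    return observed_births / weighted_fertile_population(females_by_age, fertility_prop)


def project_births(k, females_prev_year, fertility_prop):
    """B_t = k * sum of F_{a,t-1} * p_a, using last year's female population."""
    return k * weighted_fertile_population(females_prev_year, fertility_prop)


# Hypothetical inputs for illustration only.
fertility_prop = {a: (0.3 if 25 <= a <= 29 else 0.1) for a in range(15, 50)}
females_2020 = {a: 1000.0 for a in range(15, 50)}
observed_births = 1300.0

k = calibrate_k(observed_births, females_2020, fertility_prop)
births_next = project_births(k, females_2020, fertility_prop)
```

With an unchanged female population, the projected births reproduce the observed births used for calibration, which is a quick sanity check on k.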
B — Mortality (Gompertz)
Compute observed death rates da
da = Deathsa / Popa
Source: 2020 Census national mortality table
Covers ages 0–75 from observed data
Metro mortality ≈ national mortality (reasonable for large MSA)
Log-transform (Gompertz linearisation)
Empirical mortality rises exponentially with age:
mx ≈ A·e^(βx) (Gompertz law)
Taking log linearises:
log(da) = α + β·a
Allows OLS regression on log scale
OLS fit on ages 50–75
Fit: log(mx) = α̂ + β̂·x
Fitted parameters: β̂ = 0.0621, α̂ = −5.584
Interpretation: mortality doubles every
ln(2)/β̂ = 0.693/0.0621 ≈ 11.2 years of age
Smoothed rate: m̂x = e^(α̂ + β̂x)
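The log-linear fit can be sketched with a hand-written OLS, assuming natural logarithms (consistent with the ln(2)/β̂ doubling time). The rates below are synthetic, generated from the quoted parameters, so the example only shows that the fit recovers them; they are not the census data.

```python
import math


def fit_gompertz(ages, death_rates):
    """OLS of log(d_a) on age; returns (alpha_hat, beta_hat)."""
    logs = [math.log(d) for d in death_rates]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def smoothed_rate(alpha, beta, age):
    """Fitted Gompertz rate exp(alpha + beta * age)."""
    return math.exp(alpha + beta * age)


# Synthetic rates built from the parameters quoted above (illustration only).
fit_ages = list(range(50, 76))
synthetic_rates = [math.exp(-5.584 + 0.0621 * x) for x in fit_ages]

alpha_hat, beta_hat = fit_gompertz(fit_ages, synthetic_rates)
doubling_years = math.log(2) / beta_hat
```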
Extrapolate ages 76–84 and 85+
Ages 76–84: predict via fitted Gompertz model
m̂x = e^(α̂ + β̂x) for x = 76, 77, …, 84
Age 85+ (open-ended group):
Use representative age = 88 in the model
m̂85+ = e^(α̂ + β̂·88)
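A short sketch of the extrapolation, using the fitted parameters quoted above and the representative age of 88 for the open-ended group. Storing the 85+ rate under key 85 is an assumption made for this example.

```python
import math

ALPHA_HAT = -5.584
BETA_HAT = 0.0621
OPEN_GROUP_AGE = 88  # representative age used for the 85+ group


def gompertz_rate(age, alpha=ALPHA_HAT, beta=BETA_HAT):
    """Fitted Gompertz rate exp(alpha + beta * age)."""
    return math.exp(alpha + beta * age)


# Ages 76-84 come straight from the fitted curve.
extrapolated = {age: gompertz_rate(age) for age in range(76, 85)}
# Key 85 stands for the open-ended 85+ group (storage convention assumed here).
extrapolated[85] = gompertz_rate(OPEN_GROUP_AGE)
```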
Incremental cohort mortality m(x)
m(0) = m̂0
m(x) = m̂x − m̂_{x−1} for x ≥ 1
Captures marginal mortality risk increase from one age to the next
Used in cohort transition: Dx,t = m(x) · Px,t
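The incremental definition above can be sketched as a first difference of the smoothed rates. The smoothed values and populations below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def incremental_mortality(smoothed):
    """m(0) = m_hat_0; m(x) = m_hat_x - m_hat_{x-1} for x >= 1."""
    increments = [smoothed[0]]
    for x in range(1, len(smoothed)):
        increments.append(smoothed[x] - smoothed[x - 1])
    return increments


def deaths_by_age(increments, population):
    """D_x = m(x) * P_x for each age x."""
    return [m * p for m, p in zip(increments, population)]


# Hypothetical smoothed rates for ages 0..3 and matching populations.
smoothed = [0.005, 0.006, 0.008, 0.011]
population = [1000.0, 1000.0, 2000.0, 1000.0]

m_inc = incremental_mortality(smoothed)
deaths = deaths_by_age(m_inc, population)
```

The increments telescope, so their sum equals the last smoothed rate; this gives a cheap consistency check on the differencing.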
Validation passed
Σx m(x)·Px,t ≈ observed total deaths
Years 2020–2024 all within acceptable range
Confirms Gompertz extrapolation is valid for this MSA
C — Migration
Compute annual migration differential
DIFFt = NET_MIGt + RESIDUALt
NET_MIG = domestic + international migration
RESIDUAL = unexplained pop change component
Historical range: ~22K (2011) to ~120K (2006)
Covers 2000–2024 (25 years)
Age-specific population weights wx,t
Total population per year:
Nt = Σx Px,t
Age weight:
wx,t = Px,t / Nt
Age-weighted migration:
Mx,t = wx,t · DIFFt
Distributes total migration proportionally by age share
Per-age mean and std (2000–2024)
M̄x = E[Mx,t] across all historical years
σx = std[Mx,t] across all historical years
M̄x used as baseline in projection
σx used to construct uncertainty scenarios
Future migration assumed to revert to historical mean
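A minimal sketch of the migration pipeline: DIFF per year, proportional allocation by age share, then the per-age mean and standard deviation. The three-year, two-age history is hypothetical, and using the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form is used.

```python
import statistics


def age_weighted_migration(pop_by_age, net_mig, residual):
    """M_x = w_x * DIFF, with w_x = P_x / N and DIFF = NET_MIG + RESIDUAL."""
    diff = net_mig + residual
    total = sum(pop_by_age.values())
    return {age: pop / total * diff for age, pop in pop_by_age.items()}


def per_age_mean_std(migration_by_year):
    """Mean and std of M_x across years, one value per age."""
    ages = list(next(iter(migration_by_year.values())))
    mean, std = {}, {}
    for age in ages:
        series = [by_age[age] for by_age in migration_by_year.values()]
        mean[age] = statistics.mean(series)
        std[age] = statistics.stdev(series)  # sample std: an assumption
    return mean, std


# Hypothetical history (illustration only).
history = {
    2001: {"pop": {0: 600.0, 1: 400.0}, "NET_MIG": 90.0, "RESIDUAL": 10.0},
    2002: {"pop": {0: 500.0, 1: 500.0}, "NET_MIG": 40.0, "RESIDUAL": 10.0},
    2003: {"pop": {0: 400.0, 1: 600.0}, "NET_MIG": 190.0, "RESIDUAL": 10.0},
}
migration_by_year = {
    year: age_weighted_migration(rec["pop"], rec["NET_MIG"], rec["RESIDUAL"])
    for year, rec in history.items()
}
mig_mean, mig_std = per_age_mean_std(migration_by_year)
```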
Female ratio for sex split
From 2021–2024 data, ages 15–49:
Female ratio = Σ TOT_FEMALE / Σ TOT_POP
≈ 0.511 (stable across years)
Applied each projection year:
Ft = 0.511 · Pt
Mt = Pt − Ft (male population; distinct from the migration term Mx,t)
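The ratio and the split can be sketched as below. The (female, total) pairs are hypothetical and chosen to reproduce 0.511; the function names are illustrative.

```python
def female_ratio(rows):
    """Ratio = sum of TOT_FEMALE / sum of TOT_POP over the selected rows."""
    rows = list(rows)
    total_female = sum(female for female, _ in rows)
    total_pop = sum(pop for _, pop in rows)
    return total_female / total_pop


def split_by_sex(population, ratio):
    """Return (females, males) for a given total population."""
    females = ratio * population
    return females, population - females


# Hypothetical (TOT_FEMALE, TOT_POP) pairs for ages 15-49.
ratio = female_ratio([(511.0, 1000.0), (1022.0, 2000.0)])
females, males = split_by_sex(1000.0, ratio)
```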
Age-85+ geometric growth rate g85
Open-ended group cannot be modelled by cohort shift
Geometric trend from observed data:
g85 = (P85,2024 / P85,2020)^(1/4) − 1
Applied each year:
P85,t = P_{85,t−1} · (1 + g85)
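A small sketch of the geometric rate and its application. The two endpoint populations are hypothetical, picked so that the rate comes out at a round 10% per year.

```python
def geometric_growth_rate(p_start, p_end, n_years):
    """g = (p_end / p_start) ** (1 / n_years) - 1."""
    return (p_end / p_start) ** (1.0 / n_years) - 1.0


def project_open_group(p_last, g, n_years):
    """Apply P_t = P_{t-1} * (1 + g) for n_years; returns the yearly path."""
    path = []
    for _ in range(n_years):
        p_last = p_last * (1.0 + g)
        path.append(p_last)
    return path


# Hypothetical 85+ populations for 2020 and 2024 (1.1 ** 4 = 1.4641).
g85 = geometric_growth_rate(10000.0, 14641.0, 4)
path = project_open_group(14641.0, g85, 2)
```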
Merge all three pipelines into projection panel
Panel: (YEAR, AGE) → TOT_POP, TOT_FEMALE, mortality_rate, fertility_prop, avg_ageweighted_diff, std_ageweighted_diff
Historical years 2020–2024 populated · Future years 2025–2035 initialised as NaN, filled by loop
Section 3 — Annual projection loop
for t = 2025, 2026, … 2035 (iterate over all ages x = 0, 1, … 85)
Uses output of year t−1 as input · 5 steps per year
Step 1 — Births (age-0 cohort entry)
Bt = k · Σ_{a∈[15,49]} (F_{a,t−1} · pa)
Set P0,t = Bt; the female ratio then determines F0,t and M0,t
Step 2 — Cohort aging (ages x = 1 to 84)
Px,t = P_{x−1,t−1} − m(x−1)·P_{x−1,t−1} + M̄_{x−1}
Each cohort shifts forward one age · Deaths subtracted using incremental mortality m(x−1) · Mean migration added
Step 3 — Age 85+ (open-ended group)
P85,t = P_{85,t−1} · (1 + g85)
Geometric extrapolation — open interval cannot receive a cohort shift from age 84
Step 4 — Sex split
Fx,t = 0.511 · Px,t
Mx,t = Px,t − Fx,t
Applied uniformly across all ages · Ratio validated as stable (2021–2024)
Step 5 — Record deaths for this year
Dx,t = m(x) · Px,t
Stored for output and validation · Not fed back into next iteration (deaths already subtracted in Step 2)
↺ t = t + 1 → return to loop start if t ≤ 2035
Output of year t becomes input Px,t−1 for year t+1
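The five steps above can be sketched as one loop. This is an illustrative reading of the steps, not the original implementation: the function signature, the dictionary layout, and all input values in the demo are assumptions.

```python
def project(pop_prev, females_prev, *, k, fertility_prop, m_inc, mig_mean,
            g85, female_ratio, first_year, last_year, max_age=85):
    """Run the annual loop; returns {year: {"pop", "females", "deaths"}}."""
    results = {}
    for year in range(first_year, last_year + 1):
        pop = {}
        # Step 1: births from last year's females aged 15-49 enter at age 0.
        pop[0] = k * sum(females_prev[a] * p for a, p in fertility_prop.items())
        # Step 2: cohort aging for ages 1..84 (deaths out, mean migration in).
        for x in range(1, max_age):
            survivors = pop_prev[x - 1] * (1.0 - m_inc[x - 1])
            pop[x] = survivors + mig_mean[x - 1]
        # Step 3: the open-ended 85+ group grows geometrically.
        pop[max_age] = pop_prev[max_age] * (1.0 + g85)
        # Step 4: sex split with a constant female ratio.
        females = {x: female_ratio * p for x, p in pop.items()}
        # Step 5: record deaths for output only; they are not fed back.
        deaths = {x: m_inc[x] * p for x, p in pop.items()}
        results[year] = {"pop": pop, "females": females, "deaths": deaths}
        # Output of this year becomes the input of the next.
        pop_prev, females_prev = pop, females
    return results


# Hypothetical flat inputs for ages 0..85 (illustration only).
ages = range(0, 86)
results = project(
    {x: 1000.0 for x in ages},
    {x: 511.0 for x in ages},
    k=0.1,
    fertility_prop={a: 0.2 for a in range(15, 50)},
    m_inc={x: 0.01 for x in ages},
    mig_mean={x: 10.0 for x in ages},
    g85=0.02,
    female_ratio=0.511,
    first_year=2025,
    last_year=2035,
)
total_2035 = sum(results[2035]["pop"].values())
```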
Section 4 — Uncertainty scenarios (migration is the dominant source of variance)
Why migration drives uncertainty
Fertility and mortality patterns are relatively stable year-to-year
Migration fluctuated from ~22K (2011 recession trough) to ~120K (2006 boom) — a 5× swing
σx from the historical distribution is used to construct plausible future bounds
Very low
M̄x − 2σx
Severe sustained out-migration scenario
2035 population: ~6.69M
Low
M̄x − σx
Below-average migration
2035 population: ~6.88M
Base
M̄x
Historical average migration
2035 population: ~7.07M
High
M̄x + σx
Above-average migration
2035 population: ~7.26M
Very high
M̄x + 2σx
Sustained high in-migration
2035 population: ~7.45M
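The five scenarios differ only in the migration term, so they can be generated from a table of σ multipliers. A minimal sketch, with hypothetical per-age means and standard deviations and illustrative scenario keys:

```python
# Multiplier z applied to sigma_x in each scenario: M_bar_x + z * sigma_x.
SCENARIO_Z = {
    "very_low": -2,
    "low": -1,
    "base": 0,
    "high": 1,
    "very_high": 2,
}


def scenario_migration(mig_mean, mig_std):
    """Return {scenario: {age: M_bar_x + z * sigma_x}}."""
    return {
        name: {age: mig_mean[age] + z * mig_std[age] for age in mig_mean}
        for name, z in SCENARIO_Z.items()
    }


# Hypothetical per-age inputs (illustration only).
mig_mean = {0: 100.0, 1: 50.0}
mig_std = {0: 20.0, 1: 5.0}
scenarios = scenario_migration(mig_mean, mig_std)
```

Each scenario's migration vector would then replace the mean migration term in the cohort-aging step of a separate projection run.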
Section 5 — Outputs
Population totals 2025–2035
Annual Σx Px,t for base and all scenarios
Base: 6.49M (2025) → 7.07M (2035)
Decade increase ≈ 580,000 (base scenario)
Saved to population_forecast.xlsx
Forecast fan chart
Historical Census (2000–2024) + base forecast
±1σ shaded band (dark)
±2σ shaded band (light)
Bands widen over time — compounding migration variance
Log mortality chart
Observed log(da) scatter vs age
Fitted Gompertz line: y = 0.0621x − 5.584
Extrapolated segment beyond age 75 shown
Validates β̂ = 0.0621 (doubles every ≈11 yrs)
Key assumptions summary
· Fertility: metro ≈ national pattern (validated: kmetro ≈ knational)
· Mortality: metro ≈ national; Gompertz extrapolation for ages 76+ (validated vs. observed deaths)
· Migration: reverts to 2000–2024 historical mean; uncertainty from σ across that period
· Female ratio: stable at 0.511 through projection horizon
· Age 85+: geometric trend g85 from 2020–2024; no cohort-specific mortality modelled
10-year forecast complete — 2025–2035