L0 regularisation for subnational microsimulation calibration

Maria Juaristi (PolicyEngine): “L0 regularisation for subnational microsimulation calibration” (joint work with Nikhil Woodruff, Ben Ogorek, Max Ghenis)
July 1, 2026, 5:30 pm Room B (1200) 2B Cross-border Microsimulation
Conference presentation

Tax-benefit microsimulation models typically operate at the national level, using household survey weights calibrated to aggregate population targets. Subnational analysis (at the level of states, congressional districts, or local authorities) requires datasets that satisfy geographic distributional constraints while preserving household-level detail. We present a method based on L0 regularisation that jointly optimises survey weight magnitudes and sparsity to produce calibrated subnational microsimulation datasets.

Our approach builds on the Hard Concrete distribution (Louizos, Welling, and Kingma, 2017), which induces exact sparsity by multiplying each household's weight by a learned stochastic gate that collapses to a deterministic zero or one at inference time. We parameterise each gate with a log-alpha parameter and a temperature parameter, and jointly optimise these alongside log-transformed weight magnitudes using a single loss function that combines a scale-invariant relative calibration error, an L0 sparsity penalty on the expected count of active households, and a light L2 regulariser on weight magnitudes.
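The gate and loss described above can be sketched in plain Python. This is an illustrative sketch only, not the l0-python API: the function names, the stretch limits and temperature (the defaults suggested by Louizos et al.), the squared form of the relative error, and the layout of `design` (one row per target, one column per household) are all assumptions made for the example.

```python
import math
import random

# Assumed defaults from Louizos et al.: stretch interval (GAMMA, ZETA), temperature BETA.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_gate(log_alpha, rng, beta=BETA):
    """Draw one stochastic Hard Concrete gate in [0, 1] (training time)."""
    u = min(max(rng.random(), 1e-8), 1.0 - 1e-8)  # keep the logs finite
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / beta)
    return min(1.0, max(0.0, s * (ZETA - GAMMA) + GAMMA))  # stretch, then clip

def deterministic_gate(log_alpha):
    """Inference-time gate: no noise, stretched and clipped to [0, 1]."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))

def prob_active(log_alpha, beta=BETA):
    """P(gate != 0); summed over households this is the expected L0 count."""
    return sigmoid(log_alpha - beta * math.log(-GAMMA / ZETA))

def loss(log_weights, log_alphas, design, targets, lambda_l0, lambda_l2, rng):
    """Relative calibration error + L0 penalty + L2 penalty (assumed forms)."""
    weights = [math.exp(lw) * sample_gate(la, rng)
               for lw, la in zip(log_weights, log_alphas)]
    calib = 0.0
    for row, target in zip(design, targets):
        estimate = sum(w * x for w, x in zip(weights, row))
        calib += ((estimate - target) / target) ** 2  # scale-invariant term
    calib /= len(targets)
    l0 = sum(prob_active(la) for la in log_alphas)
    l2 = sum(math.exp(lw) ** 2 for lw in log_weights)
    return calib + lambda_l0 * l0 + lambda_l2 * l2
```

In this sketch, a household whose log-alpha is driven strongly negative receives a deterministic gate of exactly zero and so drops out of the dataset, while a strongly positive log-alpha leaves its weight unchanged.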

The pipeline begins with the US Current Population Survey. Each household record is cloned multiple times and assigned to random census blocks drawn from a population-weighted distribution. Program participation indicators (SNAP, Medicaid, SSI, TANF, etc.) are re-randomised per geographic assignment using local take-up rates. Each clone is then run through PolicyEngine's tax-benefit microsimulation engine to generate geography-specific outputs. The L0 optimiser selects which household-geography combinations to retain, calibrating simultaneously against approximately 37,800 targets across three geographic levels. At the congressional district level, targets include IRS Statistics of Income data (AGI, income tax, EITC by number of children, capital gains, self-employment income, pension income, and other income and deduction categories), Census ACS age distributions (18 age bands), and SNAP household counts. At the state level, additional targets include USDA administrative SNAP spending, CMS Medicaid enrollment, and Census state income tax collections. National targets include CBO budget projections, JCT tax expenditure estimates, SSA benefit totals by program, and other administrative totals.
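The cloning and geographic assignment step might look roughly like the following sketch. The function name, the dictionary-based record layout, the `takes_up_snap` field, and the use of a single take-up rate per block are hypothetical simplifications for illustration; the actual pipeline covers several programs and is not reproduced here.

```python
import random

def clone_and_assign(households, block_populations, takeup_rates, n_clones, seed=0):
    """Clone each household and place each clone in a population-weighted random block.

    households: list of dicts (hypothetical record layout)
    block_populations: {block_id: population}, used as sampling weights
    takeup_rates: {block_id: local take-up probability} (assumed, one program only)
    """
    rng = random.Random(seed)
    blocks = list(block_populations)
    pops = [block_populations[b] for b in blocks]
    clones = []
    for hh in households:
        # Draw one block per clone, with probability proportional to population.
        for block in rng.choices(blocks, weights=pops, k=n_clones):
            clone = dict(hh, block=block)
            # Re-draw participation using the take-up rate of the assigned area.
            clone["takes_up_snap"] = rng.random() < takeup_rates[block]
            clones.append(clone)
    return clones
```

Each clone would then be passed through the microsimulation engine, so that the optimiser chooses among household-geography pairs rather than among the original survey records.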

The sparsity penalty is configurable: a higher penalty produces a compact national dataset (approximately 50,000 records), while a lower penalty yields a larger dataset (approximately 3–4 million records) covering all 436 congressional districts and 50 states individually. Target groups are normalised so that each domain (national, state, congressional district) contributes equally to the loss, preventing large-count groups from dominating.
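One way to realise the group normalisation described above is to average the relative errors within each group before averaging across groups. The sketch below assumes a squared relative error and a simple mean of group means; the function name and the exact weighting are illustrative assumptions, not the implemented scheme.

```python
from collections import defaultdict

def group_normalised_loss(estimates, targets, groups):
    """Mean over groups of the mean squared relative error within each group."""
    per_group = defaultdict(list)
    for est, tgt, grp in zip(estimates, targets, groups):
        per_group[grp].append(((est - tgt) / tgt) ** 2)
    # Averaging within a group first stops groups with many targets
    # (e.g. congressional districts) from dominating the total.
    group_means = [sum(errs) / len(errs) for errs in per_group.values()]
    return sum(group_means) / len(group_means)
```

With one national target that is off by 10% and four district targets that are met exactly, this form gives the national error half of the total weight, whereas a flat mean over all five targets would give it one fifth.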

The method is implemented as the open-source l0-python PyTorch package. We discuss the approach in the context of PolicyEngine's US microsimulation model (policyengine.org/us/model), which uses these calibrated datasets for subnational policy analysis, and compare it to alternative calibration approaches including iterative proportional fitting and generalised regression (GREG) estimators.
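For contrast, iterative proportional fitting (raking) can be sketched in a few lines. This is a textbook version written for illustration, with a hypothetical function name and data layout; it is not the comparison code used in this work.

```python
def rake(weights, memberships, targets, n_iter=50):
    """Iterative proportional fitting: rescale weights to match each margin in turn.

    memberships[m][i] is the category of record i on margin m;
    targets[m][c] is the target total for category c on margin m.
    """
    w = list(weights)
    for _ in range(n_iter):
        for cats, tgt in zip(memberships, targets):
            # Current weighted total in each category of this margin.
            totals = {c: 0.0 for c in tgt}
            for wi, c in zip(w, cats):
                totals[c] += wi
            # Scale every record by target / current total for its category.
            w = [wi * tgt[c] / totals[c] for wi, c in zip(w, cats)]
    return w
```

Because raking only rescales weights multiplicatively, a record that starts with a positive weight keeps a positive weight, so it cannot by itself reduce the number of records in the way an L0 penalty does.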