Synthetic populations are essential for modeling complex systems that require individual-level data. However, they are typically limited to a single country. Creating a synthetic population across multiple countries is challenging because the data available from national statistical institutes are inconsistent: the variables available differ, and for shared variables, the categories may not align. Fortunately, Eurostat provides access to a large amount of aggregated socio-economic data that is consistent across EU countries. Although these data are less detailed than what can be obtained from national statistical institutes, they provide a solid basis for generating synthetic populations.
This work is part of MMUST+, an Interreg project developing a multimodal mobility model for the Luxembourg cross-border area using the synthetic population as input. We propose a multi-stage framework that combines iterative proportional fitting (IPF) (Deming and Stephan 1940) and stochastic synthetic reconstruction (Lenormand and Deffuant 2013) to generate synthetic populations that are statistically and structurally realistic.
The first stage consists of generating several entities: individuals, family nuclei, households, and dwellings. For each entity, we generate attributes covering socio-demographics, employment, education, household structure, dwelling information and spatial location. To generate these entities, we chose the IPF method for its efficiency and low algorithmic complexity. We ran a separate IPF instance for each entity using as many Eurostat marginals as possible. Since no survey includes all variables, we used a uniform seed with some structural zeros; the multi-variable marginals nevertheless preserve most of the dependency information. In the future, if survey data or microdata from statistical institutes become available, these could be used as the seed for IPF to improve accuracy. Finally, we applied truncate-replicate-sample (TRS) integerisation (Lovelace and Ballas 2013) to obtain integer counts.
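As an illustration of this first stage, the sketch below runs a two-dimensional IPF from a uniform seed with one structural zero and then integerises the result in the spirit of TRS. It is a minimal sketch under stated assumptions, not the project's implementation: the 2x3 table, the margin values, and the function names `ipf` and `trs` are hypothetical, and the actual framework fits multi-variable Eurostat marginals rather than two one-dimensional margins.

```python
import numpy as np


def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Two-dimensional IPF: alternately rescale rows and columns to the targets."""
    weights = seed.astype(float).copy()
    for _ in range(max_iter):
        row_sums = weights.sum(axis=1)
        row_scale = np.divide(row_targets, row_sums,
                              out=np.zeros_like(row_sums), where=row_sums > 0)
        weights *= row_scale[:, None]
        col_sums = weights.sum(axis=0)
        col_scale = np.divide(col_targets, col_sums,
                              out=np.zeros_like(col_sums), where=col_sums > 0)
        weights *= col_scale[None, :]
        # Columns match exactly after the column step; stop once rows match too.
        if np.abs(weights.sum(axis=1) - row_targets).max() < tol:
            break
    return weights


def trs(weights, rng):
    """Truncate to integer parts, then sample the missing units without
    replacement, with probability proportional to the fractional remainders."""
    flat = weights.ravel()
    counts = np.floor(flat).astype(int)
    remainders = flat - counts
    deficit = int(round(flat.sum())) - int(counts.sum())
    if deficit > 0:
        picked = rng.choice(flat.size, size=deficit, replace=False,
                            p=remainders / remainders.sum())
        counts[picked] += 1
    return counts.reshape(weights.shape)


# Hypothetical example: uniform seed, cell (0, 2) is a structural zero.
seed = np.ones((2, 3))
seed[0, 2] = 0.0
row_targets = np.array([45.0, 55.0])        # illustrative margin, total 100
col_targets = np.array([30.0, 50.0, 20.0])  # illustrative margin, total 100

fitted = ipf(seed, row_targets, col_targets)
integerised = trs(fitted, np.random.default_rng(42))
```

Because IPF only rescales cells, a cell that is zero in the seed stays zero in the fitted table and can never be selected in the sampling step, which is how structural zeros carry through to the integer counts.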
In the second stage, dwelling attributes are assigned to households by probabilistically drawing dwellings for each household, with probabilities derived from IPF weights. If a dwelling and household are incompatible (different locations or mismatched number of occupants), the probability is set to zero.
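The drawing step can be sketched as follows. This is a hedged illustration rather than the authors' code: the record fields (`zone`, `size`, `occupants`, `weight`), the function name `draw_dwelling`, and the choice to draw dwelling types with replacement are all assumptions made for the example.

```python
import numpy as np


def draw_dwelling(household, dwellings, rng):
    """Return the index of a dwelling drawn with probability proportional to its
    weight, after setting the weight of incompatible dwellings to zero.
    Returns None when no dwelling is compatible."""
    weights = np.array([
        d["weight"]
        if d["zone"] == household["zone"] and d["occupants"] == household["size"]
        else 0.0
        for d in dwellings
    ])
    total = weights.sum()
    if total == 0:
        return None
    return int(rng.choice(len(dwellings), p=weights / total))


# Hypothetical dwelling types with weights standing in for IPF output.
dwellings = [
    {"zone": "A", "occupants": 2, "weight": 3.0},
    {"zone": "A", "occupants": 2, "weight": 1.0},
    {"zone": "B", "occupants": 2, "weight": 5.0},  # wrong location for zone A
    {"zone": "A", "occupants": 4, "weight": 2.0},  # wrong number of occupants
]
household = {"zone": "A", "size": 2}

rng = np.random.default_rng(7)
chosen = draw_dwelling(household, dwellings, rng)
```

In this toy case only the first two dwelling types can be drawn, with probabilities 0.75 and 0.25.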
The third stage addresses the assignment of individuals to family nuclei and the grouping of isolated individuals and family nuclei into households. We implemented the stochastic sample-free synthetic reconstruction algorithm described by Lenormand and Deffuant (2013). The probabilities required by this method were computed using age-gap distributions derived from data available from Eurostat and the Human Fertility Database, in order to guide realistic relationships between partners and between parents and children. Hard constraints (e.g. a maximum of two parents per nucleus) were imposed by setting the corresponding probabilities to zero.
Preliminary validation demonstrates that the population reproduces key aggregate statistics, household structures, and family compositions across the cross-border region. While the current approach relies on aggregated data, future integration of survey microdata from national institutes could further improve accuracy.
Overall, this method offers a practical and flexible approach for generating synthetic populations. By combining IPF-based synthesis, TRS integerisation and stochastic synthetic reconstruction, it produces populations that are consistent with aggregate statistics and with household- and individual-level structures.
This study investigates the drivers behind the divergent trends in child poverty between the UK and five European counterparts (Finland, Hungary, Ireland, Poland, and Spain) from 2011 to 2024. These countries were selected to represent a diverse spectrum of trajectories: Finland and Spain serve as references for persistently low and high rates, respectively, while Poland reports a significant reduction over the period analysed. The analysis further incorporates Hungary, which follows the UK’s experience of rising child poverty, and Ireland, which offers a divergent path despite sharing initial similarities with the UK. The analysis decomposes these trajectories into three core determinants: demographic differences, labour market dynamics, and tax-benefit systems. To capture differences due to demographic characteristics, we employ a coarsened exact matching and reweighting technique to generate synthetic populations. Differences in labour market dynamics are captured using four-stage statistical models (estimating the probability of employment, the probability of unemployment, hourly wages, and hours worked), which yield counterfactual projections of employment and earnings in the UK that reflect labour market conditions in the comparator countries. Finally, the contribution of the tax-benefit system is isolated through a difference-in-differences (DiD) approach, comparing child poverty rates computed on original versus disposable incomes. This is made possible by the static microsimulation models UKMOD and EUROMOD, which allow disposable incomes to be computed. By quantifying these components, we determine the extent to which demographic characteristics, labour market dynamics, or tax-benefit systems have driven the UK’s relative child poverty trends.
Population ageing is rapidly increasing the prevalence of Alzheimer’s disease and related dementias (ADRD) across Europe, creating major challenges for health and long-term care systems. Existing European projections typically rely on static prevalence assumptions or self-reported diagnoses and rarely model individual cognitive trajectories within a unified, forward-looking framework. This paper presents a major extension of the European Future Elderly Model (EU-FEM), introducing a dedicated microsimulation module for dementia and cognitive decline applicable across multiple European countries. Using harmonized longitudinal data from SHARE Waves 1–9 (2004–2022), we develop and integrate a dynamic ADRD module that simulates transitions in cognitive status for individuals aged 50 and over. Cognitive status is defined using the Langa–Weir (LW) classification, adapted to the European context by combining episodic memory tests with functional limitations in instrumental activities of daily living (IADLs). Country-specific cut-offs are calibrated against OECD dementia prevalence benchmarks and tested for robustness across alternative SHARE waves. The classification algorithm incorporates deterministic and stochastic imputation procedures and enforces the absorbing nature of dementia over time. The ADRD module is embedded within EU-FEM’s first-order Markov Monte Carlo framework, allowing cognitive decline to evolve jointly with chronic conditions, demographic characteristics, and socioeconomic factors. Transition equations condition on prior socioeconomic status (education, income, wealth, labour market status), health, and cognition, enabling heterogeneous life-course trajectories. The enhanced model expands EU-FEM coverage to twelve European countries, including new Central and Eastern European populations, and produces internally consistent projections of dementia prevalence and cognitive trajectories.
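To make the idea of a first-order Markov Monte Carlo simulation with an absorbing dementia state concrete, the sketch below simulates cognitive status over repeated steps. It is a simplified illustration under stated assumptions: the three state labels follow the usual Langa-Weir categories, but the transition probabilities are invented for the example, and a single fixed matrix replaces the covariate-dependent transition equations of the actual model. The names `TRANSITIONS` and `simulate` are hypothetical.

```python
import numpy as np

STATES = ("normal", "cind", "dementia")  # CIND: cognitively impaired, not demented

# Hypothetical per-step transition probabilities (rows: current state).
# The last row makes dementia absorbing: once entered, it is never left.
TRANSITIONS = np.array([
    [0.90, 0.08, 0.02],
    [0.15, 0.70, 0.15],
    [0.00, 0.00, 1.00],
])


def simulate(initial_states, n_steps, rng):
    """Simulate individual state paths; the next state depends only on the
    current one (first-order Markov). Returns an (n_steps + 1, n_people) array."""
    n_people = initial_states.size
    paths = np.empty((n_steps + 1, n_people), dtype=int)
    paths[0] = initial_states
    for t in range(n_steps):
        cumulative = TRANSITIONS[paths[t]].cumsum(axis=1)
        draws = rng.random(n_people)
        # Inverse-CDF draw: count how many cumulative bounds the draw reaches.
        nxt = (draws[:, None] >= cumulative).sum(axis=1)
        paths[t + 1] = np.minimum(nxt, len(STATES) - 1)  # guard against rounding
    return paths


rng = np.random.default_rng(2024)
paths = simulate(np.zeros(1000, dtype=int), n_steps=10, rng=rng)
dementia_share = (paths == 2).mean(axis=1)  # simulated prevalence per step
```

Because the dementia row places all probability on remaining in that state, the simulated prevalence in `dementia_share` can only stay flat or rise from one step to the next in this closed toy population.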
Validation exercises show close alignment with external epidemiological benchmarks and substantial improvements over self-reported dementia measures. This work demonstrates how dynamic microsimulation can be used to model cognitive decline in a harmonized, multi-country setting, providing a flexible platform for forecasting dementia prevalence and evaluating counterfactual scenarios involving risk-factor modification, prevention policies, and future care needs. The extended EU-FEM establishes a foundation for integrating cost-of-illness and long-term care modules, supporting policy analysis in ageing European societies.
Even though economists prefer carbon pricing over regulatory policies as an instrument for reducing greenhouse gas emissions, fears of regressive effects that disproportionately burden low-income households lead to low public support, hindering its political feasibility. In light of an ongoing policy debate and a lack of evidence for the German context, this paper investigates the distributional effects of carbon pricing in Germany, as well as possible redistributive policies. I analyze three channels through which carbon pricing affects the distributional outcome: the direct price effect, the indirect price effect and the behavioral response. I then compare a lump-sum transfer, a transfer targeted by income decile, and reductions in labor and income taxes as redistributive policies, providing evidence on how well each counteracts the adverse distributional consequences of carbon pricing.
Methodologically, this thesis extends the ifo Tax and Transfer Microsimulation Model, which is based on the GSOEP, with a carbon pricing module. This extension models the direct and indirect price effects by using household consumption data from the EVS combined with data on emission intensities and estimates the behavioral responses with Engel curves. Using the model, I simulate the introduction of a 200€ carbon price per ton of CO2-equivalent on all direct and embedded emissions in household expenditures for the year 2018.
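The static part of such a calculation, before any behavioral response, can be sketched as the carbon price multiplied by the emissions embodied in a household's expenditures. The example below is illustrative only: the 200 € price is the one simulated in the text, but the expenditure categories, emission intensities, household figures, and the function name `price_burden` are hypothetical and are not taken from the EVS or from the model.

```python
CARBON_PRICE = 200.0  # EUR per tonne of CO2-equivalent, as in the simulated scenario

# Hypothetical emission intensities in tonnes CO2e per EUR spent.
INTENSITY = {
    "heating": 0.0020,
    "motor_fuel": 0.0018,
    "food": 0.0004,
    "other": 0.0002,
}


def price_burden(expenditure):
    """Static annual price burden in EUR: price times embodied emissions,
    holding consumption fixed (no behavioral response)."""
    emissions = sum(amount * INTENSITY[good] for good, amount in expenditure.items())
    return CARBON_PRICE * emissions


# Two made-up households with annual expenditures and disposable income in EUR.
households = {
    "low_income": {
        "expenditure": {"heating": 1200, "motor_fuel": 600, "food": 3000, "other": 8000},
        "income": 15000,
    },
    "high_income": {
        "expenditure": {"heating": 2000, "motor_fuel": 2000, "food": 6000, "other": 40000},
        "income": 80000,
    },
}

relative_burden = {
    name: price_burden(h["expenditure"]) / h["income"]
    for name, h in households.items()
}
```

Comparing `relative_burden` across income groups is one simple way to see whether a price is regressive: in this toy data the burden is larger in absolute terms for the high-income household but larger as a share of income for the low-income one.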
The analysis of the simulation results yields three main findings. First, the carbon price achieves substantial emission reductions of 221 Mt CO2eq and generates 97.09 bn € in revenue but also decreases household consumption expenditures by an average of 5,171€ per year, corresponding to an average price burden of 2,090€ per household. Second, the policy exhibits regressive distributional effects, as the price burden amounts to 8.1% of disposable income for households in the lowest income decile and 4.4% in the highest decile. Decomposing the channels shows that regressivity is mainly driven by the disproportionately high energy burden of lower-income households and their lower ability to respond to price increases. Third, the evaluation of revenue rebating policies reveals that lump-sum transfers and transfers differentiated by income decile significantly reduce regressivity, whereas reductions in labor or income taxes have limited equality-enhancing effects. Embedding carbon pricing into a microsimulation model of the German tax-benefit system thus enables a detailed ex-ante distributional analysis. Further, the findings add empirical evidence to the scarce literature on the distributional effects of carbon pricing and different redistributive policies in the German context. The analysis also highlights the persistence of horizontal inequalities within income groups, suggesting the need for more targeted and differentiated redistribution mechanisms. Beyond its contributions to the literature, this thesis provides insights for policymaking. Introducing a high carbon price requires the design of effective and equitable rebating that considers heterogeneous effects across and within deciles.
Corporate tax microsimulation models are essential tools to estimate tax revenue, assess the impact of tax reforms, and carry out distributional analyses at the firm level.
The microsimulation model for firms currently being developed at the Bank of Italy aims at accurately simulating the effective tax burden on corporations to allow for the routine evaluation of fiscal policies geared towards the corporate sector. The model is the first to exploit information on tax liabilities contained in detailed financial statements to reconstruct the corporate income tax base at the firm level, which is then used for model validation. This represents an innovative approach to carrying out micro-based model validation despite the lack of access to tax return microdata. The model can quantify the gap between financial accounting and tax accounting results, thanks to the modelling of major fiscal adjustments, such as participation exemptions, limits on interest deductibility, and tax depreciation, and replicate a few key stylized facts, such as: i) the widespread divergence between accounting and tax profits, with one fifth (one fourth) of firms with negative (positive) accounting profits showing positive (negative) tax profits; ii) the inverse U-shape of effective tax rates in firm size; iii) the greater impact of participation exemptions and interest payment deductions on large firms’ effective tax rates; iv) the countercyclical usage of loss carryforwards; v) the weaker impact on effective tax rates of accelerated depreciation schemes for intangible vs. tangible assets. The work proposes an ex-ante evaluation of the most sizeable fiscal policy geared towards firms enacted with the Italian budget law in 2026, namely the introduction of a 5% participation threshold for eligibility for the participation exemption (PEX) regime. The PEX regime seeks to eschew double taxation of corporate profits by making participation income (in the form of either dividends or capital gains) exempt.
The EU “Parent-Subsidiary” Directive establishes that double taxation arises when the parent company holds, directly or indirectly, a participation stake worth at least 10%. Prior to the 2026 budget law, the Italian PEX regime exempted all participation income. The newly introduced policy therefore partially aligns the Italian regime with the EU benchmark. The ex-ante distributional impact of the reform is hard to gauge. On the one hand, fewer than 10% of firms receive participation income. On the other, a large empirical literature documents that shareholding portfolio diversification increases with firm size. By exploiting Orbis information on the size of direct and indirect shareholdings, this work calculates the impact of the reform on firms’ effective tax rates and assesses whether it is progressive overall, that is, whether it redistributes the fiscal burden towards the largest firms. Finally, it discusses how potential endogenous responses in terms of portfolio reallocation might influence the results.
Given the rapidly evolving structural socio-demographic determinants in Luxembourg (ageing, migration, cross-border households) and the social needs they induce, the funding of the Luxembourg social security system is high on the agenda. This concern could become even more pressing over time, given Luxembourg’s specificity in several respects. According to the 2024 Ageing Report of the EPC’s Ageing Working Group, Luxembourg might face an increase in age-related expenditure from 17.2% of GDP in 2022 to 27.9% in 2070, mostly due to pensions (+8.3% of GDP over the period). This is the largest increase expected among EU countries, a first particularity for Luxembourg. A second specificity is the importance of cross-border commuters in the Luxembourg economy: they represented 43% of total employment in 2022. In this context, the country’s social players are looking for avenues of reflection for future concrete proposals. This paper aims to contribute to that debate. It examines the day-after impact of hypothetical parametric changes in social contributions and personal income taxes (the “alternatives”) on the distribution of household disposable income and on total public receipts from these sources for Luxembourg. Given the importance of cross-border commuters for the country, we use a EUROMOD-based microsimulation model covering both resident and cross-border commuter households, the latter population constituting an essential and innovative extension to previous assessments. We emphasize the structural discrepancies between resident and cross-border households in terms of socio-economic status as well as gross labor and taxable income. Consequently, we show that total receipts from residents are greater than those from cross-border households, even when controlling for population size. Next, we examine 42 alternatives based on the concerns of a key Luxembourg social partner in the context of an ongoing public debate.
This examination considers the values achieved for a triplet of standard indicators, chosen for their simplicity and acceptability to a broad public within the framework of independent external expertise: total revenues (cross-border households included), the Gini inequality coefficient, and the poverty rate (the latter two for the resident population only). We then complete our detailed overview with an evaluation of each alternative along the selected dimensions taken together. Finally, we outline a basic “Global Performance Index”, which may shed light on conclusions derived from a one- or two-dimensional analysis. Although the analysis is specific to Luxembourg, the policy alternatives and methodology considered here may also be relevant for other countries with comparable socio-economic fundamentals, particularly in the context of the EU-wide concern regarding the fiscal implications of population ageing.
This paper develops a behavioral microsimulation framework to analyse how carbon pricing affects energy vulnerable households in Belgium. We focus on two groups: energy poor (EP) households, who devote a large share of their income to energy, and hidden energy poor (hEP) households, who spend little on energy as they severely restrict their consumption. Using eleven cross-sections of the Belgian Household Budget Survey (2003–2016), we first document structural differences between EP and hEP households with logistic regressions. We then estimate a demographically specified Quadratic Almost Ideal Demand System (Banks et al., 1997) and develop a two-stage residual inclusion procedure to address the endogeneity of energy poverty statuses. The resulting price and income elasticities by vulnerability profiles are used to simulate responses and welfare impacts of a carbon price on heating and transport fuels, consistent with the forthcoming EU ETS 2. Behavioral adjustments are highly heterogeneous: lower-income and energy vulnerable households show higher fuel price elasticities than wealthier groups, with hEP households responding most strongly to price increases. Consequently, EP households face substantial carbon costs but comparatively smaller welfare losses, whereas hEP households suffer disproportionately large welfare losses despite their low expenditure exposure. These results reveal horizontal equity concerns that are invisible in income-based metrics alone and highlight the need to integrate detailed energy vulnerability profiles into carbon pricing design.
Population projections indicate that by 2045, the Austrian population aged 65 and older will increase by approximately 47% compared to 2023. Since the likelihood of a cancer diagnosis increases with age, a corresponding rise in cancer cases is expected. To address this and support evidence-based decision-making, a model has been developed on behalf of the Ministry of Health to project cancer incidence, prevalence, and mortality within the population up to the year 2045. Our cancer projection model builds on the microsimulation model used by Statistics Austria for official population projections (Pohl et al., 2025). It introduces a new module for calculating cancer diagnoses and refines existing ones, such as the module for calculating mortality. A key advantage of microsimulation is its ability to account for individual characteristics, allowing factors such as existing diagnoses to influence future disease states and determine cause-specific mortality outcomes. In addition, microsimulation allows the model to be developed further in the future, e.g. through extensions such as the consideration of risk factors, as is already done in well-known microsimulation models such as OncoSim (Ruan et al., 2023). The model parameters are calculated using administrative register data, including the Central Population Register and Cause of Death Statistics, linked with the National Cancer Register, enabling detailed tracking of individual life histories.
References
Pohl, P., Slepecki, P., & Spielauer, M. (2025). STATSIM: Statistics Austria’s dynamic microsimulation model for official population projections. Statistical Journal of the IAOS, 18747655241308058.
Ruan, Y., Poirier, A., Yong, J., Garner, R., Sun, Z., Than, J., Brenner, D. R. (2023). Long-term projections of cancer incidence and mortality in Canada: The OncoSim All Cancers Model. Preventive Medicine, 168.
This course demonstrates how to model policies in EUROMOD—the EU tax-benefit microsimulation model—using Generative AI (GenAI). By bridging traditional policy modeling with cutting-edge technology, the session shows how to simulate and validate tax and social policy rules in a practical, hands-on environment. Rather than teaching EUROMOD itself, the course uses it as a primary case study for how GenAI can accelerate the broader policy research cycle.
Integrating GenAI with microsimulation enhances the quality, speed, and scope of research. Key applications include drafting or correcting policy descriptions, verifying model alignment with legislation, implementing code, and performing micro/macro-validation. As important caveats remain, the session also addresses best practices to ensure reliable results.
Drawing on my expertise in both microsimulation and GenAI, I provide a practical framework for how these tools complement one another. Having previously led GenAI workshops across various research settings, my goal is to equip analysts with the latest skills to produce timely, reliable, and actionable policy evidence.
A recent and growing literature studies policy preferences empirically using survey experiments. A frequent approach in welfare economics, cost-benefit analysis and political economy is to aggregate policy preferences assuming rationality and self-interest. In this paper we compare the two approaches in a specific policy domain, the tax treatment of married couples. We explore whether political economy arguments can explain the persistence of joint taxation in Germany. First, we use a sufficient statistics approach rooted in utility-maximizing behavior. With this approach we estimate the population shares of winners and losers from a revenue-neutral reform towards individual taxation, according to material self-interest. Second, we report on a large-scale survey experiment to elicit stated preferences among a representative sample of the German population. Both methods show that the tax treatment of couples in Germany is highly controversial. Based on material self-interest, the support for a reform towards individual taxation is slightly below the majority threshold. Since the race is so close, views on just taxation or on the status of marriage can actually make a difference. We find that, according to the preferences stated in the survey experiment, support for a reduction of marriage bonuses is lower than suggested by the analysis based on self-interest.