While prediction markets in decentralized finance aggregate dispersed information into probabilistic prices, their accuracy depends strongly on design, incentives, and environment. Under suitable incentives, markets can theoretically approximate the likelihood of events (Wolfers & Zitzewitz, 2004). However, recent agent-based research suggests that in prediction markets with order-based matching, biased traders with large capital endowments (so-called “whales”) can move prices away from fundamentals. Such distortions may persist when belief updating is slow and herding reinforces the deviations (Smart et al., 2026). These distortion dynamics thus depend not only on trader heterogeneity but also on the mechanisms translating demand into prices and spreads.
We analyze this challenge at the micro-level by simulating an automated market maker built on the logarithmic market scoring rule (LMSR). LMSR derives prices from a convex cost function over outstanding shares and secures continuous liquidity with bounded worst-case loss (Hanson, 2003; Chen & Pennock, 2007). Unlike price-impact systems, in which demand moves prices essentially without lag (Smart et al., 2026), LMSR embeds liquidity in the curvature of its cost function. This structural difference alters how capital accumulation and biased trading affect price formation. Our contribution is therefore not a replication of an order-book market but an analysis of distortion mechanisms under scoring-rule-based liquidity provision.
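For reference, the standard LMSR cost function and prices (not restated in the abstract; see Hanson, 2003) for $n$ mutually exclusive outcomes with outstanding share vector $q$ and liquidity parameter $b$ are
\[ C(q) = b \ln \sum_{i=1}^{n} e^{q_i/b}, \qquad p_i(q) = \frac{e^{q_i/b}}{\sum_{j=1}^{n} e^{q_j/b}}, \]
with worst-case loss bounded by $b \ln n$; a larger $b$ flattens the price response (more liquidity, slower adjustment), which is the curvature trade-off examined below.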
Our model approximates a binary event market with heterogeneous traders, including informed traders with high-precision signals, noise traders generating random order flows, arbitrage-oriented traders reacting to publicly available information, strategic traders with biased valuations and larger budgets, and conviction-driven traders who update beliefs slowly. A latent “true” probability evolves stochastically with occasional shocks. Agents update beliefs in log-odds space using heterogeneous learning weights and submit trades proportional to the gap between belief and LMSR price, scaled by both risk aversion and trading intensity.
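A minimal sketch of the agent step described above, assuming a single binary market; all function and parameter names are illustrative rather than taken from the model code:

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_belief(belief, signal, learning_weight):
    """Partial adjustment toward the agent's signal in log-odds space.

    learning_weight in [0, 1]: low values correspond to slow (stubborn)
    updating, high values to rapid incorporation of new information.
    """
    new_log_odds = (1 - learning_weight) * logit(belief) + learning_weight * logit(signal)
    return inv_logit(new_log_odds)

def desired_trade(belief, lmsr_price, risk_aversion, intensity):
    """Order size proportional to the belief-price gap, damped by risk aversion."""
    return intensity * (belief - lmsr_price) / risk_aversion
```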
Performance is evaluated both dynamically and at settlement. During trading we track mispricing (price minus true probability), volatility, trading volume, and the wealth distribution across trader types. At settlement, forecasting accuracy is measured using both the Brier score and the log loss. The experimental design focuses on three questions: (i) how the LMSR liquidity parameter affects the trade-off between responsiveness and stability; (ii) how large and persistent distortions become under high-budget biased trading; and (iii) how liquidity design interacts with stubborn belief updating.
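For a binary market settling with outcome $y \in \{0,1\}$ and final price $p$, these settlement metrics take their standard forms,
\[ \text{Brier} = (p - y)^2, \qquad \text{LogLoss} = -\bigl(y \ln p + (1 - y) \ln(1 - p)\bigr), \]
averaged over simulation runs (how the averaging is carried out across runs is our reading, not stated in the abstract).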
In our preliminary analyses, we test whether the distortion mechanisms identified in price-impact order-matching environments (Smart et al., 2026) also emerge under scoring-rule-based automated market makers, and to what extent liquidity curvature mitigates their persistence. The main limitations of this work are its stylized trading strategies, its abstracted order-book dynamics, and the exogeneity of the true probability.
References
Chen, Y., & Pennock, D. M. (2007). A Utility Framework for Bounded-Loss Market Makers. Proceedings of the Conference on Uncertainty in Artificial Intelligence, 49-56.
Hanson, R. (2003). Combinatorial Information Market Design. Information Systems Frontiers, 5(1), 107-119.
Smart, J., et al. (2026). Manipulation in Prediction Markets: An Agent-Based Modeling Experiment. arXiv.
Wolfers, J., & Zitzewitz, E. (2004). Prediction Markets. Journal of Economic Perspectives, 18(2), 107-126.
Population ageing represents one of the most significant long-term challenges for social and health care systems in Europe, with powerful implications for long-term care infrastructure. In Slovakia, demographic ageing is expected to intensify markedly over the coming decades, generating substantial pressure on the capacity and spatial distribution of residential care facilities for older persons. This paper aims to quantify the future demand for residential senior day-care facilities using the Slovak Labour Microsimulation Model (SLAMM). The analysis employs a dynamic microsimulation approach that models individual life-course transitions related to ageing, household structure, health status, and care dependency. Based on demographic projections of the Slovak population, the model simulates the evolution of the senior population and estimates the number of individuals requiring institutional or semi-institutional care services. The analysis is conducted at the district (LAU 1) level, allowing for the identification of significant regional disparities in both demographic ageing and projected care needs. The primary output of the model is a district-level forecast of demand for residential day-care facilities for seniors over the medium- and long-term horizon. Building on these demand estimates, the paper further quantifies the investment costs associated with expanding care infrastructure. Using unit cost estimates for the construction and capacity expansion of senior care facilities, the study derives projected capital expenditure requirements necessary to meet future demand under different demographic scenarios. The results provide a comprehensive picture of how population ageing will translate into spatially differentiated demand for senior care services in Slovakia. By linking microsimulation-based demand projections with infrastructure cost estimates, the paper offers an integrated analytical framework that supports evidence-based planning of long-term care capacity. The findings are particularly relevant for policymakers and regional authorities responsible for social service provision, as they highlight districts facing the most significant future capacity gaps and investment needs. Overall, the study demonstrates the usefulness of microsimulation modelling as a tool for strategic planning in the area of ageing and long-term care, enabling policymakers to anticipate demographic pressures and to design more efficient and equitable infrastructure investment strategies.
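As a stylized illustration of the final costing step, with all numbers and names hypothetical rather than taken from the study:

```python
def district_capex(projected_demand_places, current_capacity_places, unit_cost_per_place):
    """Capital expenditure needed to close a district's projected capacity gap.

    Illustrative only: in the study, demand comes from SLAMM projections and
    unit costs from construction and capacity-expansion cost estimates.
    """
    capacity_gap = max(0, projected_demand_places - current_capacity_places)
    return capacity_gap * unit_cost_per_place

# Hypothetical district: 420 projected places, 300 existing, EUR 90,000 per place
print(district_capex(420, 300, 90_000))  # 10,800,000
```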
Alignment is a critical calibration technique in microsimulation, ensuring individual-level transitions aggregate to known macro-targets. While indispensable for updating populations to match demographic projections or macroeconomic forecasts, the statistical properties of various alignment algorithms remain under-researched. This paper provides a systematic evaluation of alignment methods for discrete-time models to guide researchers in method selection.
We categorize existing methods based on a robust taxonomy: variable type (continuous vs. discrete), the nature of the outcome (exact match vs. stochastic), and time-step logic (continuous vs. discrete). Building on the work of Stephenson (2018), we demonstrate that most contemporary alignment methods can be unified under a single constrained optimization framework, differing only in their choice of objective functions. This unification allows us to establish previously unrecognized relationships between seemingly disparate techniques.
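A schematic statement of that unified framework (our notation, not the paper's): given predicted transition probabilities $\pi_1, \dots, \pi_N$ and a macro-target $T$, the aligned probabilities solve
\[ \min_{\pi'} \; D(\pi', \pi) \quad \text{subject to} \quad \sum_{i=1}^{N} \pi'_i = T, \quad 0 \le \pi'_i \le 1, \]
where different alignment methods correspond to different choices of the objective (distance) function $D$.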
Furthermore, we derive closed-form expressions and Taylor series approximations for computationally intensive iterative methods. These mathematical shortcuts allow modellers to reduce simulation run-times significantly while converting exact alignment processes into stochastic ones without losing desirable statistical properties. We then study the impact of these methods on the predicted distributions and covariance of outcomes between simulation runs.
Our analysis reveals that common techniques, such as Sorting by Predicted Probability (SBP) and Sorted By Difference between predicted Probability and a random number (SBD), are undesirable for microsimulation modelling as they can introduce biases and path dependence into the simulation outcomes. In contrast, parameter variation methods (i.e. recalibration of model coefficients such as intercept shifting) offer modellers control over the impact of alignment and better interpretability. Furthermore, methods like logit scaling, variance-weighted additive scaling for probabilities, and additive scaling for continuous variables offer greater robustness due to their foundations in probability theory. We conclude by providing a decision matrix for modellers, weighing the trade-offs between computational efficiency, distribution preservation, and variance reduction.
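A minimal sketch of logit scaling, one of the probability-theoretic methods highlighted above: all probabilities are shifted by a common amount in log-odds space until the expected number of events matches the macro-target. The function name and the bisection solver are our illustrative choices, not the paper's implementation:

```python
import numpy as np

def logit_scale(p, target_total, tol=1e-10, max_iter=200):
    """Shift all probabilities by a common log-odds amount so that their sum
    (the expected number of events) equals target_total.

    Adjusted probability: p'_i = a * p_i / (a * p_i + 1 - p_i), a = exp(delta).
    """
    p = np.asarray(p, dtype=float)
    lo, hi = -50.0, 50.0  # search bounds for the common shift delta
    for _ in range(max_iter):
        delta = 0.5 * (lo + hi)
        a = np.exp(delta)
        adjusted = a * p / (a * p + (1.0 - p))
        total = adjusted.sum()
        if abs(total - target_total) < tol:
            break
        if total < target_total:
            lo = delta
        else:
            hi = delta
    return adjusted

# Example: raw probabilities imply ~1.1 expected events; align to 2.0
print(logit_scale([0.2, 0.4, 0.5], 2.0).sum())  # ~2.0
```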
Matteo Richiardi (Centre for Microsimulation and Policy Analysis (CeMPA), University of Essex) — “Recent developments of the SimPaths dynamic microsimulation framework” (joint work with: Ashley Burdett, Aleksandra Kolndrekaj, Daria Popova, Liang Shi, David Sonnewald, Mariia Vartuzova, Justin van de Ven, Matteo Richiardi)
SimPaths is an open-source framework for modelling individual and household life course events, jointly developed at the Centre for Microsimulation and Policy Analysis and the University of Glasgow (Bronka et al., 2025). The framework is designed to project life histories through time, building up a detailed picture of career paths, family (inter)relations, health, and financial circumstances. The modular nature of the SimPaths framework is designed to facilitate analysis of alternative assumptions concerning the tax and benefit system, sensitivity to parameter estimates and alternative approaches for projecting labour/leisure and consumption/savings decisions. SimPaths builds upon standardised assumptions and data sources, which facilitates adaptation to alternative countries.
The presentation will focus on recent developments of the SimPaths framework, centred around:
• Introduction of new health and well-being variables (MCS, PCS, life satisfaction) for the UK model
• Development of a new wealth module for the UK model
• Updated release of the model for Italy
• First release of the models for Poland, Spain, and Germany
• Development of macro modules (all countries)
• Development of migration modules (all countries)
• Development of inter-household transfer modules (all countries)
• Improved model documentation
Strategies for internal validation will also be discussed.
Interstate conflict networks are frequently represented as mappings of dyadic feature combinations, which can be encoded as graphs. Such graphs exhibit structured macro-patterns, including but not limited to positions, clusters, and polarization, that call for mechanistic interpretation and explanation. The research underlying this abstract develops an agent-based model (ABM) to reconstruct annual interstate conflict networks derived from the Conflict Barometer of the Heidelberg Institute for International Conflict Research (HIIK) (HIIK, various years). Rather than treating these networks as mere illustrations, we use them as empirical macro-targets for simulation-based explanation in the tradition of generative social science (Epstein, 2006).
Between 2014 and 2024, the HIIK Conflict Barometer systematically coded political conflicts worldwide on a yearly basis, using a qualitative, measure-based methodology and an ordinal intensity scale (1-5) distinguishing disputes, crises, and wars. The resulting annual interstate networks represent states as nodes and conflict intensity as weighted, undirected edges. These graphs reveal recurring structural features, such as high-betweenness states, state clustering, and varying intensity distributions; however, they provide no insight into the micro-processes that generate them. Research on international conflict networks emphasizes that dyadic conflict behavior is embedded in broader relational structures characterized by diffusion and bloc formation (Maoz, 2010), reinforcing the need for dynamic modeling.
To address this gap and to support future prediction of dyads with the potential to escalate, we construct an ABM at monthly granularity in which state agents are endowed with heterogeneous capabilities, institutional characteristics, and endogenous stress variables. In our model, dyadic conflict intensity evolves through probabilistic escalation and de-escalation processes driven by baseline tension and retaliation, cost sensitivity and fatigue, network spillover or conflict transmissibility, as well as alignment pressure via triadic closure and alliances. Third-party mechanisms further allow mediation effects to emerge endogenously. Annual networks are obtained by aggregating simulated monthly intensities. They are then evaluated and calibrated through comparison with HIIK data, including the empirical distribution of dyadic intensity levels, rank stability of high-betweenness states, network density and component structure, as well as modular clustering patterns. Our validation relies on out-of-sample yearly comparisons and tests.
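A simplified sketch of the monthly dyadic update, covering only the retaliation, spillover, and fatigue channels (triadic closure, alliances, and mediation are omitted); parameter names and functional forms are illustrative, not the calibrated model:

```python
import numpy as np

rng = np.random.default_rng(0)

def monthly_step(intensity, fatigue, params):
    """One monthly escalation/de-escalation step over all dyads.

    intensity: symmetric (n x n) array of dyadic conflict intensities (0-5).
    fatigue:   length-n array of accumulated conflict fatigue per state.
    """
    n = intensity.shape[0]
    exposure = intensity.sum(axis=1) / (5.0 * (n - 1))  # normalised network spillover
    new = intensity.copy()
    for i in range(n):
        for j in range(i + 1, n):
            p_up = (params["baseline"]
                    + params["retaliation"] * intensity[i, j] / 5.0
                    + params["spillover"] * (exposure[i] + exposure[j]) / 2.0)
            p_down = params["cost_sensitivity"] * (fatigue[i] + fatigue[j]) / 2.0
            if intensity[i, j] < 5 and rng.random() < min(p_up, 1.0):
                new[i, j] = new[j, i] = intensity[i, j] + 1
            elif intensity[i, j] > 0 and rng.random() < min(p_down, 1.0):
                new[i, j] = new[j, i] = intensity[i, j] - 1
    return new
```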
Our preliminary results indicate that a parsimonious combination of dyadic retaliation and network-mediated spillovers can already be used to reproduce mediation and polarization patterns observed in HIIK networks, while alliance-related parameters affect cluster rigidity rather than aggregate conflict prevalence. Our findings thus suggest that interstate conflict networks are best understood as macro-representations of micro-interactions rather than static representations of antagonism. Our work further demonstrates how conflict data can anchor ABM calibration. Substantively, it clarifies structural drivers of mediation and alliance formation in interstate conflicts and provides a framework for counterfactual and experimental analyses.
References
Epstein, J. M. (2006). Generative Social Science. Princeton University Press.
Heidelberg Institute for International Conflict Research (HIIK). (Various years). Conflict Barometer. Heidelberg.
Maoz, Z. (2010). Networks of Nations. Cambridge University Press.
A recent and growing literature studies policy preferences empirically using survey experiments. A frequent approach in welfare economics, cost-benefit analysis, and political economy is to aggregate policy preferences assuming rationality and self-interest. In this paper we compare the two approaches in a specific policy domain: the tax treatment of married couples. We explore whether political economy arguments can explain the persistence of joint taxation in Germany. First, we use a sufficient statistics approach rooted in utility-maximizing behavior. With this approach we estimate the population shares of winners and losers from a revenue-neutral reform towards individual taxation, according to material self-interest. Second, we report on a large-scale survey experiment to elicit stated preferences among a representative sample of the German population. Both methods show that the tax treatment of couples in Germany is highly controversial. Based on material self-interest, support for a reform towards individual taxation is slightly below the majority threshold. Because the margin is so narrow, views on just taxation or on the status of marriage can make a difference. We find that, according to the preferences stated in the survey experiment, support for a reduction of marriage bonuses is lower than suggested by the analysis based on self-interest.
This paper analyses the distributional effects of a fiscally neutral tax reform in Croatia that shifts the tax burden from consumption to labour income, capital income, and property. Such a reform can be considered justified given the imbalances of the Croatian tax system, which is characterised by an exceptionally high share of indirect taxes and relatively low taxation of labour, capital, and property income compared to the EU average, contributing to regressivity and greater income inequality.
The proposed reform consists of increasing the progressivity of the personal income tax (through the introduction of higher tax rates), raising taxation of capital income and property, while simultaneously reducing the standard VAT rate and expanding the reduced VAT rate. The analysis is based on the application of microsimulation models of direct and indirect taxes, using EU-SILC and Household Budget Survey (HBS) data, with corrections to the income distribution using administrative data to ensure a credible simulation of direct and indirect taxes.
In addition to distributional effects, the paper also assesses the impact of the reform on efficiency using a behavioural labour supply model. The results show that the reform does not have a significant negative effect on labour supply, increases the fairness of the tax system by reducing inequality, and delivers the largest gains to lower-income households and more vulnerable social groups, while the richest households incur losses due to increased direct tax burdens. Overall, the reform increases progressivity and the redistributive effect of the system without compromising economic efficiency.
Ireland is an outlier among European countries in that it has no automatic indexation of tax and welfare parameters. As a result, the annual Budget announcement plays a significant role, both in shaping the government’s discretionary policy priorities and in influencing people’s expectations about the future. In this project, we study Budget Day as a signalling device along two margins. First, we ask how discretionary policy choices across spending areas respond to political economy forces: media attention, organised interest-group pressure, and electoral cycles. Second, we examine whether Budget announcements shift individuals’ beliefs about redistribution and uncertainty about the future, and whether these responses are heterogeneous by predicted exposure to the announced measures. In terms of conceptual framework and methods, we build on the literature on central bank communication, which uses language analysis and treats policy announcements as signals that shape expectations (Hansen et al., 2018; Ehrmann and Talmi, 2020). A related strand of work studies the political economy of budget trade-offs (Adolph et al., 2020) and the role of information flows in distinguishing anticipated from unanticipated fiscal policy in budget announcements (Leeper et al., 2013). Balladares and Garcia-Miralles (2025) trace fiscal drag in Spain over time, while Boyd and McIndoe-Calder (2025) focus on personal income tax in the Irish context over a shorter horizon (2019-2023), using microsimulation modelling with EUROMOD. However, there is little long-run evidence on Irish budget cycles or their political determinants beyond the aggregate analysis in Cousins (2007). In the first part of this paper, we build a new historical dataset of Irish budgets from the mid-1990s onward, coding discretionary changes by broad policy area. We benchmark each year’s policy package against a counterfactual that mechanically indexes key tax and welfare parameters to price and wage inflation. Deviations from this benchmark provide a transparent measure of discretionary policy priorities. We then relate these deviations to (i) topic-level media salience indices constructed using text analysis of newspaper coverage, (ii) measures of lobbying intensity proxied by pre-budget submissions and public representations by sector, and (iii) electoral timing and macroeconomic conditions. In the second part, we analyse individual-level survey data to examine perceptions of redistribution, fairness, and uncertainty about the future in relation to budget announcements. Microsimulation using EUROMOD allows us to account for heterogeneity in responses, in particular whether individuals or households that are likely to gain or lose from policy changes react differently. We exploit variation in interview timing relative to Budget Day using the Irish sample of the European Social Survey, complemented by Eurobarometer data. Our work provides a systematic assessment of Irish budget priorities and contributes to the debate on whether indexation would reduce fiscal drag and depoliticise distributional choices, or whether discretionary budgets are instead responsive to public preferences.
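A stylized sketch of the indexation benchmark described above; the uprating rule, names, and numbers are illustrative, not the paper's exact specification:

```python
def indexed_counterfactual(base_value, price_inflation, wage_inflation, rule="prices"):
    """Mechanically uprate a tax or welfare parameter for one budget year."""
    rate = price_inflation if rule == "prices" else wage_inflation
    return base_value * (1 + rate)

def discretionary_deviation(announced_value, base_value, price_inflation, wage_inflation):
    """Announced parameter minus its mechanically indexed counterfactual."""
    return announced_value - indexed_counterfactual(base_value, price_inflation, wage_inflation)

# Hypothetical example: a EUR 12,000 parameter, 2% price inflation, announced at EUR 12,500
print(discretionary_deviation(12_500, 12_000, 0.02, 0.04))  # 260.0
```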
Because causal estimands are unobservable in reality, benchmarking causal effect estimators is inherently challenging. To overcome this, simulation studies are increasingly used to evaluate causal inference methods under realistic conditions such as small sample sizes, limited overlap, and complex confounding. Yet it remains unclear to what extent conclusions drawn from simulated settings extrapolate to real-world performance. Existing benchmarks rely on a wide range of outcome-generation strategies, from parametric structural models to flexible machine learning, without clear guidance on their reliability. This paper formalizes simulation-based benchmarking and introduces a framework to characterize the types of bias that may arise from different generative strategies, focusing particularly on the Average Treatment Effect (ATE). We classify and compare several approaches proposed in the causal inference literature, including parametric structural models[1], machine-learning conditional mean–variance models[2], and adversarial generative models such as Wasserstein GANs[3]. We apply these modeling strategies to commonly used benchmark datasets in which we synthetically define an outcome-generating function so that the true ATE is known and systematic evaluation is possible. Across Monte Carlo experiments, we compare estimator performance in terms of bias, variance, and mean squared error. We examine how closely these performance metrics align across different simulation strategies relative to the structural data-generating process, and whether commonly used generator diagnostics are predictive of estimator behavior. Our preliminary results indicate that the choice of simulation strategy can substantially alter conclusions about estimator reliability relative to the underlying data distribution. These findings underscore the importance of principled design and validation of simulation frameworks when benchmarking causal methods.
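A minimal sketch of the benchmarking loop: a synthetic outcome-generating function with a known ATE, repeated Monte Carlo draws, and bias/variance/MSE of an estimator. The generator and the IPW estimator here are placeholders for illustration, not the strategies compared in the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def simulate_dataset(n, true_ate=1.0):
    """Synthetic outcome-generating function with a known ATE (illustrative only)."""
    x = rng.normal(size=n)                     # confounder
    t = rng.binomial(1, 1 / (1 + np.exp(-x)))  # treatment depends on the confounder
    y = 2 * x + true_ate * t + rng.normal(size=n)
    return x, t, y

def ipw_ate(x, t, y):
    """Inverse-probability-weighted ATE with a logistic propensity model."""
    e = LogisticRegression().fit(x.reshape(-1, 1), t).predict_proba(x.reshape(-1, 1))[:, 1]
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

true_ate, estimates = 1.0, []
for _ in range(200):                           # Monte Carlo replications
    estimates.append(ipw_ate(*simulate_dataset(500, true_ate)))
estimates = np.array(estimates)
bias, variance = estimates.mean() - true_ate, estimates.var()
mse = bias ** 2 + variance
print(bias, variance, mse)
```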
[1] T. Wendling, K. Jung, A. Callahan, A. Schuler, N. H. Shah, and B. Gallego, “Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases,” Stat. Med., vol. 37, no. 23, pp. 3309–3324, Oct. 2018, doi: 10.1002/sim.7820.
[2] A. Schuler, K. Jung, R. Tibshirani, T. Hastie, and N. Shah, “Synth-Validation: Selecting the Best Causal Inference Method for a Given Dataset,” Oct. 31, 2017, arXiv: arXiv:1711.00083. doi: 10.48550/arXiv.1711.00083.
[3] S. Athey, G. W. Imbens, J. Metzger, and E. Munro, “Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations,” J. Econom., vol. 240, no. 2, p. 105076, Mar. 2024, doi: 10.1016/j.jeconom.2020.09.013.
Dynamic microsimulation models such as SimPaths are increasingly used to evaluate long-term policy impacts by generating synthetic trajectories for individuals and households. Their credibility, however, depends on rigorous validation: demonstrating that simulated outcomes can reliably reproduce observed data. Despite their growing role in policy analysis, validation practices remain fragmented and only partially automated (e.g., O’Donoghue et al., 2015; Gosseries & Van der Heyden, 2018). This paper presents ongoing work on strengthening validation frameworks in SimPaths, with a focus on discriminator-based methods and econometric consistency checks. First, we apply classifiers (e.g., Gradient Boosted Machines) to distinguish between simulated and survey data. Discriminator accuracy provides an interpretable quantitative score of similarity: the closer performance is to random guessing, the more realistic the simulated data. Second, we explore the re-estimation of key behavioural regressions using simulated data and assess parameter recovery. This helps identify whether discrepancies arise from implementation issues, estimation limitations, or structural differences between datasets. By combining machine-learning discriminators with regression-based diagnostics, the paper contributes to more automated, transparent, and reproducible validation practices for complex microsimulation models.
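A minimal sketch of the discriminator-based check, assuming numeric feature matrices have already been prepared from the simulated and survey data (the interface and model settings are illustrative; SimPaths-specific feature preparation is omitted):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def discriminator_score(simulated, observed):
    """Train a classifier to tell simulated from survey records.

    Returns cross-validated accuracy: values near 0.5 mean the discriminator
    cannot separate the two sources, i.e. the simulated data look realistic.
    """
    X = np.vstack([simulated, observed])
    y = np.concatenate([np.ones(len(simulated)), np.zeros(len(observed))])
    clf = GradientBoostingClassifier()
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
```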