Is machine learning able to predict the downfall of societies?
Throughout human history, great empires have repeatedly risen and fallen: the Maya, Rome, Byzantium. What makes these cases so fascinating is the question of how civilizations that achieved extraordinary levels of advancement could nevertheless fragment, collapse, or even disappear entirely. This topic has long been debated in historical scholarship. Books such as Jared Diamond’s “Collapse” examine historical case studies and conclude that modern societies are not immune to decline either.
Personally, I became interested in whether these conclusions could be examined in a data-driven way and whether machine learning might even be capable of forecasting such collapses in the future. In essence, I attempted to address the “Holy Grail” question of cliodynamics (greetings to Peter Turchin): are the mathematical patterns of societal decline universal?
To do this, I combined qualitative research on the collapse of societies with quantitative data. The perfect source for this was Seshat, the ancient Egyptian goddess of writing, or more precisely, the Seshat Global History Databank, an international research project founded in 2011 for the quantitative comparative analysis of historical societies. The database contains structured information on hundreds of political systems spanning several thousand years and translates historical developments into comparable variables such as administrative capacity, military organization, infrastructure and institutional complexity.
The idea was to identify well-known historical collapse cases and allow the model itself to learn which structural patterns typically preceded them. Out of all political systems contained in the dataset, around 30 historical cases were selected in which the scholarly literature shows a relatively broad consensus regarding severe phases of institutional destabilization, territorial fragmentation, external conquest, or fundamental systemic transformation.
The definition of societal or state “collapse” is itself one of the most contested issues in historical and social scientific research. While some scholars such as Jared Diamond primarily emphasize ecological overexploitation, resource scarcity, and poor decision-making, Joseph Tainter interprets collapse more as the loss of societal complexity and administrative capability. Other approaches focus on political instability, elite conflict, institutional decay, economic crises, or external conquest. As a result, no universal scientific consensus exists regarding when a society has truly “collapsed.”
The cases were therefore not selected randomly, but according to several criteria. First, they had to be historically well documented in order to allow transparent qualitative coding. Second, preference was given to societies repeatedly discussed in the comparative collapse literature as examples of state breakdown, imperial overstretch, or institutional erosion, such as the Akkadian Empire, the late Roman Republic, the Ming Dynasty, the Classic Maya system, or the Inca Empire. Third, care was taken to include different forms of societal disintegration. The dataset therefore includes not only classic state collapses, but also territorial fragmentation, external conquests, dynastic disintegration processes, and deep political transformations such as the transition from the Tokugawa Shogunate to the Meiji Restoration.

These examples also demonstrate that “collapse” can take very different forms historically. While the Akkadian Empire or the Inca Empire disintegrated relatively abruptly, cases such as the Roman Republic or Tokugawa Japan represented deep political transformations with substantial societal continuity. The purpose of the qualitative coding was therefore primarily to identify recurring structural patterns of severe destabilization across different historical eras.
But states have not only collapsed in ancient history. In modern times, Yugoslavia and the Soviet Union also disintegrated. Scholars likewise speak of “failed states” such as Haiti, Iraq, or Afghanistan. I therefore expanded the dataset with three additional modern data sources: the World Bank, the Varieties of Democracy Institute (V-Dem) and conflict-related data from the Uppsala Conflict Data Program (UCDP). These datasets contain information on economic development, population dynamics, democracy, corruption, rule of law and the frequency and intensity of armed conflict, thereby enabling a quantitative analysis of modern institutional stability and fragility.
Of course, a chariot army in ancient Mesopotamia cannot be directly compared to nuclear deterrence or modern cyber warfare. Abstract categories therefore had to be created in order to harmonize the datasets into shared meta-features. These include administrative capability, military organization, infrastructure, information systems and settlement complexity. The model thus compares not individual technologies, but structural functions of societies, for example how effectively a state processes information, controls territory or organizes resources.
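A minimal sketch of how such a harmonization could look. The five meta-features are the ones named above, but every source variable name and the mapping itself are hypothetical placeholders, not actual Seshat, World Bank or V-Dem codes and not the project's real implementation. It also assumes that each source variable has already been scaled to the range 0 to 1.

```python
from statistics import mean

# Hypothetical mapping from shared meta-features to source variables.
# All variable names are illustrative, not real Seshat or V-Dem codes.
META_FEATURES = {
    "administrative_capability": ["admin_levels", "professional_officials", "gov_effectiveness"],
    "military_organization": ["military_levels", "professional_soldiers", "military_spending"],
    "infrastructure": ["roads", "irrigation", "electricity_access"],
    "information_systems": ["writing_system", "record_keeping", "press_freedom"],
    "settlement_complexity": ["settlement_levels", "largest_city_share", "urbanization"],
}


def harmonize(record):
    """Collapse source-specific variables (assumed scaled to 0..1) into meta-features.

    Variables missing from a record are skipped; a meta-feature with no
    available variable is set to None.
    """
    out = {}
    for meta, variables in META_FEATURES.items():
        values = [record[v] for v in variables if record.get(v) is not None]
        out[meta] = mean(values) if values else None
    return out


# A made-up historical record and a made-up modern record end up
# in the same five-dimensional comparison space.
ancient = {"admin_levels": 0.5, "professional_officials": 1.0, "roads": 1.0, "writing_system": 1.0}
modern = {"gov_effectiveness": 0.25, "electricity_access": 0.75, "press_freedom": 0.5, "urbanization": 0.5}

print(harmonize(ancient))
print(harmonize(modern))
```

Averaging within a meta-feature is only one possible design choice; it has the advantage that a record can contribute even when only some of its source variables are documented.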

After harmonization, the historical and modern datasets were merged into a common analytical dataset. The resulting dataset contains several thousand historical and modern observations within a shared structural comparison space. Instead of comparing individual civilizations directly, the model analyzes recurring institutional patterns that historically appeared before phases of severe destabilization or systemic crisis.
For the machine learning algorithm, I deliberately chose a Random Forest approach because the dataset is relatively small and historically heterogeneous. Random Forests work robustly with mixed variable types, are less sensitive to outliers, and allow for substantially better interpretability of results. Neural networks, for example, often function as “black boxes” and require extremely large datasets, whereas Random Forests allow transparent analysis of the most important influencing variables and are therefore better suited for exploratory social scientific research with limited case numbers.
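To illustrate the interpretability argument, here is a small sketch using scikit-learn's RandomForestClassifier on purely synthetic data. The feature names reuse the meta-features from this article; the data, the labeling rule and the hyperparameters are assumptions made for the example and say nothing about the real results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = [
    "administrative_capability",
    "military_organization",
    "infrastructure",
    "information_systems",
    "settlement_complexity",
]

# Synthetic stand-in data: 300 observations with values in 0..1.
rng = np.random.default_rng(0)
X = rng.random((300, len(FEATURES)))

# Synthetic label: "pre-collapse" when administrative capability and
# information systems are jointly low. This rule is invented for the demo.
y = ((X[:, 0] + X[:, 3]) < 0.8).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to 1 and can be read off directly.
ranked = sorted(zip(FEATURES, model.feature_importances_), key=lambda p: -p[1])
for name, importance in ranked:
    print(f"{name:28s} {importance:.3f}")
```

Because the synthetic label depends only on two of the five columns, those two should dominate the ranking, which is the kind of transparency a neural network does not offer out of the box.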
To train the algorithm, I focused on 166 historical systems with sufficient data coverage, including around 30 collapse cases, plus 13 modern collapse states. In order to focus specifically on the pre-collapse period, temporal windows were defined based on theoretical assumptions from historical collapse and state failure research. For historical societies, a period of 200 years before documented disintegration or transformation processes was selected, since institutional erosion, fiscal overstretch, elite conflict or administrative fragmentation in premodern empires often unfolded over many generations. Peter Turchin’s work on long-term “secular cycles” and Joseph Tainter’s theory of the gradual loss of societal complexity both describe historical collapse processes as long-term structural developments rather than sudden isolated events (Tainter, 1988; Turchin & Nefedov, 2009).
For modern states, by contrast, a much shorter pre-collapse window of 40 years was used. Modern political systems change far more rapidly due to accelerated communication, economic, and conflict dynamics. Research on state failure and civil wars suggests that institutional destabilization in modern states often escalates within only a few decades, for example through civil wars, ethnopolitical fragmentation, economic crises, or external interventions (Rotberg, 2004; Fearon & Laitin, 2003). The differing time windows were therefore intended to reflect the fundamentally different historical speeds of societal transformation. In order to avoid mixing short-term noise in modern annual data with long-term historical developments, modern time series were smoothed using rolling averages so that only longer-term trends were interpreted.
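The two time windows and the smoothing step can be sketched as follows. The window lengths of 200 and 40 years come from the text; the function names, the choice to count the collapse year itself as part of the window, and the smoothing width are assumptions for illustration. Years before the common era are written as negative numbers.

```python
HISTORICAL_WINDOW = 200  # years before documented disintegration (from the text)
MODERN_WINDOW = 40       # years before modern state collapse (from the text)


def label_pre_collapse(obs_year, collapse_year, window):
    """Return 1 if an observation lies within `window` years before collapse, else 0."""
    if collapse_year is None:  # system never coded as collapsed
        return 0
    return int(0 <= collapse_year - obs_year <= window)


def rolling_mean(values, width):
    """Trailing rolling average; early entries use as many points as are available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - width + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Soviet Union, dissolved in 1991: an observation from 1960 falls inside
# the 40-year window, one from 1940 does not.
print(label_pre_collapse(1960, 1991, MODERN_WINDOW))
print(label_pre_collapse(1940, 1991, MODERN_WINDOW))

# Hypothetical premodern polity that disintegrates in 2150 BCE.
print(label_pre_collapse(-2300, -2150, HISTORICAL_WINDOW))

# Smoothing a short, made-up annual series with a width of 2.
print(rolling_mean([1, 2, 3, 4], 2))
```

A trailing window is used here so that a smoothed value never depends on years that come after it, which matters when the label itself is defined by a future event.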
During training and the first evaluations of the predictions, it quickly became apparent that the system mainly identified chronically fragile modern states. However, the goal was not merely to identify fragile systems, but also states whose institutional structure visibly deteriorates over time. A poor but stable country is not the same as a state experiencing rapidly declining structural capacity, where the probability of collapse becomes significantly higher.
To improve the model’s ability to distinguish these cases, I integrated so-called dynamic features into the analysis. These features capture not only the absolute condition of a system, but also how rapidly its structure changes over time. For modern states, multi-year changes in key structural variables were calculated, including administrative capacity, infrastructure capability, military strain, and institutional stability.
For historical societies, by contrast, changes between documented observation phases were used, since historical time series often contain larger temporal gaps and do not provide annual data. Modern states, however, could be analyzed using standardized ten-year windows. This allowed the model to better distinguish between permanently weak but stable systems and genuinely escalating destabilization processes.
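A rough sketch of the two kinds of dynamic features described above: ten-year differences for modern annual data and phase-to-phase differences for historical observations. The function names and all numbers are made up for illustration; the actual feature definitions may differ.

```python
def modern_change(series, lag=10):
    """series maps year -> value. Returns year -> value(year) - value(year - lag)."""
    return {y: v - series[y - lag] for y, v in series.items() if (y - lag) in series}


def historical_change(phases):
    """phases is a list of (year, value) sorted by year.

    Returns (year, change relative to the previous documented phase),
    regardless of how many years lie between the two phases.
    """
    return [(y1, v1 - v0) for (y0, v0), (y1, v1) in zip(phases, phases[1:])]


# Two hypothetical modern states with the same low level in 2010.
poor_but_stable = {2000: 0.25, 2010: 0.25}
declining = {2000: 0.75, 2010: 0.25}

print(modern_change(poor_but_stable))  # no change over ten years
print(modern_change(declining))        # strong decline over ten years

# Hypothetical premodern polity with irregular observation phases.
print(historical_change([(-300, 0.5), (-200, 0.75), (-50, 0.25)]))
```

Both hypothetical states have the same absolute value in 2010, so a model that sees only levels cannot separate them; the change feature is what distinguishes the deteriorating case.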
For training, the data were split into 80% training data and 20% test data. Despite the relatively small number of cases, the model’s performance on the test set proved remarkably stable. The Random Forest correctly identified 95 out of 99 stable cases and correctly classified 64 out of 73 pre-collapse cases. Only 4 stable cases were incorrectly flagged as collapse patterns (false positives), while 9 pre-collapse cases were missed (false negatives). This corresponds to an overall accuracy of 92.4%. Precision reached 94.1% and recall (sensitivity) 87.7%, so the combined F1-score of approximately 0.91 indicates a robust balance between sensitivity and precision.
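The reported metrics follow directly from the four confusion-matrix counts given above (99 stable and 73 pre-collapse test observations). The short calculation below only re-derives them; the variable names are mine.

```python
# Confusion-matrix counts taken from the text.
tn, fp = 95, 4   # stable cases: correctly identified / wrongly flagged
tp, fn = 64, 9   # pre-collapse cases: correctly classified / missed

total = tn + fp + tp + fn                 # 172 test observations
accuracy = (tp + tn) / total              # share of all cases classified correctly
precision = tp / (tp + fp)                # share of collapse flags that were right
recall = tp / (tp + fn)                   # share of pre-collapse cases that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy  {accuracy:.3f}")
print(f"precision {precision:.3f}")
print(f"recall    {recall:.3f}")
print(f"F1        {f1:.3f}")
```

Precision is higher than recall here, meaning the model is more likely to miss a pre-collapse case than to raise a false alarm.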
A closer look at the variables most important for the identification of pre-collapse phases reveals that societal transformation and the ability of states to process information play a particularly important role. Historical and modern pre-collapse phases appear to be characterized less by isolated shocks than by structural change and institutional overload.

The proxy “Settlement Change”, for example, captures shifts in population and settlement structures, urbanization, demographic concentration, and migration patterns. Historically, one frequently observes strong migration flows, urban crises, depopulation of regions, refugee movements, supply crises, and overstretched cities before major crises or state collapse, such as during the late Roman urban crisis.
“Information Complexity” indirectly includes democracy, rule of law, openness of information systems, and institutional transparency. This suggests that the model may indeed recognize that closed, highly controlled systems with weak information processing more frequently resemble historical pre-collapse phases. Theoretically, this is highly plausible. Many studies on state failure show that information blockages, elite isolation, weak feedback mechanisms and the inability to tolerate criticism are typical crisis patterns.
The model actually identifies relatively clear structural patterns preceding both historical and modern destabilization processes. Particularly striking is that short-term crisis variables are not dominant. Instead, long-term structural features of societal organization appear far more important. This suggests that many historical and modern crisis systems were primarily shaped by tensions within complex organizational networks.
Interestingly, military organization appears somewhat less dominant than expected. This argues against simplistic “war causes collapse” explanations and instead supports models of institutional overstretch and structural erosion as described by scholars such as Joseph Tainter or Peter Turchin. At the same time, the results show that complexity alone does not cause collapse. Rather, collapse risk appears to increase when highly complex organizational systems coincide with declining institutional coherence, infrastructural strain or weakening administrative capability.
The trained model was then applied to the dataset of the remaining modern states. The model estimates the probability that a modern system exhibits structural similarities to historical pre-collapse configurations.
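How such a scoring step could look with scikit-learn is sketched below. The training data are synthetic, the three states are hypothetical placeholders, and the labeling rule is invented; the sketch only shows the mechanics of turning a fitted Random Forest into a per-state score via predict_proba.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic training data over five meta-features scaled to 0..1.
rng = np.random.default_rng(1)
X_train = rng.random((400, 5))
# Invented rule: low average structural capacity counts as "pre-collapse".
y_train = (X_train.mean(axis=1) < 0.45).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=1)
model.fit(X_train, y_train)

# Hypothetical modern states that were not part of the training data.
states = {
    "State A": [0.9, 0.9, 0.9, 0.9, 0.9],
    "State B": [0.5, 0.5, 0.5, 0.5, 0.5],
    "State C": [0.1, 0.1, 0.1, 0.1, 0.1],
}
X_new = np.array(list(states.values()))

# Column of predict_proba that belongs to the pre-collapse class (label 1).
positive = list(model.classes_).index(1)
scores = model.predict_proba(X_new)[:, positive]

ranking = sorted(zip(states, scores), key=lambda p: -p[1])
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

For a Random Forest, this value is the average of the class probabilities across the individual trees, so it is best read as a similarity score relative to the training patterns rather than as a calibrated forecast.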

The highest model scores are concentrated disproportionately in states characterized by heavy institutional strain, limited administrative resilience, strong dependence on centralized power structures, or persistent geopolitical tensions. Particularly authoritarian and highly centralized systems, states with weak infrastructure, or countries affected by long-term conflict dynamics display elevated similarities to historical stress patterns.
The fact that China achieves relatively high scores despite its economic and technological strength suggests that large, highly complex systems historically often generate similar patterns of institutional tension, regardless of whether they are democratic or authoritarian.
Particularly interesting is that even stable Western democracies do not approach zero values. Germany, France, or Norway rank substantially lower than fragile systems, but still exhibit certain patterns historically associated with phases of increasing complexity, institutional strain, or societal tension. The model therefore seems to measure less a state of absolute stability than the degree of institutional, demographic or organizational pressure a system must simultaneously absorb.

The fact that India also reaches comparatively high values does not necessarily imply impending destabilization. More likely, the model recognizes characteristics historically associated with heavily stressed large-scale systems, such as enormous population density, high infrastructural complexity, strong centralization and accelerated societal transformation.
Despite its economic strength, the United States ranks above many European states. The model may be responding here particularly to polarization, institutional tensions and the growing complexity of a global power center, factors historically more often associated with late-imperial stress phases in large systems.
Let us return to the question of cliodynamics. Whether my machine learning model has truly uncovered the “Holy Grail” of historical prediction remains a matter of perspective. What seems clear, however, is that history may not repeat itself exactly, but it leaves behind mathematical patterns. The model suggests that modernity and antiquity speak a surprisingly similar language when confronted with structural pressure, and that the collapse of societies is not a sudden fate, but rather the mathematical result of complexity itself. The more complex societies become, the more important the question becomes whether they can still adapt quickly enough to their own transformation.
By Maike Martina Heinrich, May 2026
Title Photo: Dave Photoz on Unsplash
Code: https://github.com/Maike-H/Glimpse/blob/main/Collapse
References:
Acemoglu, D., & Robinson, J. A. (2012). Why nations fail: The origins of power, prosperity, and poverty. Crown Business.
Butzer, K. W. (2012). Collapse, environment, and society. Proceedings of the National Academy of Sciences, 109(10), 3632–3639. https://doi.org/10.1073/pnas.1114845109
Cederman, L.-E., Wimmer, A., & Min, B. (2010). Why do ethnic groups rebel? New data and analysis. World Politics, 62(1), 87–119.
Center for Systemic Peace. (n.d.). Polity Project. Retrieved May 10, 2026, from https://www.systemicpeace.org/polityproject.html
Climate Research Unit. (n.d.). CRU time-series datasets. Retrieved May 10, 2026, from https://crudata.uea.ac.uk/cru/data/hrg/
Cline, E. H. (2014). 1177 B.C.: The year civilization collapsed. Princeton University Press.
Diamond, J. M. (2005). Collapse: How societies choose to fail or succeed. Viking Press.
Fearon, J. D., & Laitin, D. D. (2003). Ethnicity, insurgency, and civil war. American Political Science Review, 97(1), 75–90.
Food and Agriculture Organization of the United Nations. (n.d.). FAOSTAT. Retrieved May 10, 2026, from https://www.fao.org/faostat/en/
Fukuyama, F. (2011). The origins of political order: From prehuman times to the French Revolution. Farrar, Straus and Giroux.
Fukuyama, F. (2014). Political order and political decay: From the Industrial Revolution to the globalization of democracy. Farrar, Straus and Giroux.
Goldstone, J. A. (1991). Revolution and rebellion in the early modern world. University of California Press.
Integrated Public Use Microdata Series. (n.d.). IPUMS International. Retrieved May 10, 2026, from https://international.ipums.org/international/
Mann, M. (1986). The sources of social power: Volume 1, A history of power from the beginning to A.D. 1760. Cambridge University Press.
Mann, M. (1993). The sources of social power: Volume 2, The rise of classes and nation-states, 1760–1914. Cambridge University Press.
Middleton, G. D. (2017). Understanding collapse: Ancient history and modern myths. Cambridge University Press.
Migdal, J. S. (1988). Strong societies and weak states: State-society relations and state capabilities in the Third World. Princeton University Press.
Motesharrei, S., Rivas, J., & Kalnay, E. (2014). Human and nature dynamics (HANDY): Modeling inequality and use of resources in the collapse or sustainability of societies. Ecological Economics, 101, 90–102.
North, D. C., Wallis, J. J., & Weingast, B. R. (2009). Violence and social orders: A conceptual framework for interpreting recorded human history. Cambridge University Press.
Organisation for Economic Co-operation and Development. (n.d.). OECD Data. Retrieved May 10, 2026, from https://www.oecd.org/en/data.html
Our World in Data. (n.d.). Our World in Data. Retrieved May 10, 2026, from https://ourworldindata.org/
Rotberg, R. I. (Ed.). (2004). When states fail: Causes and consequences. Princeton University Press.
Scheffer, M. (2009). Critical transitions in nature and society. Princeton University Press.
Scheidel, W. (Ed.). (2015). State power in ancient China and Rome. Oxford University Press.
Scott, J. C. (2017). Against the grain: A deep history of the earliest states. Yale University Press.
Seshat: Global History Databank. (n.d.). Seshat Global History Databank. Retrieved May 10, 2026, from https://seshatdatabank.info/
Tainter, J. A. (1988). The collapse of complex societies. Cambridge University Press.
The World Bank. (n.d.). World Development Indicators. Retrieved May 10, 2026, from https://data.worldbank.org/
Toynbee, A. J. (1934–1961). A study of history (Vols. 1–12). Oxford University Press.
Toynbee, A. J. (1972). A study of history (Illustrated single-volume ed.). Oxford University Press.
Turchin, P. (2003). Historical dynamics: Why states rise and fall. Princeton University Press.
Turchin, P. (2016). Ultrasociety: How 10,000 years of war made humans the greatest cooperators on Earth. Beresta Books.
Turchin, P., & Nefedov, S. A. (2009). Secular cycles. Princeton University Press.
Uppsala Conflict Data Program. (n.d.). UCDP data downloads. Retrieved May 10, 2026, from https://ucdp.uu.se/downloads/
Varieties of Democracy Institute. (n.d.). Varieties of Democracy (V-Dem) dataset. Retrieved May 10, 2026, from https://www.v-dem.net/
Yoffee, N. (2005). Myths of the archaic state: Evolution of the earliest cities, states, and civilizations. Cambridge University Press.
Yoffee, N., & Cowgill, G. L. (Eds.). (1988). The collapse of ancient states and civilizations. University of Arizona Press.


