Each exercise is scored 1–5 across 8 framework dimensions plus 4 implementation metrics. Scores are derived from published AARs (after-action reports), peer-reviewed literature, declassified government reports, and FOI documents.
Homeland Security Exercise and Evaluation Program. Seven-type taxonomy: Seminar → Workshop → TTX (tabletop exercise) → Game → Drill → Functional Exercise → Full-Scale Exercise. Progressive complexity scale maps to operational realism. Score: 1 = Seminar/Workshop, 2 = TTX, 3 = Game/War Game, 4 = Functional Exercise, 5 = Full-Scale Exercise.
FEMA (2020). Homeland Security Exercise and Evaluation Program Doctrine. Rev 2-2-25. fema.gov/emergency-managers/national-preparedness/exercises/hseep
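The HSEEP type-to-score mapping above is a straight lookup. A minimal sketch (illustrative only, not part of the FEMA doctrine; note the rubric above assigns no score to Drill, so it is deliberately omitted here):

```python
# Mapping of HSEEP exercise types to the 1-5 operational-realism score
# defined in the rubric above. "Drill" has no score in that rubric and
# is therefore left out rather than guessed at.
HSEEP_SCORES = {
    "Seminar": 1,
    "Workshop": 1,
    "TTX": 2,
    "Game": 3,
    "War Game": 3,
    "Functional Exercise": 4,
    "Full-Scale Exercise": 5,
}

def hseep_score(exercise_type: str) -> int:
    """Return the 1-5 score for a named HSEEP exercise type."""
    return HSEEP_SCORES[exercise_type]
```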
WHO Simulation Exercise Manual — 7-component lifecycle: Selection, Planning, Scenario Development, Pandemic Description, Evaluation Planning, Staging, Post-Exercise Actions. Score: number of the seven WHO lifecycle components fully evidenced, mapped to a 1–5 scale (5 = all 7 components documented, 4 = 5–6, 3 = 3–4, 2 = 1–2, 1 = none).
WHO (2017). Simulation Exercise Manual. WHO-WHE-CPI-2017.10. who.int/publications/i/item/WHO-WHE-CPI-2017.10 • Reddin et al. (2021). PMC8020603.
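The component-count-to-score conversion above can be sketched as a simple banding function (an illustrative sketch; the function name is an assumption, not WHO terminology):

```python
def who_component_score(n_components: int) -> int:
    """Map the number of WHO SimEx lifecycle components fully evidenced
    (0-7) to the 1-5 scale in the rubric above:
    7 -> 5, 5-6 -> 4, 3-4 -> 3, 1-2 -> 2, 0 -> 1."""
    if n_components >= 7:
        return 5
    if n_components >= 5:
        return 4
    if n_components >= 3:
        return 3
    if n_components >= 1:
        return 2
    return 1
```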
Global Health Security Index — 6 categories: Prevention, Detection & Reporting, Rapid Response, Health System, Compliance with International Norms, Risk Environment. 37 indicators, 96 sub-indicators, 171 questions. Score: number of GHS categories meaningfully tested (1–5 scale, 5 = all 6 categories).
NTI/JHU/EIU (2021). GHS Index Methodology. ghsindex.org/wp-content/uploads/2021/11/2021_GHSindex_Methodology_FINAL.pdf
NATO Wargaming Handbook — 3-type taxonomy (Educational, Experiential, Analytical) with 5-phase lifecycle (Initiate, Design, Develop, Execute, Analyse). Includes deductive/inductive/abductive analysis modes. Score: 1 = Educational only, 2 = Experiential, 3 = Analytical (single method), 4 = Analytical (mixed methods), 5 = Full analytical with multi-phase lifecycle.
NATO ACT (2023). Wargaming Handbook. act.nato.int • NATO WIN24. act.nato.int/wp-content/uploads/2024/08/WIN24_Booklet.pdf
UK MOD Wargaming Handbook — Classifies: Seminar, COA Wargame, Matrix Game, Kriegsspiel, Business Wargame. 5-step lifecycle linked to Defence Cycle of Research. Score: 1 = Seminar-level, 2 = COA, 3 = Matrix game, 4 = Kriegsspiel/complex, 5 = Full defence research cycle integrated.
UK MOD DCDC (2017). Wargaming Handbook. assets.publishing.service.gov.uk • professionalwargaming.co.uk
UK MOD Red Teaming Handbook — Structured adversarial thinking: Alternative Analysis, Pre-Mortem, Devil's Advocacy, Team A/B. Score: 1 = No adversarial element, 2 = Basic contrarian role, 3 = Structured red cell, 4 = Multi-method red teaming, 5 = Independent red team with full alternative analysis.
UK MOD DCDC (2021). Red Teaming Handbook, 3rd Ed. gov.uk/government/publications/a-guide-to-red-teaming
Shell/GBN Intuitive Logics — 8-step process: Focal Issue → Key Forces → Driving Forces → Rank by Importance/Uncertainty → Scenario Logics (2×2) → Flesh Out Scenarios → Implications → Early Indicators. Score: number of IL steps evidenced (1–5 scale).
Schwartz, P. (1991). The Art of the Long View. Currency/Doubleday. • Bradfield, R. et al. (2005). Calif. Mgmt. Rev. 48(1). • Shell (2013). Scenarios 40-Year Report.
Five Methodological Machineries: Representation, Consequential Decision-Making, Adjudication, Immersion, Bespoke Design. Academic framework assessing epistemological quality of wargames as knowledge-production instruments. Score: number of machineries robustly present (1–5).
Shappell (2024). The Methodological Machinery of Wargaming. International Studies Review 26(1). doi:10.1093/isr/viae002
Derived from After-Action Reports, government inquiries, FOI documents, and retrospective COVID-19 validation.
Degree to which exercise findings, recommendations, and AARs are publicly available. 1 = Classified/no disclosure, 2 = Partial leak/FOI, 3 = Summary published, 4 = Full AAR published, 5 = Full AAR + data + methodology published.
Reddin et al. (2021). Evaluating simulations as preparation for health crises. BMC Public Health. PMC8020603.
Percentage of exercise recommendations implemented before the next relevant real-world event. 1 = <10% implemented, 2 = 10–25%, 3 = 25–50%, 4 = 50–75%, 5 = >75% implemented.
UK COVID-19 Inquiry (2024). Module 1 Report. • US GAO Reports on pandemic preparedness gaps.
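The percentage bands above can be expressed as a banding function. This is an illustrative sketch only; the rubric does not say which band a boundary value (10, 25, 50) falls into, so lower-inclusive boundaries are an assumption here (75 itself goes to band 4, since band 5 is strictly >75%):

```python
def implementation_score(pct_implemented: float) -> int:
    """Map the percentage of exercise recommendations implemented (0-100)
    to the 1-5 scale above: <10 -> 1, 10-25 -> 2, 25-50 -> 3,
    50-75 -> 4, >75 -> 5.
    Assumption: boundary values 10, 25, 50 resolve to the higher band;
    the rubric leaves tie-breaking unspecified."""
    if pct_implemented > 75:
        return 5
    if pct_implemented >= 50:
        return 4
    if pct_implemented >= 25:
        return 3
    if pct_implemented >= 10:
        return 2
    return 1
```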
How accurately the exercise scenario predicted real-world outcomes (validated against COVID-19 and other events). 1 = No predictive value, 2 = Marginal, 3 = Moderate overlap, 4 = Strong prediction, 5 = Remarkably prescient.
Johns Hopkins CHS (2019). Event 201 Recommendations vs COVID-19 Reality. • UK Cygnus findings vs COVID-19 outcomes.
Speed of converting findings to policy/operational changes. 1 = Never implemented, 2 = >5 years, 3 = 2–5 years, 4 = 6–24 months, 5 = <6 months. Scored inversely: faster = higher.
FEMA (2020). HSEEP Improvement Planning Guide. • WHO (2017). Post-Exercise Action tracking.
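The inverse time-to-policy scale above can be sketched the same way (illustrative only; representing "never implemented" as `None` and resolving the ambiguous 24-month boundary to the higher score are assumptions, not part of the rubric):

```python
from typing import Optional

def time_to_policy_score(months: Optional[float]) -> int:
    """Map months from exercise findings to policy/operational change
    onto the inverse 1-5 scale above (faster = higher):
    never -> 1, >5 years -> 2, 2-5 years -> 3, 6-24 months -> 4,
    <6 months -> 5.
    Assumptions: None encodes "never implemented"; exactly 24 months
    scores 4 (the rubric's 6-24 month and 2-5 year bands overlap at
    the boundary)."""
    if months is None:
        return 1
    if months < 6:
        return 5
    if months <= 24:
        return 4
    if months <= 60:
        return 3
    return 2
```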