Scenario Methodology
How Convex computes and updates scenario probabilities. Full transparency on our quantitative framework, its parameters, and its known limitations.
Why this isn’t another static scenario PDF. Convex scenarios update every 6 hours, discover new scenarios autonomously across 6 data sources, trace causal chains to confirm signals, manage their own lifecycle (emergence → watchlist → active → resolution), and self-calibrate over time. An LLM and a Bayesian engine run in parallel and check each other.
Read the full explainer on the Scenario Engine page →
Probability Framework
Each scenario starts with a calibrated base rate probability derived from a specific historical reference class (e.g., “quarters with CPI >4% AND unemployment >5% since 1970”). This is not a guess — it’s the empirical frequency of similar conditions in the historical record, adjusted for current structural differences.
Probabilities are updated using Bayesian log-odds updating. New evidence shifts the probability’s log-odds in proportion to evidence strength, scaled by a calibration constant. This ensures updates are:
- Bounded between 5% and 95% (no false certainty)
- Symmetric in log-space (rising from 10% to 20% requires the same evidence strength as rising from 80% to 90%)
- Time-decaying toward the base rate when evidence is absent (half-life parameterised)
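The update rule described above can be sketched in a few lines of Python. The calibration constant and the 5%/95% bounds come from the Parameter Transparency table below; the function names are illustrative, not the production API:

```python
import math

P_MIN, P_MAX = 0.05, 0.95  # probability bounds from the framework
K = 0.30                   # calibration constant (Parameter Transparency table)

def logit(p: float) -> float:
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-x))

def update_probability(p: float, evidence_strength: float) -> float:
    """Shift log-odds by K * evidence_strength, then clamp to [5%, 95%]."""
    x = logit(p) + K * evidence_strength
    return min(P_MAX, max(P_MIN, inv_logit(x)))
```

Because the shift happens in log-odds space, moving from 10% to 20% needs exactly the same evidence strength as moving from 80% to 90% (the log-odds gap is identical), which is the symmetry property claimed above.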
Time decay: Between updates, the probability decays toward the calibrated base rate. The decay follows an exponential half-life — after one half-life period with no new evidence, the probability moves halfway back toward the base rate. This prevents stale probabilities from persisting.
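The half-life decay can be sketched as follows, assuming the 90-day half-life from the parameter table and that decay is applied in probability space (the text does not specify probability vs. log-odds space, so that is an assumption here):

```python
HALF_LIFE_DAYS = 90.0  # decay half-life (Parameter Transparency table)

def decay_toward_base_rate(p: float, base_rate: float,
                           days_since_update: float) -> float:
    """Exponential decay: after one half-life with no new evidence,
    the probability is exactly halfway back to the base rate."""
    weight = 0.5 ** (days_since_update / HALF_LIFE_DAYS)
    return base_rate + (p - base_rate) * weight
```

For example, a scenario at 60% with a 20% base rate sits at 40% after 90 quiet days, and at 30% after 180.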
Coherence enforcement: After individual scenario probabilities are computed, a simultaneous iterative pass enforces structural constraints between related scenarios. For example, mutually exclusive scenarios (Stagflation + Fed Pivot) cannot exceed their joint probability ceiling. The pass is order-independent: it produces the same result regardless of which scenario is processed first.
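One way to realise an order-independent pass is to compute every adjustment within an iteration from the same snapshot of the probabilities, so no pair sees another pair's in-flight changes. The proportional-scaling rule below is an illustrative assumption, not the production algorithm:

```python
def enforce_coherence(probs, exclusive_pairs, max_iter=50, tol=1e-9):
    """Iteratively scale mutually exclusive pairs whose joint probability
    exceeds their ceiling. All adjustments in one iteration are computed
    from the same snapshot, making the result order-independent.

    probs: {scenario_id: probability}
    exclusive_pairs: [(id_a, id_b, joint_ceiling), ...]
    """
    p = dict(probs)
    for _ in range(max_iter):
        snapshot = dict(p)
        worst_violation = 0.0
        for a, b, ceiling in exclusive_pairs:
            total = snapshot[a] + snapshot[b]
            if total > ceiling:
                scale = ceiling / total
                # min() keeps the tighter constraint when a scenario
                # appears in several pairs
                p[a] = min(p[a], snapshot[a] * scale)
                p[b] = min(p[b], snapshot[b] * scale)
                worst_violation = max(worst_violation, total - ceiling)
        if worst_violation < tol:
            break
    return p
```

With a single pair (Stagflation 50%, Fed Pivot 50%, ceiling 80%), both scenarios scale down proportionally to 40% each and the pass converges in one iteration.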
Evidence Model
Each scenario monitors 5 key indicators. Evidence strength is computed via z-scores on first-differences (changes, not levels) — this captures how unusual the recent movement is relative to the rolling 252-day volatility of changes. First-differences prevent spurious correlation from trending series.
Z-scores are weighted by per-metric importance weights and adjusted for cross-metric correlation using an effective degrees-of-freedom correction. If all 5 metrics are highly correlated, they collectively carry less information than 5 independent signals.
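A sketch of the weighted, correlation-adjusted evidence computation follows. The specific effective degrees-of-freedom formula used here, `n_eff = n / (1 + (n - 1) * avg_corr)`, is one common simplification and is an assumption, not taken from the production code:

```python
import math

def evidence_strength(changes_by_metric, weights, avg_corr, window=252):
    """Weighted z-scores on first-differences, shrunk by an effective
    degrees-of-freedom correction for cross-metric correlation.

    changes_by_metric: {name: first-differences, most recent last}
    weights: {name: importance weight}
    avg_corr: average pairwise correlation across the metrics (0..1)
    """
    zs = {}
    for name, diffs in changes_by_metric.items():
        recent = diffs[-window:]               # rolling window of changes
        mean = sum(recent) / len(recent)
        var = sum((d - mean) ** 2 for d in recent) / (len(recent) - 1)
        sd = math.sqrt(var)
        zs[name] = (recent[-1] - mean) / sd if sd > 0 else 0.0
    n = len(zs)
    # Effective number of independent signals: n when uncorrelated,
    # 1 when perfectly correlated (assumed simplification).
    n_eff = n / (1.0 + (n - 1) * avg_corr)
    weighted_z = sum(weights[m] * zs[m] for m in zs) / sum(weights.values())
    return weighted_z * math.sqrt(n_eff / n)
```

When all metrics are perfectly correlated, the correction shrinks the combined signal by a factor of √n relative to the uncorrelated case, matching the intuition that 5 correlated metrics carry less information than 5 independent ones.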
Regime-conditional weights: Some metrics have defined threshold levels where their economic interpretation changes (e.g., unemployment below 4% vs. above 6%). When the metric is in a specific regime, its weight is adjusted accordingly.
The evidence model requires at least 60 data points per metric for z-score computation. Metrics with fewer observations fall back to a threshold-proximity proxy.
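The threshold-proximity fallback is not specified in detail; one plausible form is a signed, normalised distance to the metric's regime threshold, clamped to a z-score-like range. Everything below (the formula, the scale parameter, the ±3 clamp) is a hypothetical illustration:

```python
def threshold_proximity(value: float, threshold: float, scale: float) -> float:
    """Hypothetical proxy for sparse metrics: signed distance to the
    regime threshold, normalised by a hand-set scale, clamped to [-3, 3]
    so it plugs into the same pipeline slot as a z-score."""
    z = (value - threshold) / scale
    return max(-3.0, min(3.0, z))
```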
Macro Regime Integration
The platform’s full macro data estate — 150+ indicators across rates, credit, commodities, currencies, and equities — feeds a regime classification that modulates scenario sensitivity.
Each scenario has a polarity (escalatory, benign, or neutral). During risk-off macro regimes, escalatory scenarios receive amplified evidence sensitivity while benign scenarios are dampened. This prevents the model from assigning high probability to “calm” scenarios during obvious stress periods.
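The regime modulation can be sketched as below. The multipliers are hypothetical placeholders, not the internal tuning values:

```python
RISK_OFF_AMPLIFY = 1.5  # hypothetical: boost escalatory scenarios
RISK_OFF_DAMPEN = 0.5   # hypothetical: dampen benign scenarios

def modulate_evidence(evidence: float, polarity: str, regime: str) -> float:
    """During risk-off regimes, amplify escalatory evidence and dampen
    benign evidence; neutral scenarios and other regimes pass through."""
    if regime != "risk_off":
        return evidence
    if polarity == "escalatory":
        return evidence * RISK_OFF_AMPLIFY
    if polarity == "benign":
        return evidence * RISK_OFF_DAMPEN
    return evidence  # neutral polarity is unchanged
```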
Update frequency: Scenario articles are generated every 6 hours via automated pipeline, with cadence adjusted by heat classification. CRITICAL scenarios may update daily; COLD scenarios may go weeks between updates. Manual editorial triggers are available when events warrant immediate analysis.
Scenario Lifecycle
Scenarios have a full lifecycle managed by four independent automated pipelines:
- Scenario Radar (every 6 hours): Scans 6 independent data source categories (FRED surprises, price moves, news clusters, CFTC positioning, composite indices, cross-source divergences) for emerging macro configurations. Regime-aware thresholds adjust for VIX levels.
- Article Generation (every 6 hours, offset +1h from radar): Runs the full 22-gate pipeline for active-tier scenarios. Internal heat-based cadence gates prevent over-publication.
- Lifecycle Evaluation (daily): Checks demotion criteria (probability below 5% for 3 consecutive cycles, or COLD heat for 30+ days), evaluates structured resolution conditions against live data, and cleans up stale watchlist entries and emergence signals.
- Coherence Audit (every 2 hours): Verifies probability coherence across mutually exclusive pairs, sum constraints, and conditional relationship consistency. Flags violations without auto-correcting.
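The demotion criteria checked by the Lifecycle Evaluation pipeline are mechanical enough to sketch directly. The sketch assumes one heat reading per day, which the text does not specify:

```python
def should_demote(prob_history, heat_history):
    """Demote when probability < 5% for 3 consecutive cycles, or when
    heat has been COLD for 30+ consecutive days (one reading per day
    assumed). Histories are ordered oldest-first."""
    low_prob = (len(prob_history) >= 3
                and all(p < 0.05 for p in prob_history[-3:]))
    cold_streak = 0
    for heat in reversed(heat_history):
        if heat != "COLD":
            break
        cold_streak += 1
    return low_prob or cold_streak >= 30
```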
Tiered tracking: Scenarios exist in either active tier (full computation, article generation) or watchlist tier (reduced monitoring every 48 hours, no article generation). The system maintains a maximum of 8 active + 5 watchlist scenarios to prevent analysis dilution.
Resolution conditions are structured and machine-checkable: each condition specifies a metric, comparison operator, threshold, and optional sustained-days requirement. The system checks these deterministically against live data — no LLM in the loop for resolution checking. When all conditions are met, the scenario is flagged for human review rather than auto-retired.
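Because each resolution condition is a structured tuple (metric, operator, threshold, optional sustained-days), the deterministic check can be sketched without any LLM involvement. The dict keys and helper names are illustrative:

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

def condition_met(series, op, threshold, sustained_days=1):
    """series: daily metric values, most recent last. The condition holds
    when the comparison is true for the last `sustained_days` observations."""
    if len(series) < sustained_days:
        return False
    cmp = OPS[op]
    return all(cmp(v, threshold) for v in series[-sustained_days:])

def scenario_resolved(conditions, data):
    """All structured conditions must hold. Per the framework, a fully
    met scenario is flagged for human review, not auto-retired."""
    return all(
        condition_met(data[c["metric"]], c["op"], c["threshold"],
                      c.get("sustained_days", 1))
        for c in conditions
    )
```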
Full technical documentation of the radar, cascade templates, emergence gates, and lifecycle transitions is available on the Scenario Engine methodology page.
Parameter Transparency
All tunable parameters are stored in a database configuration table, not hardcoded. Every change is logged with a justification. Current values:
| Parameter | Value | Last Changed |
|---|---|---|
| Calibration Constant | 0.30 | Initial |
| Half-Life (days) | 90 | Initial |
| Heat Weight: Evidence | 0.40 | Initial |
| Heat Weight: Threshold | 0.35 | Initial |
| Heat Weight: News | 0.25 | Initial |
| News Saturation Point | 10 | Initial |
Known Limitations
Every model has limitations. We document ours explicitly so readers can judge the confidence they place in our probability estimates.
| # | Limitation | Severity | Mitigation |
|---|---|---|---|
| L1 | Each scenario monitors only 5 indicators; many relevant signals not captured | HIGH | Macro regime backdrop provides ambient signal from full data estate; CFTC/prediction market enrichment adds market-implied context |
| L2 | Cross-metric correlation adjustment uses simplified effective-DoF, not full eigenvalue decomposition | MEDIUM | Defensible for 5-metric scenarios; full spectral approach deferred as marginal gain for current scale |
| L3 | Coherence enforcement is pairwise iterative, not joint-probability-space optimization | MEDIUM | Converges to order-independent solution; full joint optimization deferred for current scenario count |
| L4 | Model has limited ability to capture non-linear regime transitions (e.g., liquidity crises where correlations go to 1) | MEDIUM | Hamilton 2-state regime switching model detects regime transitions; DFM health monitors flag correlation breakdowns; human editorial oversight for extreme events |
| L5 | Position expected values assume binary outcomes, ignore path dependency, slippage, time decay | MEDIUM | Clearly labelled as directional heuristic; annualised return enables cross-horizon comparison |
| L6 | Base rates for Fiscal Dominance (n=3) and Trade War (n=8 years) have small reference class samples | HIGH | Documented uncertainty; market-implied comparison provides external anchor; quarterly review required |
| L7 | Model cannot anticipate genuinely novel events (black swans) with no historical reference class | HIGH | Inherent limitation of any model; editorial layer + manual trigger provides human backstop |
| L8 | Z-scores computed against currently-available FRED values, not original-release vintage | MEDIUM | ALFRED vintage API integration planned; does not affect live probability accuracy, only backtest fidelity |
| L9 | Conditional and reinforcing relationship types inform prompt only, not algebraically auto-adjusted | LOW | Conservative: avoids conditional probability algebra complexity; automated coupling deferred |
| L10 | Prediction market comparison available only for scenarios with explicit polymarketSlug mapping | LOW | Graceful degradation: marketImpliedProb = null when unavailable |
| L11 | Bilateral stress tensor coverage depends on RSS + GDELT data availability for each country pair | LOW | CAMEO channel scoring, velocity/acceleration alerts, and 90-day historical scaling are fully operational; enrichment gracefully skipped for pairs with insufficient data |
| L12 | Scenario Radar emergence detection relies on LLM novelty assessment (Gate 2) which may have inconsistent judgements across runs | MEDIUM | Jaccard similarity fallback on tags when LLM fails; temporal gate (signal must persist across multiple radar runs) filters transient noise; human review required before active promotion |
| L13 | Cascade templates are manually defined — novel transmission mechanisms not matching any template are not tracked | LOW | Templates cover the 7 most common macro transmission pathways; novel cascades still detected via multi-source signal clustering (Gate 1) even without template matching |
| L14 | Qualitative resolution conditions (e.g., "BOJ reaffirms ZIRP") cannot be automatically checked against data | LOW | Only structured quantitative conditions (metric + operator + threshold) are automated; qualitative conditions require manual resolution via editorial trigger |
Calibration History
When scenarios are resolved, we record the outcome against the assigned probability. Over time, this creates a calibration curve — if our model assigns 30% probability, roughly 30% of those scenarios should materialise. This section will populate as scenarios are resolved.
| Probability Bin | Total | Materialised | Hit Rate | Assessment |
|---|---|---|---|---|
| 0–20% | 0 | 0 | — | — |
| 20–40% | 0 | 0 | — | — |
| 40–60% | 0 | 0 | — | — |
| 60–80% | 0 | 0 | — | — |
| 80–100% | 0 | 0 | — | — |
No scenarios have been resolved yet; calibration data will appear here as outcomes are recorded.
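Once outcomes accumulate, the table above can be produced mechanically. A sketch, assuming half-open bins with the top bin closed at 100%:

```python
def calibration_table(resolved,
                      bins=((0.0, 0.2), (0.2, 0.4), (0.4, 0.6),
                            (0.6, 0.8), (0.8, 1.0))):
    """resolved: list of (assigned_probability, materialised: bool).
    Returns one (lo, hi, total, materialised, hit_rate) row per bin;
    hit_rate is None for empty bins."""
    rows = []
    for lo, hi in bins:
        in_bin = [m for p, m in resolved
                  if lo <= p < hi or (hi == 1.0 and p == 1.0)]
        total = len(in_bin)
        hits = sum(in_bin)
        rows.append((lo, hi, total, hits, hits / total if total else None))
    return rows
```

A well-calibrated model shows hit rates close to each bin's midpoint: of all scenarios assigned roughly 30%, about 30% should materialise.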
Expected Value Disclaimer
Expected value computations in scenario articles assume binary scenario resolution and ignore path dependency, execution slippage, and time decay. They are directional indicators, not P&L forecasts. Annualised returns enable cross-horizon comparison but do not account for compounding, margin requirements, or drawdown risk. Treat them as a framework for comparing relative opportunity across scenarios, not as investment targets.
This methodology is a living document. Parameters, limitations, and calibration data are updated as the model evolves. Last methodology review: April 2026. Questions or concerns about our framework can be directed to mail@convextrade.com.