CONVEX
Methodology Paper 04 / v1.0

NVI: Convex Narrative Velocity Index

A three-channel composite measuring the acceleration of financial narratives across 46 editorially diverse RSS feeds, published on a 0–100 scale with convergence-state flags and explicit warm-up handling.

Last reviewed: · Version 1.0 · Formulas on this page are authoritative: any live NVI reading on the Convex site is computed from the definitions below.

1. Abstract

The NVI is a 0–100 composite indicator of narrative acceleration. It synthesizes three independent channels: (1) term-frequency velocity, the 7-day rate of change in normalised mention counts across a curated vocabulary of macro terms; (2) sentiment-divergence convergence, a measure of agreement across seven editorial categories covering the same topics on the same day; and (3) source-primacy authority, a rolling measure of which outlets originate narratives that subsequently propagate. The composite weights the three channels 40%, 35%, and 25%, respectively, and publishes with a 14-day warm-up during which readings are flagged preliminary.

The design premise is that narrative acceleration contains information about short-horizon price action that is not present in the price series itself. Text-based indices in the existing literature typically measure sentiment level; NVI measures the derivative of attention plus the cross-source agreement structure, and distinguishes a single outlet breaking a story from a narrative propagating across ideologically distant outlets. The latter configuration historically coincides with material price action in the one-to-five trading-day window. NVI is intended as a real-time monitoring gauge and a feature for downstream models, not as a calibrated probability of any specific market outcome.

2. Motivation

Shiller (2019) argues that economic fluctuations are driven not only by fundamentals but by the spread of narratives, and that the velocity of narrative propagation is itself an economic variable. Tetlock (2007) demonstrated that pessimistic content in Wall Street Journal columns predicts short-horizon equity returns and reversal. Baker, Bloom, and Davis (2016) constructed the Economic Policy Uncertainty index from a fixed keyword rule applied to ten newspapers, and showed that EPU leads investment and employment. The research-agenda thread running through these papers is that unstructured text carries pricing-relevant information that structured data misses.

Most operational implementations of this thread, however, measure level: how pessimistic is today’s coverage, how high is today’s EPU value, how bearish is today’s Twitter sample. Level is a useful signal but it is intrinsically lagged, because the media catches up to events rather than the reverse. The useful quantity for a trader is acceleration: how fast is the coverage of topic X growing, and how fast is the distribution of opinion across outlets collapsing toward consensus.

The NVI is built around this distinction. It measures three derivatives simultaneously: the rate of change in attention (channel 1), the rate of collapse in cross-source disagreement (channel 2), and the structural shift in which outlets lead versus follow (channel 3). A high NVI reading with converged bearish sentiment and high authority dispersion means many outlets across ideologies have independently arrived at the same negative story, and this configuration is far more predictive of near-term volatility than any single outlet’s pessimism.

3. Channel definitions

3.1 Term-frequency velocity (40% weight)

The channel tracks a curated vocabulary of ~80 macro-relevant terms across a 7-day rolling window. For each term, the system computes normalised mention counts per day (mentions divided by total articles that day, to control for corpus-volume swings around holidays and weekends) and then the 7-day rate of change in that normalised count.
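As a concrete sketch of the normalisation and velocity step (function names and the exact rate-of-change convention are illustrative assumptions, not the production definitions):

```python
def normalised_counts(mentions, totals):
    """Per-day normalised mention count: term mentions divided by total
    articles published that day (guards against zero-article days)."""
    return [m / t if t else 0.0 for m, t in zip(mentions, totals)]

def velocity_7d(norm):
    """7-day rate of change of the normalised count, taken here as the
    percent change versus the value 7 days earlier (one plausible
    convention; the paper does not pin down the exact formula)."""
    if len(norm) < 8 or norm[-8] == 0:
        return 0.0
    return (norm[-1] - norm[-8]) / norm[-8]
```

Flat coverage yields a velocity of 0; a doubling of a term's mention share over the window yields +100%.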

The vocabulary is split into four semantic buckets: macro policy (e.g. “rate cut”, “quantitative tightening”, “dot plot”), risk (“recession”, “credit crunch”, “liquidity”), markets (“rotation”, “drawdown”, “melt-up”), and crypto (“stablecoin”, “liquidation cascade”, “hash rate”). Bucket-level acceleration is computed as the mean of term-level accelerations within the bucket, clipped at the 95th percentile to prevent a single viral term from dominating.

The channel score is the maximum of the four bucket-level accelerations, scaled to a 0–20-point range. A bucket acceleration of 0% (flat coverage) maps to 0 points; an acceleration of +60% (a historically strong spike) maps to 20 points. This maximum design means a breakout in any single domain (say, a sudden spike in “liquidity” coverage) can move the composite even if the other three buckets remain quiet.

bucket_accel(b) = mean_term_in_b( clip( velocity_term, p95 ) )
channel_1_raw = max_b( bucket_accel(b) )
channel_1_score = clamp( channel_1_raw / 0.60 × 20, 0, 20 )
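The bucket aggregation and scaling can be sketched in pure Python (the 95th-percentile clip value is passed in as a precomputed constant; names are illustrative):

```python
def bucket_accel(term_velocities, p95):
    """Mean of term-level accelerations within a bucket, each clipped at
    the bucket's 95th-percentile value so a single viral term cannot
    dominate."""
    clipped = [min(v, p95) for v in term_velocities]
    return sum(clipped) / len(clipped)

def channel_1_score(bucket_accels):
    """Max of the four bucket accelerations, scaled so +60% maps to the
    full 20 points, then clamped to [0, 20]."""
    raw = max(bucket_accels)
    return min(max(raw / 0.60 * 20.0, 0.0), 20.0)
```

The max-over-buckets design means `channel_1_score([0.0, 0.3, 0.1, 0.05])` returns 10.0: a single bucket at half the historical-spike benchmark moves the channel to half scale regardless of the quiet buckets.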

3.2 Sentiment-divergence convergence (35% weight)

This channel measures whether ideologically diverse outlets are arriving at the same outlook. The 46 RSS feeds are hand-classified into seven categories: establishment financial press, contrarian outlets, wire services, government and central-bank communications, academic and think-tank commentary, crypto-native outlets, and editorially neutral aggregators. For each day, sentiment is scored on a 0–1 bullish scale per category, using the same classifier pass that already runs on each article during ingest (piggy-backing on existing evaluation infrastructure rather than adding new API calls).

Convergence is the inverse of cross-category variance. Let s_c be the sentiment of category c on day t. The raw convergence metric is 1 − var(s_c), rescaled so that full cross-category agreement (variance 0) maps to 20 points and maximum disagreement (variance 0.25, the theoretical ceiling for a 0–1 variable with balanced bullish and bearish categories) maps to 0 points.

A separate flag distinguishes three convergence states: Converged Bullish (all categories > 0.6 sentiment), Converged Bearish (all categories < 0.4), and Diverged (cross-category variance > 0.18). The flag does not enter the numerical composite; it is surfaced alongside the NVI reading so a user can distinguish a high NVI driven by unanimous agreement from a high NVI driven by rapid but contested narrative churn.

convergence = 1 − var( s_c ) / 0.25 where c in {7 categories}
channel_2_score = clamp( convergence × 20, 0, 20 )
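A minimal sketch of the convergence score and the convergence-state flag, applying the rescaling described in the prose (variance 0 maps to 20 points, variance 0.25 to 0 points); function and variable names are illustrative:

```python
from statistics import pvariance

def channel_2_score(cat_sentiments):
    """Convergence score from the seven category-level sentiments (each
    on a 0-1 bullish scale): population variance, rescaled so var=0
    gives 20 points and var=0.25 gives 0 points."""
    convergence = 1.0 - pvariance(cat_sentiments) / 0.25
    return min(max(convergence * 20.0, 0.0), 20.0)

def convergence_flag(cat_sentiments):
    """Convergence state surfaced alongside the reading; returns None
    when no state's condition is met."""
    if all(s > 0.6 for s in cat_sentiments):
        return "Converged Bullish"
    if all(s < 0.4 for s in cat_sentiments):
        return "Converged Bearish"
    if pvariance(cat_sentiments) > 0.18:
        return "Diverged"
    return None
```

Note the flag and the score are deliberately decoupled: seven categories pinned at 0.5 score a full 20 convergence points yet carry no Converged flag.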

3.3 Source-primacy authority (25% weight)

The third channel tracks which outlets break stories first and how widely those stories propagate. For each term that accelerates past a configurable threshold on day t, the system identifies the first outlet to mention the term in the accelerating context (the “primacy outlet”). A rolling 30-day window tracks each outlet’s share of primacy, weighted by the subsequent propagation depth (number of distinct outlets that picked up the story within 48 hours).

The channel score rises when three conditions hold simultaneously: (a) primacy is concentrated in wire services rather than opinion outlets (signals a fast-moving factual story rather than a narrative frame), (b) propagation is rapid (within-48-hour pickup by at least five distinct outlets), and (c) at least three of the seven editorial categories carry the story. This triple condition discriminates between genuine macro surprises, which produce rapid cross-category propagation, and editorial takes, which tend to stay within a single category even when widely shared within it.

The channel's raw signal is the number of stories meeting the triple condition on day t. Zero stories map to 0 points; five or more map to 20 points. In practice, a score above 15 is rare outside of policy days (FOMC, CPI, major geopolitical breaks).

eligible_story = ( wire_primacy ∧ propagation_48h ≥ 5 ∧ category_spread ≥ 3 )
channel_3_raw = count_stories( eligible_story, day = t )
channel_3_score = clamp( channel_3_raw / 5 × 20, 0, 20 )
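The triple condition and the story count can be sketched as follows (the story-record field names are illustrative assumptions about how primacy and propagation metadata are stored):

```python
def channel_3_score(stories):
    """Count day-t stories satisfying the triple condition (wire-service
    primacy, >= 5 outlets within 48h, >= 3 editorial categories), scaled
    so five or more eligible stories map to the full 20 points."""
    eligible = sum(
        1 for s in stories
        if s["wire_primacy"] and s["pickup_48h"] >= 5 and s["category_spread"] >= 3
    )
    return min(eligible / 5.0 * 20.0, 20.0)
```

A widely shared opinion piece with deep pickup but single-category spread contributes nothing, which is exactly the discrimination the channel is designed for.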

4. Composite formula and normalisation

The composite is a weighted sum of the three channel scores, rescaled to a 0–100 range:

NVI = ( 0.40 × channel_1_score + 0.35 × channel_2_score + 0.25 × channel_3_score ) / 20 × 100
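The composite is a direct transcription of the formula above; since each channel score lives on a 0–20 scale, dividing the weighted sum by 20 and multiplying by 100 yields the published 0–100 reading:

```python
def nvi(ch1, ch2, ch3):
    """Composite NVI: weighted sum of the three 0-20 channel scores
    (40% / 35% / 25%), rescaled to a 0-100 range."""
    return (0.40 * ch1 + 0.35 * ch2 + 0.25 * ch3) / 20.0 * 100.0
```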

The weights reflect the relative empirical informativeness of the three channels as measured on a held-out 2023–2024 validation window. Term velocity carries the largest weight because it is the most responsive to new information and has the richest sub-structure (80+ terms across four semantic buckets). Sentiment convergence carries the middle weight because it is slower moving but provides orthogonal information about cross-source agreement. Source primacy carries the smallest weight because it is the noisiest and most easily gamed by a single outlet’s publishing cadence.

The NVI is recomputed every 30 minutes during US market hours and every 2 hours overnight, driven by the RSS ingest schedule. Readings during the 14-day warm-up are flagged preliminary and suppressed from public display because the rolling windows underlying the three channels require history to fill: at least 7 days for channel 1, 14 days for channel 2's variance estimate, and 30 days for channel 3's primacy window.

5. Interpretation thresholds

Range     Label          Interpretation
0–20      Quiet          Very low narrative acceleration. Trend-continuation regimes.
20–40     Stable         Normal background level of narrative change. Most trading days sit here.
40–60     Active         One or two channels elevated. Specific terms gaining traction.
60–80     Accelerating   Multiple channels hot. Narrative shift often precedes market moves within days.
80–100    Surging        Cross-channel extreme. Historically coincident with major regime shifts.

These thresholds are descriptive labels drawn from the empirical distribution of NVI readings during the 2023–2025 sample, not calibrated probabilities. A reading of 70 should be read as “narrative acceleration is in the top decile of recent history,” not as a probability of a specific market outcome. The convergence flag (Converged Bullish / Converged Bearish / Diverged) materially changes the interpretation of any given NVI level; readers should consult both.
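The band labels above reduce to a simple lookup; the boundary convention sketched here (lower bound inclusive, upper bound exclusive, except the top band) is an illustrative assumption, since the table does not specify how exact boundary values are assigned:

```python
def nvi_label(reading):
    """Map a 0-100 NVI reading to its descriptive band. Boundary values
    are assigned to the higher band by assumption."""
    if reading < 20:
        return "Quiet"
    if reading < 40:
        return "Stable"
    if reading < 60:
        return "Active"
    if reading < 80:
        return "Accelerating"
    return "Surging"
```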

6. Known limitations

Corpus selection bias. RSS coverage is not universal. Outlets that do not publish via RSS (some paywalled analysis, social-first outlets, closed Telegram and Substack channels) are absent from the corpus even when they are influential in the narrative propagation network. The feed-selection decision prioritises editorial diversity over share-of-voice, which means the NVI may miss a surge that is concentrated entirely in non-RSS channels. Feed inclusion criteria are documented in the NVI code; sensitivity to feed inclusion is tested by removing categories one at a time and recomputing the 2024 readings.

Vocabulary bias. The ~80-term macro vocabulary reflects the editorial priors of its authors. A new narrative built on vocabulary that is not in the dictionary will not accelerate channel 1 at all. The current vocabulary was last reviewed in March 2026 and is scheduled for quarterly review; an expansion candidate list is tracked in the project repository.

Sentiment-classifier error. Channel 2 depends on the accuracy of the sentiment classifier applied to individual articles. The classifier is calibrated on a hand-labelled set of 1200 macro articles with an out-of-sample accuracy of 0.78 on three-way (bullish / neutral / bearish) classification. The confusion matrix over-counts neutral-as-bearish in crypto-native outlets because those outlets use edgier language, a failure mode that can depress channel 2 convergence during crypto-heavy news weeks. A category-conditional classifier is on the roadmap.

Primacy attribution is noisy. Channel 3 attributes origination based on publish timestamps, which are unreliable because outlets backdate and update articles. A two-hour grace window is applied, but close-call primacy attributions still produce noise. The channel’s 25% weight reflects this known noise.

Link-rot and archival gaps. RSS feeds are not archival. Articles are retracted, URLs change, feeds are reconfigured. Convex snapshots each article at ingest with a hash and stores the extracted text independently of the source URL, but historical NVI readings cannot be reconstructed perfectly from today’s live feeds alone.

Not a causal explanation. NVI is a predictive feature. Correlation between NVI spikes and subsequent realised volatility is a phenomenon; the paper makes no claim about whether the narrative shift causes the price move or both respond to an underlying event. Empirical validation reports forecast skill, not causal identification.

7. Data sources and update cadence

  • Corpus: 46 RSS feeds across seven editorial categories; complete feed list maintained in the project repository
  • Vocabulary: ~80 macro terms across four semantic buckets; reviewed quarterly
  • Sentiment classifier: Piggybacks on existing article-evaluation pass; no additional API cost
  • Update cadence: every 30 minutes during US market hours, every 2 hours overnight
  • Warm-up requirement: 14 days (readings flagged preliminary during warm-up)

The channel-level breakdown and convergence-state flag are published alongside the composite at /indicators/nvi.

8. References

  1. Shiller, R. J. (2019). Narrative Economics: How Stories Go Viral and Drive Major Economic Events. Princeton University Press.
  2. Baker, S. R., Bloom, N., and Davis, S. J. (2016). “Measuring Economic Policy Uncertainty.” Quarterly Journal of Economics, 131(4), 1593–1636.
  3. Tetlock, P. C. (2007). “Giving Content to Investor Sentiment: The Role of Media in the Stock Market.” Journal of Finance, 62(3), 1139–1168.
  4. Ke, Z. T., Kelly, B. T., and Xiu, D. (2019). “Predicting Returns with Text Data.” University of Chicago Working Paper.
  5. Loughran, T. and McDonald, B. (2011). “When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks.” Journal of Finance, 66(1), 35–65.
  6. Caldara, D. and Iacoviello, M. (2022). “Measuring Geopolitical Risk.” American Economic Review, 112(4), 1194–1225.