How We Track Prediction Accuracy with Brier Scores
Every intelligence platform makes forecasts. We actually track whether ours come true — using the same calibration methods as professional forecasting tournaments.
The Accountability Gap
Most intelligence platforms make predictions constantly. "Oil will rise." "The Fed will cut rates." "Tensions will escalate." But almost none of them track whether those predictions came true.
This creates a perverse incentive: make bold predictions that attract attention, then quietly move on when they're wrong. The reader has no way to evaluate which sources are actually reliable.
How Brier Scoring Works
VORENTH uses Brier scores to measure prediction accuracy. The formula is simple:
Brier Score = (probability - outcome)²
Where outcome is 1 (correct), 0.5 (partial), or 0 (incorrect).
A perfect Brier score is 0.00 (you assigned 100% probability and it happened). The worst possible score is 1.00 (you assigned 100% probability and it didn't happen). A forecaster who always says 50% scores exactly 0.25 — so a track record consistently below 0.25 beats a coin flip.
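The scoring rule above fits in a few lines. This is a minimal illustration of the formula as stated, not VORENTH's actual implementation:

```python
def brier_score(probability: float, outcome: float) -> float:
    """Squared error between the assigned probability and the outcome
    (1 = correct, 0.5 = partial, 0 = incorrect)."""
    return (probability - outcome) ** 2

# A confident, correct forecast scores near zero...
print(round(brier_score(0.90, 1), 4))  # 0.01
# ...while the same confidence on a miss is heavily penalized.
print(round(brier_score(0.90, 0), 4))  # 0.81
# Always answering 50% scores 0.25 no matter what happens.
print(round(brier_score(0.50, 1), 4))  # 0.25
```

Note the asymmetry: being confidently wrong (0.81) costs far more than the confidently right case saves (0.01), which is exactly why the score punishes overconfidence.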
What Makes Our System Different
1. Forecast vs. Signal Separation
VORENTH distinguishes between scored forecasts (discrete, falsifiable claims tracked against outcomes) and analytical signals (directional intelligence that informs but isn't scored). Only forecasts appear in the accuracy record — this prevents vague directional calls from polluting calibration data.
2. Automatic Extraction
Every intelligence report generates 1-3 high-quality forecasts and 1-4 analytical signals. Each forecast must be specific, time-bounded, and independently verifiable — no vague claims like "markets will react."
3. Market-Calibrated Probabilities
Forecast probabilities are anchored against live prediction market data (Polymarket). If the market says 60% and VORENTH says 75%, the system requires specific evidence to justify the deviation. This prevents overconfidence and probability clustering.
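One way to picture this anchoring rule: unless specific evidence is on record, the stated probability is clamped to a band around the market price. The function name, the band width, and the evidence flag below are all hypothetical — a sketch of the idea, not the production logic:

```python
def anchor_to_market(model_prob: float, market_prob: float,
                     max_deviation: float = 0.10,
                     has_justifying_evidence: bool = False) -> float:
    """Clamp a probability to within max_deviation of the market price,
    unless specific evidence justifies the deviation (hypothetical rule)."""
    if has_justifying_evidence:
        return model_prob
    low = market_prob - max_deviation
    high = market_prob + max_deviation
    return min(max(model_prob, low), high)

# Market at 60%, analyst at 75%, no evidence on file: pulled back to 70%.
print(round(anchor_to_market(0.75, 0.60), 2))  # 0.7
# With documented evidence, the deviation stands.
print(anchor_to_market(0.75, 0.60, has_justifying_evidence=True))  # 0.75
```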
4. Category-Aware Confidence
The system adjusts confidence based on where news-based analysis has genuine predictive edge. Geopolitical and policy predictions get stronger conviction. Market price predictions are treated more conservatively — because quant models generally beat narrative analysis on specific price targets.
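A simple way to implement category-aware confidence is to shrink raw probabilities toward 50%, harder in categories where narrative analysis has less edge. The category names and shrinkage factors here are illustrative assumptions, not VORENTH's real parameters:

```python
# Hypothetical shrinkage factors: higher = less predictive edge,
# so the probability is pulled harder toward the 50% coin-flip baseline.
CATEGORY_SHRINKAGE = {
    "geopolitics": 0.1,
    "policy": 0.2,
    "market_price": 0.5,
}

def adjust_for_category(prob: float, category: str) -> float:
    """Shrink a raw probability toward 0.5 by the category's factor."""
    w = CATEGORY_SHRINKAGE.get(category, 0.3)  # default for unknown categories
    return 0.5 + (prob - 0.5) * (1 - w)

# The same 80% raw conviction lands differently by category:
print(round(adjust_for_category(0.80, "geopolitics"), 2))   # 0.77
print(round(adjust_for_category(0.80, "market_price"), 2))  # 0.65
```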
5. Evidence-Based Resolution
When a forecast reaches its target date, our system fetches real market data from Polygon and news from GDELT, then uses AI to adjudicate the outcome based solely on evidence — not general knowledge.
6. Calibration Feedback Loop
Historical accuracy data feeds back into the system. If we've been overconfident in a category (predicting 70% when outcomes occur 50% of the time), future probabilities are automatically adjusted downward using base rate anchoring.
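Base rate anchoring of this kind can be sketched as a weighted blend between the stated probability and the historically observed hit rate in that category. The blend weight below is an assumption for illustration:

```python
def recalibrate(stated_prob: float, observed_rate: float,
                weight: float = 0.5) -> float:
    """Blend a stated probability with the category's historical hit rate.
    weight = 0 trusts the forecast fully; weight = 1 trusts only history."""
    return (1 - weight) * stated_prob + weight * observed_rate

# Predicting 70% in a category where such calls resolve true only 50%
# of the time pulls the next forecast down toward that base rate.
print(round(recalibrate(0.70, 0.50), 2))  # 0.6
```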
7. Public Track Record
Our track record page shows aggregate forecast accuracy — calibration curves, Brier scores by category, and individual resolved forecasts. Full transparency.
Why This Matters
When you're making decisions based on intelligence analysis — whether it's portfolio allocation, policy recommendations, or risk assessment — you need to know how reliable the source is.
A system that tracks its own accuracy and self-corrects isn't just more honest. Over time, it becomes genuinely better at forecasting.
Ready to see it in action?
3 free intelligence queries per day. No credit card required.
Start analyzing