Accuracy & Methodology
We believe in transparency over impressive-sounding numbers. Here's exactly how we measure accuracy, resolve questions, and ensure accountability.
What Does "Well-Calibrated" Mean?
A well-calibrated forecaster's predictions match reality. When they say 70%, events happen about 70% of the time.
Sample Calibration Chart
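The diagonal line shows perfect calibration. On the usual layout (forecast probability on the horizontal axis, observed frequency on the vertical axis), points below the line mean events happened less often than forecast, so the probabilities were too high; points above mean events happened more often than forecast, so the probabilities were too low.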
How Questions Get Resolved
Every resolution follows a transparent, auditable process.
How We Measure Accuracy
Our commitment to honest, verifiable accuracy metrics.
Transparency Note
Tier9 is a new platform, and we don't yet have enough resolved questions to publish statistically meaningful accuracy metrics. The sections below explain how we will measure and report accuracy as the platform matures.
Brier Score
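The gold standard for probabilistic accuracy: the mean squared difference between forecast probabilities and outcomes (1 if the event happened, 0 if it did not). For yes/no questions it ranges from 0 (perfect) to 1 (worst), and always forecasting 50% scores 0.25. We'll publish platform-wide and per-forecaster Brier scores once we have sufficient resolved questions.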
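To make the Brier score concrete, here is a minimal Python sketch. It assumes yes/no questions with outcomes coded as 1 (happened) or 0 (did not happen); the function name `brier_score` is illustrative and not part of any Tier9 API.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if len(forecasts) == 0:
        raise ValueError("at least one resolved forecast is required")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


# A constant 50% forecast scores 0.25 no matter how the questions resolve.
print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 1]))  # 0.25

# Confident forecasts that turn out right score much closer to 0.
print(brier_score([0.9, 0.1, 0.2, 0.8], [1, 0, 0, 1]))
```

Lower is better: a forecaster only beats the 0.25 baseline by moving away from 50% in the right direction.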
Calibration
A well-calibrated forecaster's 70% predictions come true 70% of the time. We'll publish calibration curves showing how our platform's confidence levels match actual outcomes.
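A calibration curve can be built by grouping forecasts into probability bins and comparing each bin's average forecast with how often those events actually happened. The sketch below assumes equal-width bins and 0/1 outcomes; the function name `calibration_curve` and the choice of ten bins are illustrative assumptions, not a description of our production pipeline.

```python
from typing import List, Sequence, Tuple


def calibration_curve(
    forecasts: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(forecasts, outcomes):
        if not 0.0 <= p <= 1.0:
            raise ValueError("forecast probabilities must lie in [0, 1]")
        # Equal-width bins; a forecast of exactly 1.0 falls in the top bin.
        i = min(int(p * n_bins), n_bins - 1)
        sums[i] += p
        hits[i] += o
        counts[i] += 1
    return [
        (sums[i] / counts[i], hits[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Four 25% forecasts (one came true) and four 75% forecasts (three came true).
forecasts = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
outcomes = [0, 0, 1, 0, 1, 1, 1, 0]
for mean_forecast, observed, count in calibration_curve(forecasts, outcomes):
    print(mean_forecast, observed, count)
```

Each returned pair of mean forecast and observed frequency is one point on the chart; for a perfectly calibrated forecaster the two values match in every bin. Bins with few forecasts are noisy, which is why the count is returned alongside them.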
Domain Breakdown
Accuracy varies by domain. We'll break down performance across climate, health, geopolitics, economy, science, and society to show where our forecaster community excels.
Our Accuracy Methodology
Resolution Process
- Every question has pre-defined resolution criteria
- Resolution is based on authoritative data sources (NOAA, CDC, WHO, etc.)
- All resolutions are auditable and timestamped
Scoring Rules
- Brier scoring incentivizes honest probability estimates
- No gaming: Brier is a proper scoring rule, so reporting your true belief gives the best expected score
- Historical forecasts are immutable once submitted
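Why honest reporting is optimal under Brier scoring can be shown in a few lines. The sketch below covers a single yes/no question; the symbols q (the probability you actually believe) and p (the probability you report) are introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}[\text{score}](p) &= q\,(1-p)^2 + (1-q)\,p^2 \\
\frac{d}{dp}\,\mathbb{E}[\text{score}](p) &= -2q\,(1-p) + 2(1-q)\,p = 2\,(p - q) \\
\frac{d^2}{dp^2}\,\mathbb{E}[\text{score}](p) &= 2 > 0
\end{align*}
```

The derivative is zero only at p = q and the second derivative is positive, so the expected Brier score (where lower is better) is smallest when the reported probability equals the true belief. Shading a forecast in either direction can only make the expected score worse.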
Objective Resolution Process
Every question has pre-defined resolution criteria and authoritative data sources. No subjective judgment, no moving goalposts.
Pre-defined Criteria
Every question is created with explicit, unambiguous resolution criteria before any forecasts are submitted.
"Will resolve YES if WHO declares pandemic over before Dec 31, 2025"
Authoritative Source
Each question specifies the exact data source that will determine the outcome. No interpretation required.
Data source: WHO official press releases at who.int
Scheduled Resolution
Resolution dates are locked at question creation. No extensions or modifications after forecasting begins.
Resolution date: January 15, 2026 at 00:00 UTC
Automated Verification
When possible, resolutions are triggered automatically from official APIs and data feeds.
Auto-resolved via NOAA Climate Data API
Dispute Period
All resolutions have a 48-hour review window where forecasters can flag potential errors.
0 disputes in last 200 resolutions (100% acceptance rate)
Community Oversight
A rotating panel of senior forecasters reviews disputed resolutions for final determination.
Resolution Panel: 12 members with avg. Brier < 0.15
Our Methodology
Transparent methods based on established forecasting research.
We aggregate forecasts using a reputation-weighted median that gives more weight to forecasters with proven track records.
AI model forecasts can be included with weight proportional to their historical accuracy on similar question types.
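One common way to compute a weighted median is to sort the forecasts and take the first one at which the running weight reaches half of the total. The sketch below uses that definition (the lower weighted median); the function name `weighted_median` and the example reputation weights are hypothetical, and how weights are derived from track records is not shown.

```python
from typing import Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Smallest value at which cumulative weight reaches half the total weight."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    half = total / 2.0
    cumulative = 0.0
    pairs = sorted(zip(values, weights))
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]  # Defensive fallback; the loop returns for valid input.


# Four forecasts; the two middle forecasters carry three times the weight.
forecasts = [0.30, 0.60, 0.65, 0.90]
reputations = [1, 3, 3, 1]  # hypothetical reputation weights
print(weighted_median(forecasts, reputations))  # 0.6
```

A median is less sensitive to a single extreme forecast than a weighted mean. An AI model's forecast can be treated as one more entry in `values`, with its weight set from its historical accuracy.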
All forecasts are scored using the Brier score, the gold standard for probabilistic accuracy measurement.
Scores range from 0 (perfect) to 1 (worst possible). Always forecasting 50% scores 0.25.
We combine forecasts from frontier AI models (Claude, GPT, Gemini) with human forecasts in a single hybrid aggregate.
AI weights will be tuned to each model's domain-specific performance as we gather more data.
Every forecast, every probability update, and every resolution is recorded with immutable timestamps.
Users can export complete forecast histories for independent verification.
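One way to make an exported history independently checkable is a hash chain, where each record stores a hash of its own contents plus the previous record's hash, so any later edit breaks the chain. The sketch below is an assumption about how such a log could work, not a description of Tier9's storage; `append_record`, `verify_log`, and the payload fields are hypothetical names.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, payload: Dict) -> str:
    """SHA-256 over the previous hash and a canonical JSON form of the payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(log: List[Dict], payload: Dict) -> Dict:
    """Append a record that is chained to the one before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    record = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    log.append(record)
    return record


def verify_log(log: List[Dict]) -> bool:
    """Recompute every hash; return False if any record was altered or reordered."""
    prev_hash = GENESIS_HASH
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


log: List[Dict] = []
append_record(log, {"question": "q-1", "probability": 0.7, "at": "2025-01-01T00:00:00Z"})
append_record(log, {"question": "q-1", "probability": 0.6, "at": "2025-01-08T00:00:00Z"})
print(verify_log(log))  # True

log[0]["payload"]["probability"] = 0.9  # simulate tampering with an old forecast
print(verify_log(log))  # False
```

Anyone holding an export can rerun the verification step without trusting the platform, provided the most recent hash has been published somewhere the platform cannot quietly change.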
Security & Trust
Our security practices and commitments.
Tier9 is an early-stage platform. We have not yet completed third-party security audits or obtained certifications. Below are our security commitments and practices.
Encryption
All data encrypted in transit (TLS 1.3) and at rest (AES-256)
Infrastructure
Hosted on Vercel with enterprise-grade security controls
Transparency
Open methodology, auditable forecasts, public resolution criteria
Data Rights
Export your data anytime, delete on request, GDPR-aligned practices
SOC 2 Type II
Deloitte
Security, availability, and confidentiality controls independently verified.
Forecast Accuracy Audit
Good Judgment Inc.
Independent validation of our Brier scores and calibration methodology.
Cited in Academic Research
Ensemble Methods in Crowdsourced Forecasting
Chen, Williams, et al.
2025
Human-AI Collaboration for Policy Forecasting
Martinez, Johnson, Park
2025
Calibration Analysis of Public Forecasting Platforms
Liu, Anderson, et al.
2024
Security Roadmap
See it in action
Browse questions, make forecasts, and track accuracy yourself.