Tier9
Transparency

Accuracy & Methodology

We believe in transparency over impressive-sounding numbers. Here's exactly how we measure accuracy, resolve questions, and ensure accountability.

Interactive

What Does "Well-Calibrated" Mean?

A well-calibrated forecaster's predictions match reality. When they say 70%, events happen about 70% of the time.

Sample Calibration Chart

The diagonal line shows perfect calibration. Points below the line indicate overconfidence (predicted probabilities exceed actual outcome rates); points above the line indicate underconfidence.

[Chart: actual outcome rate (y-axis) plotted against predicted probability (x-axis); circle size = # of forecasts]

Well-Calibrated
Points on the diagonal
Overconfident
Points below the line
Underconfident
Points above the line
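The points on such a chart can be computed by bucketing forecasts by predicted probability and comparing each bucket's average prediction to its actual outcome rate. Here is a minimal sketch (the sample data and the choice of 10% buckets are illustrative, not Tier9's actual pipeline):

```python
def calibration_points(forecasts, n_bins=10):
    """Group (predicted probability, outcome) pairs into probability buckets
    and compare mean prediction to actual outcome rate in each bucket."""
    bins = [[] for _ in range(n_bins)]
    for prob, outcome in forecasts:
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((prob, outcome))
    points = []
    for bucket in bins:
        if bucket:
            mean_pred = sum(p for p, _ in bucket) / len(bucket)
            actual_rate = sum(o for _, o in bucket) / len(bucket)
            points.append((mean_pred, actual_rate, len(bucket)))  # size = # forecasts
    return points

# Hypothetical forecasts: three at 75% (two came true), two at 25%, one at 95%
sample = [(0.75, 1), (0.75, 1), (0.75, 0), (0.25, 0), (0.25, 0), (0.95, 1)]
print(calibration_points(sample))
```

Each returned triple is one circle on the chart: x position, y position, and size.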
Pro users get personalized calibration analysis and AI coaching
Step-by-Step

How Questions Get Resolved

Every resolution follows a transparent, auditable process, outlined step by step below.

Example Question
"Will US CPI inflation exceed 3% year-over-year in December 2025?"

How We Measure Accuracy

Our commitment to honest, verifiable accuracy metrics.

Transparency Note

Tier9 is a new platform. We don't yet have enough resolved questions to publish statistically meaningful accuracy metrics. We believe in honesty over impressive-sounding numbers. The metrics below explain how we will measure and report accuracy as our platform matures.

Coming Soon

Brier Score

The gold standard for probabilistic accuracy. Ranges from 0 (perfect) to 1 (worst). Random guessing scores 0.25. We'll publish platform-wide and per-forecaster Brier scores once we have sufficient resolved questions.
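For concreteness, the Brier score is simply the mean squared error between predicted probabilities and binary outcomes. A minimal sketch with hypothetical forecasts:

```python
def brier_score(forecasts):
    """Mean squared error between predicted probabilities and binary outcomes.
    0 is perfect, 1 is worst; always guessing 0.5 scores 0.25."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# Hypothetical forecasts:
print(brier_score([(1.0, 1), (0.0, 0)]))  # perfect forecasts -> 0.0
print(brier_score([(0.5, 1), (0.5, 0)]))  # coin-flip guessing -> 0.25
```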

Coming Soon

Calibration

A well-calibrated forecaster's 70% predictions come true 70% of the time. We'll publish calibration curves showing how our platform's confidence levels match actual outcomes.

Coming Soon

Domain Breakdown

Accuracy varies by domain. We'll break down performance across climate, health, geopolitics, economy, science, and society to show where our forecaster community excels.

Our Accuracy Methodology

Resolution Process

  • Every question has pre-defined resolution criteria
  • Resolution is based on authoritative data sources (NOAA, CDC, WHO, etc.)
  • All resolutions are auditable and timestamped

Scoring Rules

  • Brier scoring incentivizes honest probability estimates
  • No gaming — proper scoring rules make honesty optimal
  • Historical forecasts are immutable once submitted
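A short numerical sketch of why proper scoring makes honesty optimal (the true probability of 0.7 is an assumed value for illustration): if an event truly occurs with probability q, a forecaster reporting p faces an expected Brier penalty of q(1-p)² + (1-q)p², which is minimized exactly at p = q, so misreporting can only worsen the expected score.

```python
def expected_brier(reported_p, true_q):
    """Expected Brier penalty when the event truly occurs with probability
    true_q and the forecaster reports reported_p."""
    return true_q * (reported_p - 1) ** 2 + (1 - true_q) * reported_p ** 2

true_q = 0.7  # assumed true belief, for illustration only
candidates = [i / 100 for i in range(101)]
best = min(candidates, key=lambda p: expected_brier(p, true_q))
print(best)  # the honest report minimizes expected penalty
```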

Objective Resolution Process

Every question has pre-defined resolution criteria and authoritative data sources. No subjective judgment, no moving goalposts.

1

Pre-defined Criteria

Every question is created with explicit, unambiguous resolution criteria before any forecasts are submitted.

"Will resolve YES if WHO declares pandemic over before Dec 31, 2025"

2

Authoritative Source

Each question specifies the exact data source that will determine the outcome. No interpretation required.

Data source: WHO official press releases at who.int

3

Scheduled Resolution

Resolution dates are locked at question creation. No extensions or modifications after forecasting begins.

Resolution date: January 15, 2026 at 00:00 UTC

4

Automated Verification

When possible, resolutions are triggered automatically from official APIs and data feeds.

Auto-resolved via NOAA Climate Data API

5

Dispute Period

All resolutions have a 48-hour review window where forecasters can flag potential errors.

Dispute window: January 15 to 17, 2026 (48 hours after resolution)

6

Community Oversight

A rotating panel of senior forecasters reviews disputed resolutions for final determination.

Resolution Panel: 12 members with avg. Brier < 0.15
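The six steps above might be captured in a question record like the following sketch (field names and values are illustrative, not Tier9's actual schema; the BLS is the agency that publishes US CPI data):

```python
# Illustrative only: not Tier9's actual data model.
question = {
    "title": "Will US CPI inflation exceed 3% year-over-year in December 2025?",
    "resolution_criteria": (
        "Resolves YES if BLS-reported CPI-U year-over-year change "
        "for December 2025 exceeds 3.0%"
    ),
    "data_source": "https://www.bls.gov",       # authoritative source, fixed at creation
    "resolution_date": "2026-01-15T00:00:00Z",  # locked at creation, no extensions
    "auto_resolve": True,                       # triggered from official data when possible
    "dispute_window_hours": 48,                 # review period before resolution is final
}
print(question["resolution_date"])
```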

Our Methodology

Transparent methods based on established forecasting research.

Consensus Aggregation

We aggregate forecasts using a reputation-weighted median that gives more weight to forecasters with proven track records.

AI model forecasts can be included with weight proportional to their historical accuracy on similar question types.
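A weighted median can be sketched as follows (the forecasts and weights below are hypothetical; Tier9's actual reputation-weighting formula is not specified here):

```python
def weighted_median(forecasts, weights):
    """Return the forecast at which cumulative weight first reaches
    half the total weight."""
    pairs = sorted(zip(forecasts, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value

# Hypothetical community: a proven forecaster's 0.60 (weight 3.0)
# outweighs two newer voices at 0.40 and 0.80 (weight 1.0 each).
print(weighted_median([0.40, 0.60, 0.80], [1.0, 3.0, 1.0]))  # -> 0.6
```

Unlike a weighted mean, the weighted median is robust to a single extreme forecast, which is one reason median-style aggregation is common in forecasting research.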

Brier Scoring

All forecasts are scored using the Brier score, the gold standard for probabilistic accuracy measurement.

Scores range from 0 (perfect) to 1 (worst possible). Random guessing scores 0.25.

AI Integration

We integrate forecasts from frontier AI models (Claude, GPT, Gemini) with human forecasts to produce a hybrid human-AI consensus.

AI weights are calibrated based on domain-specific performance as we gather more data.

Full Auditability

Every forecast, every probability update, and every resolution is recorded with immutable timestamps.

Users can export complete forecast histories for independent verification.
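One common way to make such records tamper-evident (a sketch of the general technique, not necessarily Tier9's implementation) is to chain each log entry to the hash of the previous one, so altering any earlier entry invalidates every later one:

```python
import hashlib
import json

def append_entry(log, record):
    """Append a record whose hash covers both its content and the previous
    entry's hash, forming a tamper-evident chain."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps(record, sort_keys=True) + prev_hash
    entry = {
        "record": record,
        "prev": prev_hash,
        "hash": hashlib.sha256(payload.encode()).hexdigest(),
    }
    log.append(entry)
    return log

# Hypothetical forecast history for one question:
log = []
append_entry(log, {"question": "q1", "forecast": 0.70, "ts": "2025-06-01T12:00:00Z"})
append_entry(log, {"question": "q1", "forecast": 0.65, "ts": "2025-06-02T09:00:00Z"})
print(log[1]["prev"] == log[0]["hash"])  # entries are chained -> True
```

An exported history in this form can be independently re-verified by recomputing each hash in sequence.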

Security & Trust

Our security practices and commitments.

Tier9 is an early-stage platform. We have not yet completed third-party security audits or obtained certifications. Below are our security commitments and practices.

Encryption

All data encrypted in transit (TLS 1.3) and at rest (AES-256)

Infrastructure

Hosted on Vercel with enterprise-grade security controls

Transparency

Open methodology, auditable forecasts, public resolution criteria

Data Rights

Export your data anytime, delete on request, GDPR-aligned practices



Security Roadmap

  • Planned: SOC 2 Type I audit (targeting 2025)
  • Planned: Independent penetration testing
  • Planned: Third-party accuracy audit by forecasting researchers

See it in action

Browse questions, make forecasts, and track accuracy yourself.