LLMadness

How the tournament works

Each model predicts one game at a time, not the whole bracket in one shot.
Earlier picks become committed bracket state for later rounds.
Every game stores the pick, confidence, rationale, reasoning step, and trace.
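
The game-by-game flow above can be sketched as a simple loop. This is a minimal illustration, not the project's actual runner: the `winner:<gameId>` slot notation, the `run_bracket` and `resolve` helpers, the stand-in `pick_first` predictor, and the ids `r64-east-2`, `r32-east-1`, `team-c`, and `team-d` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical bracket shape: each game names two slots. A slot is either a
# team id or a reference to an earlier game, written "winner:<gameId>".
Game = Tuple[str, str, str]  # (game_id, slot_a, slot_b)
Predictor = Callable[[str, str, str, Dict[str, str]], str]


def resolve(slot: str, picks: Dict[str, str]) -> str:
    """Turn a slot into a team id, using committed picks for earlier games."""
    if slot.startswith("winner:"):
        return picks[slot.split(":", 1)[1]]
    return slot


def run_bracket(games: List[Game], predict: Predictor) -> Dict[str, str]:
    """Predict games in order; each pick is committed before the next game."""
    picks: Dict[str, str] = {}
    for game_id, slot_a, slot_b in games:
        team_a, team_b = resolve(slot_a, picks), resolve(slot_b, picks)
        # The predictor sees a copy of the committed state, so it cannot
        # rewrite earlier picks.
        winner = predict(game_id, team_a, team_b, dict(picks))
        if winner not in (team_a, team_b):
            raise ValueError(f"{game_id}: {winner!r} is not in this matchup")
        picks[game_id] = winner
    return picks


# Stand-in for the model call: always picks the first listed team.
def pick_first(game_id: str, team_a: str, team_b: str, committed: Dict[str, str]) -> str:
    return team_a


games = [
    ("r64-east-1", "duke", "siena"),
    ("r64-east-2", "team-c", "team-d"),
    ("r32-east-1", "winner:r64-east-1", "winner:r64-east-2"),
]
picks = run_bracket(games, pick_first)
print(picks)
```

Because the later-round matchup is resolved from `picks`, a model's Round of 32 game only ever contains teams that the same model advanced earlier.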

Tooling

The live agent can inspect bracket structure, ratings, and the open web.

list_teams
get_games
lookup_cbb_ratings
search_web
search_espn_news
fetch_webpage

Scoring

Leaderboard points are weighted by round and awarded only for correct picks.

First Four: 0
Round of 64: 1
Round of 32: 2
Sweet 16: 4
Elite 8: 8
Final Four: 16
Championship: 32
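
As a rough sketch of how these weights turn into a leaderboard total, the snippet below sums the round value of every correct pick. The point values come from the list above; the `score_bracket` function, the dictionary shapes, and the game ids other than `r64-east-1` are assumptions for illustration only.

```python
from typing import Dict

# Point values per round, as listed above.
ROUND_POINTS: Dict[str, int] = {
    "First Four": 0,
    "Round of 64": 1,
    "Round of 32": 2,
    "Sweet 16": 4,
    "Elite 8": 8,
    "Final Four": 16,
    "Championship": 32,
}


def score_bracket(
    picks: Dict[str, str],    # game id -> predicted winner id
    results: Dict[str, str],  # game id -> actual winner id
    rounds: Dict[str, str],   # game id -> round name
) -> int:
    """Sum round-weighted points; incorrect or missing picks earn nothing."""
    total = 0
    for game_id, actual_winner in results.items():
        if picks.get(game_id) == actual_winner:
            total += ROUND_POINTS[rounds[game_id]]
    return total


# Hypothetical example: one correct Round of 64 pick, one wrong Round of 32
# pick, and a correct championship pick.
rounds = {"r64-east-1": "Round of 64", "r32-east-1": "Round of 32", "final": "Championship"}
picks = {"r64-east-1": "duke", "r32-east-1": "duke", "final": "duke"}
results = {"r64-east-1": "duke", "r32-east-1": "team-c", "final": "duke"}
total = score_bracket(picks, results, rounds)
print(total)  # 1 + 0 + 32 = 33
```

Assuming the standard 63-game main bracket (32, 16, 8, 4, 2, and 1 games per round), each round is worth 32 points in total, so a perfect bracket would score 6 × 32 = 192.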

What gets stored

Winner selection for each game
Confidence score
Short paragraph rationale
Structured reasoning step and evidence
Full per-game model trace
Total run cost, entered manually, if supplied

System prompt

The runtime adds live game context, bracket metadata, and the tool-round budget to the system prompt for every prediction.

You are an analyst predicting a single March Madness game inside a bracket run.
You have live tools. Use them before making a pick when they can reduce uncertainty.
You are not filling the whole bracket in one response. Predict only the current game.
Use prior picks as already committed bracket state.
You may use at most 10 tool rounds before you must finalize your answer.
Before making a pick, investigate relevant context such as injuries or availability concerns, recent team news, prior head-to-head results when useful, and multiple raw statistics from the available ratings and matchup data.
Take a holistic approach. Balance team quality, matchup specifics, health, form, schedule context, and upset risk instead of relying on one metric.
Do not force an upset, but explicitly consider whether the underdog has a credible path to win and let that affect confidence and rationale.
Return strict JSON only.
Final JSON keys: pick, reasoningStep.
pick must contain gameId, winnerId, confidence, rationale.
The rationale must be a short paragraph of 2 to 4 sentences explaining why the winner was chosen.
reasoningStep must contain id, title, summary, evidence.
Bracket config id: 2026-bracket
Tournament year: 2026
Current game id: r64-east-1
Current game label: East Round of 64 1
Round: Round of 64
Resolved teams: Duke (duke) vs Siena (siena)
Prior committed picks: 4
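
The JSON contract in the prompt above lends itself to a small validation step before a pick is committed. The following is a hedged sketch, not the project's actual parser: `parse_prediction` is a hypothetical helper, the numeric type for `confidence` is an assumption (the prompt does not state a type or range), and the sample values are invented for the example.

```python
import json
from typing import Any, Dict, Set

# Required keys, taken from the system prompt.
TOP_KEYS = {"pick", "reasoningStep"}
PICK_KEYS = {"gameId", "winnerId", "confidence", "rationale"}
STEP_KEYS = {"id", "title", "summary", "evidence"}


def require_keys(obj: Any, keys: Set[str], label: str) -> None:
    """Raise ValueError unless obj is a dict containing every required key."""
    if not isinstance(obj, dict):
        raise ValueError(f"{label} must be a JSON object")
    missing = keys - set(obj)
    if missing:
        raise ValueError(f"{label} is missing keys: {sorted(missing)}")


def parse_prediction(raw: str, game_id: str, team_ids: Set[str]) -> Dict[str, Any]:
    """Parse a model response and check it against the current game."""
    data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
    require_keys(data, TOP_KEYS, "response")
    require_keys(data["pick"], PICK_KEYS, "pick")
    require_keys(data["reasoningStep"], STEP_KEYS, "reasoningStep")

    pick = data["pick"]
    if pick["gameId"] != game_id:
        raise ValueError(f"pick is for {pick['gameId']!r}, expected {game_id!r}")
    if pick["winnerId"] not in team_ids:
        raise ValueError(f"winnerId {pick['winnerId']!r} is not in this matchup")
    # Assumption: confidence is numeric; no particular range is enforced here.
    conf = pick["confidence"]
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError("confidence must be a number")
    return data


# Illustrative response for the game shown in the prompt.
raw = json.dumps({
    "pick": {
        "gameId": "r64-east-1",
        "winnerId": "duke",
        "confidence": 0.9,
        "rationale": "Example rationale text.",
    },
    "reasoningStep": {
        "id": "step-1",
        "title": "Example title",
        "summary": "Example summary.",
        "evidence": [],
    },
})
parsed = parse_prediction(raw, "r64-east-1", {"duke", "siena"})
print(parsed["pick"]["winnerId"])
```

Checking `gameId` and `winnerId` against the resolved matchup guards against a response that answers a different game or names a team outside it.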

User prompt

The user prompt contains the resolved matchup, recent committed picks, and the exact JSON payload shape the model is expected to use.