How the tournament works

  • Each model predicts one game at a time, not the whole bracket in one shot.
  • Earlier picks become committed bracket state for later rounds.
  • Every game stores the pick, confidence, rationale, reasoning step, and trace.
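
The game-by-game flow above can be sketched as a simple loop. This is a minimal illustration, not the project's actual runner: run_bracket, predict_game, and the shape of the game dicts are hypothetical names chosen for the example, and predict_game stands in for the model call.

```python
def run_bracket(games, predict_game):
    """Predict games one at a time, feeding earlier picks into later calls.

    games: list of game dicts in play order (hypothetical shape).
    predict_game: callable(game, prior_picks) -> pick dict (stand-in for the model).
    """
    committed = []  # picks made so far; treated as fixed bracket state
    for game in games:
        # Pass a copy so the predictor cannot alter committed state.
        pick = predict_game(game, list(committed))
        committed.append(pick)
    return committed
```

Each call sees only the picks committed before it, so a later-round prediction cannot revise an earlier one.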

Tooling

The live agent can inspect bracket structure, ratings, and the open web through the following tools:

  • list_teams
  • get_games
  • lookup_cbb_ratings
  • search_web
  • search_espn_news
  • fetch_webpage

Scoring

Leaderboard points are weighted by round and awarded only for correct picks.

Round           Points per correct pick
First Four      0
Round of 64     1
Round of 32     2
Sweet 16        4
Elite 8         8
Final Four      16
Championship    32
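
As a worked example of the round weighting, the following sketch totals a model's points. The point values come from the scoring list in this section; the function name score_bracket and the input shapes (a list of picks carrying a round name, and a dict of actual winners) are assumptions made for illustration.

```python
# Points per correct pick, by round (values from the scoring list in this section).
ROUND_POINTS = {
    "First Four": 0,
    "Round of 64": 1,
    "Round of 32": 2,
    "Sweet 16": 4,
    "Elite 8": 8,
    "Final Four": 16,
    "Championship": 32,
}


def score_bracket(picks, results):
    """Sum round-weighted points over correct picks.

    picks: list of dicts with gameId, round, winnerId (hypothetical shape).
    results: dict mapping gameId to the actual winnerId; unplayed games are absent.
    """
    total = 0
    for pick in picks:
        actual = results.get(pick["gameId"])
        # Wrong picks and unplayed games contribute nothing.
        if actual is not None and actual == pick["winnerId"]:
            total += ROUND_POINTS[pick["round"]]
    return total
```

Under these weights a correct Championship pick is worth as much as 32 correct Round of 64 picks, and First Four picks do not affect the leaderboard.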

What gets stored

  • Winner selection for each game
  • Confidence score
  • Short-paragraph rationale (2 to 4 sentences)
  • Structured reasoning step and evidence
  • Full per-game model trace
  • Manually entered total run cost, if supplied

System prompt

The runtime adds live game context, bracket metadata, and the tool-round budget to the system prompt for every prediction.

You are an analyst predicting a single March Madness game inside a bracket run.
You have live tools. Use them before making a pick when they can reduce uncertainty.
You are not filling the whole bracket in one response. Predict only the current game.
Use prior picks as already committed bracket state.
You may use at most 10 tool rounds before you must finalize your answer.
Before making a pick, investigate relevant context such as injuries or availability concerns, recent team news, prior head-to-head results when useful, and multiple raw statistics from the available ratings and matchup data.
Take a holistic approach. Balance team quality, matchup specifics, health, form, schedule context, and upset risk instead of relying on one metric.
Do not force an upset, but explicitly consider whether the underdog has a credible path to win and let that affect confidence and rationale.
Return strict JSON only.
Final JSON keys: pick, reasoningStep.
pick must contain gameId, winnerId, confidence, rationale.
The rationale must be a short paragraph of 2 to 4 sentences explaining why the winner was chosen.
reasoningStep must contain id, title, summary, evidence.
Bracket config id: 2026-bracket
Tournament year: 2026
Current game id: r64-east-1
Current game label: East Round of 64 1
Round: Round of 64
Resolved teams: Duke (duke) vs Siena (siena)
Prior committed picks: 4
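
The per-game context lines at the end of the prompt above (from "Bracket config id" onward) could be rendered by a helper like the one below. This is a minimal sketch, not the runtime's actual code: the function name build_context_lines and its parameters are hypothetical, and only the output line format is taken from the prompt shown here.

```python
def build_context_lines(config_id, year, game, teams, prior_pick_count):
    """Render the per-game context lines appended to the system prompt.

    game: dict with id, label, and round keys.
    teams: list of (display_name, team_id) pairs for the two resolved slots.
    """
    resolved = " vs ".join(f"{name} ({team_id})" for name, team_id in teams)
    return "\n".join([
        f"Bracket config id: {config_id}",
        f"Tournament year: {year}",
        f"Current game id: {game['id']}",
        f"Current game label: {game['label']}",
        f"Round: {game['round']}",
        f"Resolved teams: {resolved}",
        f"Prior committed picks: {prior_pick_count}",
    ])
```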

User prompt

The user prompt contains the resolved matchup, recent committed picks, and the exact JSON payload shape the model is expected to use.

Predict the winner of the current game only.
Use only one of the two resolved team IDs as winnerId.
The prior picks are the committed winners of earlier rounds and define the matchup path.
Write the pick rationale as a short paragraph, not bullets or fragments.

{
  "currentGame": {
    "game": {
      "id": "r64-east-1",
      "round": "Round of 64",
      "region": "East",
      "label": "East Round of 64 1",
      "slotA": {
        "kind": "team",
        "teamId": "duke"
      },
      "slotB": {
        "kind": "team",
        "teamId": "siena"
      }
    },
    "slotAName": "Duke",
    "slotBName": "Siena",
    "slotATeamId": "duke",
    "slotBTeamId": "siena"
  },
  "priorPicks": [
    {
      "gameId": "ff-west-11",
      "winnerId": "texas",
      "confidence": 0.61,
      "rationale": "Texas projects as the stronger team after combining ratings, form, and roster context."
    }
  ],
  "configId": "2026-bracket"
}
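
Because the prompts require strict JSON with fixed keys and a winnerId drawn from the two resolved teams, a reply can be checked mechanically before it is committed. The sketch below shows one way to do that with the standard library. The function name validate_response is hypothetical, and the checks cover only the rules stated in the prompts above, not any additional validation the real runtime may perform.

```python
import json

PICK_KEYS = {"gameId", "winnerId", "confidence", "rationale"}
STEP_KEYS = {"id", "title", "summary", "evidence"}


def validate_response(raw, payload):
    """Parse a model reply and check it against the prompt's stated rules.

    raw: the model's reply text.
    payload: the user-prompt payload dict (with currentGame).
    Returns the parsed dict, or raises ValueError describing the problem.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not strict JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    # Top-level keys and their required sub-keys.
    for key, required in (("pick", PICK_KEYS), ("reasoningStep", STEP_KEYS)):
        section = data.get(key)
        if not isinstance(section, dict):
            raise ValueError(f"missing or non-object key: {key}")
        missing = required - section.keys()
        if missing:
            raise ValueError(f"{key} is missing: {sorted(missing)}")

    current = payload["currentGame"]
    pick = data["pick"]
    if pick["gameId"] != current["game"]["id"]:
        raise ValueError("pick.gameId does not match the current game")
    # winnerId must be one of the two resolved team IDs.
    if pick["winnerId"] not in {current["slotATeamId"], current["slotBTeamId"]}:
        raise ValueError("pick.winnerId is not one of the resolved teams")
    return data
```

Rejecting a malformed reply at this point keeps an invalid winner from becoming committed bracket state for later games.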