The Model vs. March Madness: The Bracket Breaks

The machine filled out my bracket. It survived Round 1 with its dignity mostly intact. It made Sweet 16 predictions with confidence. Then the Elite Eight happened, and the bracket didn't just break. It shattered on a three-pointer with less than a second left.

The Sweet 16: High Confidence Holds, Everything Else Crumbles

The model went 5 for 8 in the Sweet 16, a 62.5% clip. The pattern was clean and humbling: every pick at 70% confidence or higher hit, and three of the four picks in the low 60s missed.

Duke over St. John's (83.1% confidence): Hit. Duke trailed by 10 in the second half, then Caleb Foster, playing on a broken foot he suffered 20 days earlier, scored 11 and Isaiah Evans hit a step-back three that put them ahead for good. Cameron Boozer went 22 and 10. The model doesn't know about broken feet. It just knew Duke was better.

UConn over Michigan State (62.1%): Hit. UConn blew a 19-point lead, let Michigan State briefly take the lead, then closed it out with clutch free throws. The model picked this one right, but the game itself was a heart attack. My gut pick for the round, Michigan State, was wrong. Izzo in March isn't what it used to be. Or maybe Hurley in March is what it's becoming.

Purdue over Texas (78%): Hit. Trey Kaufman-Renn's tip-in with 0.7 seconds left ended Texas's incredible run from the play-in game. Texas lost their fourth game of the tournament on a shot in the final two seconds. That's the most in tournament history. The Cinderella slipper didn't just not fit, it got stepped on repeatedly at the buzzer.

Arizona over Arkansas (74%): Hit. Arizona shot 64% and had six players score 14 or more, a tournament first. The model said Arizona was the most complete team in the West. Arizona agreed, 109-88.

Michigan over Alabama (70%): Hit. Yaxel Lendeborg went 23-12-7 and Michigan hit 13 threes. Alabama's Labaron Philon Jr. put up 35 in a loss. Sometimes the better team is just the better team.

Then the misses.

Iowa over Nebraska: The model had Nebraska at 63%. Iowa trailed for 37 of 40 minutes, then Bennett Stirtz hit a go-ahead three with 2:10 left. The 9-seed Cinderella that killed Florida kept dancing. Nebraska's best season ever (28-7) ended against their rival. The model saw defensive efficiency. Iowa saw a team with nothing to lose.

Illinois over Houston: The model had Houston at 63.2%. Illinois held Houston to 34% shooting and two free throw attempts. Two. In a tournament game. David Mirkovic and Keaton Wagler both posted double-doubles. Houston was the model's contrarian darling all tournament, ranked above their seed line, given 14.8% championship odds. Illinois turned them into a pumpkin.

Tennessee over Iowa State: The model's least confident pick at 61%, and the one I flagged as most likely to flip. Tennessee dominated the paint 43-22 and sent Iowa State home without Joshua Jefferson, who left with an injury. In the last post I noted that Tennessee had 0.2% championship odds before the tournament and that the number was climbing. It climbed all the way to the Elite Eight.

The pattern is worth noting: the model's three misses were at 61%, 63%, and 63.2% confidence. Its five hits were at 62.1%, 70%, 74%, 78%, and 83.1%. Every pick at 70% or above hit, and three of the four picks in the low 60s missed. The model essentially told us which picks it wasn't sure about, and that is where all the misses came from. A model that knows what it doesn't know is more useful than one that's always confident.
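That calibration claim can be checked with a few lines of arithmetic. Below is a minimal sketch in Python using only the eight confidence figures quoted above. The 65% bucket boundary and the names `picks`, `hit_rate`, and `brier_score` are my own choices for illustration, not something lifted from the model's code.

```python
# Sweet 16 picks as (model confidence, 1 if the pick hit else 0),
# copied from the game recaps above.
picks = [
    (0.831, 1),  # Duke over St. John's
    (0.621, 1),  # UConn over Michigan State
    (0.780, 1),  # Purdue over Texas
    (0.740, 1),  # Arizona over Arkansas
    (0.700, 1),  # Michigan over Alabama
    (0.630, 0),  # Nebraska (lost to Iowa)
    (0.632, 0),  # Houston (lost to Illinois)
    (0.610, 0),  # Iowa State (lost to Tennessee)
]

THRESHOLD = 0.65  # assumed split between "unsure" and "confident" picks


def hit_rate(rows):
    """Fraction of picks in rows that hit."""
    return sum(hit for _, hit in rows) / len(rows)


def brier_score(rows):
    """Mean squared gap between confidence and outcome (lower is better)."""
    return sum((p - hit) ** 2 for p, hit in rows) / len(rows)


unsure = [row for row in picks if row[0] < THRESHOLD]
confident = [row for row in picks if row[0] >= THRESHOLD]

print(f"Under 65%:  {hit_rate(unsure):.0%} of {len(unsure)} picks hit")
print(f"65% and up: {hit_rate(confident):.0%} of {len(confident)} picks hit")
print(f"Brier score: {brier_score(picks):.3f}")
```

A forecaster who said 50% on every game would score a Brier of 0.25, so anything lower suggests the confidence numbers carried some information. Eight games is a tiny sample, so this is a sanity check rather than proof.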

The Elite Eight: Where the Bracket Died

Four games. Two regions the model called perfectly. Two it didn't survive.

Arizona 79, Purdue 64. The freshman trio of Koa Peat (20), Kharchenkov (18), and Brayden Burries (14) combined for 52 points. Arizona hadn't been to the Final Four since 2001, a 25-year drought that ended emphatically. The model had Arizona winning the West all along. One of the few things it got completely right from day one.

Illinois 71, Iowa 59. Iowa led 12-2 early and the Cinderella story looked like it had one more chapter. Then Illinois scored 43 of their 71 points in the second half. Keaton Wagler put up 25 and earned Regional Most Outstanding Player. Illinois's first Final Four since 2005. The model had Houston winning this region. Houston watched from home.

Michigan 95, Tennessee 62. This wasn't a game. It was a statement. Yaxel Lendeborg scored 27, Elliot Cadeau dished 10 assists, and Michigan ran Tennessee out of the gym by 33. Tennessee's Cinderella run, from 0.2% championship odds to the Elite Eight, ended against a team that was simply operating on a different level. The model had Michigan winning the Midwest, and Michigan made it look easy.

UConn 73, Duke 72. This is where the bracket broke.

Duke led 44-29 at halftime. The overall number one seed, 35-2 on the season, the model's championship pick at 28.1%, was 20 minutes from the Final Four with a 15-point cushion. The model would have given Duke something north of 95% at that point.
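For a sense of where a number like that comes from: a common back-of-the-envelope approach treats the remaining scoring margin as a random walk, so a lead of L points with a fraction t of the game left holds up with probability Φ(L / (σ√t)). The sketch below assumes evenly matched teams and σ = 11 points for a full game's margin. Both are my assumptions for illustration, the function name is hypothetical, and this is not the bracket model's actual logic.

```python
import math


def lead_holds_probability(lead, fraction_remaining, sigma_full_game=11.0):
    """Random-walk estimate that the team with the lead goes on to win.

    sigma_full_game is an assumed standard deviation of the final margin
    between evenly matched teams; it is a placeholder, not a fitted value.
    """
    sigma_remaining = sigma_full_game * math.sqrt(fraction_remaining)
    z = lead / sigma_remaining
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Duke led 44-29 at halftime: a 15-point lead with half the game left.
halftime_estimate = lead_holds_probability(lead=44 - 29, fraction_remaining=0.5)
print(f"Duke at halftime: {halftime_estimate:.1%}")
```

Under those assumptions the estimate lands above 95%, which is the same neighborhood as the figure above. A larger σ or a stronger second-half opponent would pull it down.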

UConn cut it to 72-69. Duke still had the ball, still had the lead, still had the game in hand. Then Braylon Mullins caught the ball beyond the arc with under a second remaining and buried a three that will be replayed until the next tournament starts. Duke's season, and the model's championship prediction, died on a shot that no efficiency metric, no Monte Carlo simulation, and no gradient boosting classifier could ever see coming.

The model had Duke winning the whole thing. Duke winning the East at 76.5% confidence. Duke beating Houston in the Final Four. Duke beating Michigan in the championship. Every downstream prediction in the bracket ran through Durham, and now it runs through Storrs instead.

The Final Four

The model predicted Duke, Houston, Arizona, and Michigan.

The actual Final Four is UConn, Illinois, Arizona, and Michigan.

Two for four. The model got the West (Arizona) and Midwest (Michigan) right. It missed the East (UConn instead of Duke) and the South (Illinois instead of Houston). Both teams that advanced in place of its picks sat in the bottom half of the model's championship odds: UConn at 6.2%, Illinois at 3.8%.

Here's the thing, though. The model's predicted championship game was Duke vs. Michigan. Michigan is still alive. Half the final is intact. The other half now features UConn, whose coach Dan Hurley is making his third Final Four in four years, and Illinois, whose defensive performance against Houston was the most dominant single game of the tournament.

The matchups, Saturday April 4 at Lucas Oil Stadium in Indianapolis:

UConn vs. Illinois, 6:00 PM ET. Two teams that weren't supposed to be here according to the model, the oddsmakers, or most bracket pools. UConn got here on a buzzer-beater. Illinois got here by suffocating two opponents that were supposed to be better.

Arizona vs. Michigan, 8:30 PM ET. Two 1-seeds. Two teams the model has believed in all tournament. Arizona has the freshmen. Michigan has the most dominant offensive performance of any team left. This is the game the model would have wanted, just one round later than planned.

Championship: Monday, April 6, 8:30 PM ET.

The Cumulative Scorecard

Round	Record	Accuracy
Round of 64	22/32	68.8%
Round of 32	11/16	68.8%
Sweet 16	5/8	62.5%
Elite Eight	2/4	50.0%
Total	40/60	66.7%

The model held at 68.8% through the first two rounds and has been sliding since. That's not surprising. Early rounds have more games and more predictable outcomes. The deeper you go, the tighter the matchups, and the more a single shot can end everything. A model that's 67% accurate over 60 games is finding real signal. A model that's 50% in the Elite Eight is flipping coins, which is roughly what the Elite Eight has always been.
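The scorecard is simple enough to recompute by hand, but here is a small Python sketch that does it from the per-round records in the table. The variable and function names are mine, chosen for illustration.

```python
# (round name, correct picks, games played), taken from the scorecard table.
rounds = [
    ("Round of 64", 22, 32),
    ("Round of 32", 11, 16),
    ("Sweet 16", 5, 8),
    ("Elite Eight", 2, 4),
]


def accuracy(correct, games):
    """Accuracy as a percentage."""
    return 100.0 * correct / games


total_correct = sum(correct for _, correct, _ in rounds)
total_games = sum(games for _, _, games in rounds)

for name, correct, games in rounds:
    print(f"{name:<12} {correct:>2}/{games:<2} {accuracy(correct, games):5.1f}%")
print(f"{'Total':<12} {total_correct:>2}/{total_games:<2} "
      f"{accuracy(total_correct, total_games):5.1f}%")
```

Note how few games sit behind the later numbers: the Elite Eight figure rests on four picks, so one game swings it by 25 points.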

Picking the higher seed in every game would get you about 65% most years. The model is beating chalk, but barely, and the margin gets thinner every round.
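How thin is "barely"? One rough way to ask: if a bracket were only as good as chalk, how often would it still go 40 for 60 or better? Here is a minimal sketch, assuming the 65% chalk rate quoted above and treating every game as an independent flip of the same biased coin. Real tournament games are neither independent nor equally hard, so read this as a ballpark. The function name is my own.

```python
from math import comb


def prob_at_least(successes, games, rate):
    """Binomial tail: P(X >= successes) for X ~ Binomial(games, rate)."""
    return sum(
        comb(games, k) * rate**k * (1 - rate) ** (games - k)
        for k in range(successes, games + 1)
    )


CHALK_RATE = 0.65  # the "about 65% most years" figure from the text

p_chalk_matches_model = prob_at_least(successes=40, games=60, rate=CHALK_RATE)
print(f"P(chalk-level picker goes 40/60 or better) = {p_chalk_matches_model:.2f}")
```

Under these assumptions the answer comes out not far from a coin flip, which is another way of saying that 60 games cannot separate a 67% picker from a 65% one.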

What the Model Got Right

It identified Arizona and Michigan as legitimate Final Four teams before the tournament started and never wavered. It correctly flagged every high-confidence Sweet 16 pick. It knew which games it was uncertain about, and those were the ones that went sideways. The model's calibration, the relationship between its confidence and its accuracy, has been its most impressive feature.

What the Model Got Wrong

Houston. The model was contrarian on Houston from the start, rating them 4.6 points above Vegas odds, giving them 14.8% championship probability. Houston lost in the Sweet 16 to Illinois, shooting 34% and getting to the free throw line twice. The model loved Houston's defensive metrics. Illinois played better defense.

Duke. The model's championship pick at 28.1% odds. Duke did everything right for three games and most of a fourth, then lost on a shot that had maybe a 5% chance of going in with the game on the line. That's not a model failure, that's March. But it's also a reminder that running your championship prediction through one team at 28% means there's a 72% chance you're wrong, and the model was.

Tennessee. Given 0.2% championship odds before the tournament, Tennessee made the Elite Eight. The model essentially said "this can't happen" and it happened. Low-probability events occur in a 67-game tournament. That's the whole point.

The Verdict

Forty for sixty. Two of four Final Four teams. Zero of one championship picks still alive. The model found signal where signal existed and got wrecked where chaos reigns, which is roughly the experience of everyone who has ever filled out a bracket.

I'll run the model one more time with the actual Final Four locked in and publish championship odds before Saturday. But at this point, the machine and I are in the same boat: watching four teams play, pretending we know what's going to happen, and knowing we probably don't.

The full code, data, and live tracker are on GitHub.
