I Built an ABS Challenge Tracker. Here’s What the Data Shows.

I used to be a baseball purist, but I really love the innovations the game has added in the past few years. The pitch clock has made baseball much more enjoyable to watch. Now the Automated Ball-Strike (ABS) challenge system has added a dimension that holds umpires accountable and gives teams some say in how the strike zone is called.

Because I am curious how this is going, I built a dashboard that tracks every ABS challenge of the season and refreshes itself every morning. The team and player numbers come straight from Baseball Savant. The umpire numbers were harder, because Savant does not publish an umpire leaderboard for challenges. So I built that part myself: pull each challenged pitch from the game feed, look up who was behind the plate that day, and tie the two together. About 3,900 challenges across roughly 950 games so far, stitched to the right umpire one game at a time.
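The stitching step is simple in principle. Here is a minimal sketch of it in Python, not the dashboard's actual code. The field names (`game_id`, `overturned`), the `plate_umpires` lookup, and the toy records are all assumptions made up for illustration; the real game feed will be shaped differently.

```python
from collections import defaultdict


def attach_umpires(challenges, plate_umpires):
    """Tag each challenge with the home-plate umpire for its game.

    challenges: list of dicts with hypothetical keys "game_id" and "overturned".
    plate_umpires: dict mapping a game id to that game's plate umpire name.
    Challenges from games with no known umpire are skipped, not guessed.
    """
    tagged = []
    for challenge in challenges:
        umpire = plate_umpires.get(challenge["game_id"])
        if umpire is None:
            continue
        tagged.append({**challenge, "umpire": umpire})
    return tagged


def overturn_rate_by_umpire(tagged):
    """Return {umpire: share of that umpire's challenged calls overturned}."""
    counts = defaultdict(lambda: [0, 0])  # umpire -> [overturned, total]
    for challenge in tagged:
        entry = counts[challenge["umpire"]]
        if challenge["overturned"]:
            entry[0] += 1
        entry[1] += 1
    return {ump: overturned / total for ump, (overturned, total) in counts.items()}


# Toy data, invented for the example.
challenges = [
    {"game_id": 1, "overturned": True},
    {"game_id": 1, "overturned": False},
    {"game_id": 2, "overturned": True},
    {"game_id": 3, "overturned": True},  # no umpire on file for game 3
]
plate_umpires = {1: "Umpire A", 2: "Umpire B"}

tagged = attach_umpires(challenges, plate_umpires)
rates = overturn_rate_by_umpire(tagged)
```

Skipping games with no umpire on file keeps a missing lookup from being credited to the wrong person, at the cost of dropping a few challenges from the umpire table.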

The whole thing runs on autopilot through a scheduled job, commits the fresh data, and redeploys. I do not touch it. You can look at it here.

Who is actually good at it

A few things jump out once the data piles up.

League-wide, about 52.8% of challenges get overturned. That is the number I keep coming back to. Roughly half the time the human was right, and roughly half the time the robot disagreed. It is a coin flip with a faint lean toward the challenger.

Teams are not equal at it. Detroit is the sharpest in baseball at around 60%, with Texas and Arizona right behind. Pittsburgh is the worst at about 43%, which means the Pirates are mostly challenging pitches the robot was happy to call the way they were called.

Umpires are not equal either. The most-overturned crew chief gets reversed on roughly 72% of his challenged calls. The steadiest umpires sit around 33%, meaning their calls hold up two times out of three. That is a real spread, and it is exactly the kind of thing nobody could measure before this year.

And then there is the position breakdown, which is my favorite. Catchers do about half of all challenging and win most often, near 59%. Hitters are middle of the pack. Pitchers almost never bother, and when they do they are wrong two-thirds of the time. On every single team, the most active challenger is the catcher. The guy who frames pitches for a living turns out to be the best at knowing when the call was wrong. That is not a surprise once you say it out loud, but it is satisfying to see it fall out of the numbers.

The part I really wanted to know

Here is the question that actually pulled me in: does being good at the robot game help you win baseball games?

No, or at least not yet in the numbers we can see. The correlation between a team's challenge success rate and its winning percentage is about negative 0.05, which in plain terms is nothing. Detroit leads the league in challenge accuracy and is sitting at a winning percentage of around .406.
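For anyone who wants to check a number like that themselves, here is a small sketch of a correlation calculation. It assumes the figure is a Pearson correlation, which the post does not say. The function name and the sample values are invented for illustration and are not real team data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and each list's sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values: one challenge success rate and one winning percentage per team.
success_rates = [0.60, 0.55, 0.50, 0.43]
win_pcts = [0.41, 0.55, 0.48, 0.52]

r = pearson_r(success_rates, win_pcts)
```

The result always lands between -1 and 1. Values near zero, like the -0.05 above, mean that knowing a team's challenge success rate tells you almost nothing about its record.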

The scoreboard for the scoreboard

The challenge system was supposed to settle arguments. It did. It also quietly opened up a brand new one, because now we can rank the umpires, rank the teams, and watch the catchers run the whole operation.

So I am keeping score on the thing that keeps score. It updates every day, in case you want to check whether your team is any good at arguing with a robot.