Modeling Pitch Count States as a Markov Process

The 1906 World Series is perhaps one of the best examples of the adage "good pitching beats good hitting". This historic Windy City matchup pitted arguably the most dominant team in MLB history, the 116-36 Chicago Cubs, against their city rivals, the Chicago White Sox. The contrast between the teams was stark. The White Sox were a pitching-first team with little offense to speak of. In fact, they hit a mere 7 home runs through the entire season. Yes, the White Sox hit 7 home runs as a team in 1906, albeit during the deadball era. Even so, this was a third of the home runs hit by the Cubs that year. Meanwhile, to put in perspective how good this Cubs team was, they still hold the record for the highest single-season winning percentage (.763) of all time. It would take 95 years before the 2001 Seattle Mariners would match their win total, though the Mariners required 10 more games to do it. And poetically, just like the 2001 Mariners, the 1906 Cubs would not win the World Series in their record-setting year.

So how did the unimposing White Sox upset the Cubs? Pitching. Well, probably some luck too, but that's not as fun to talk about. The Sox held the Cubs to an average of 2.83 runs/game over the course of the series, down from the Cubs' season average of 4.57 runs/game, eventually beating them 8-3 in Game 6 to seal the championship. The Sox accomplished this through a young, spitballer-fueled rotation headlined by Hall of Famer Ed Walsh. You may know Ed Walsh as the all-time record holder in career ERA at 1.82 (insane!). Less well known, but equally impressive, he also holds the record for career FIP at 2.02. Relative to today's standards, Ed Walsh's 5.5 SO/9 in 1906 may not seem very impressive, but compared to the league average of 3.6 SO/9, Ed Walsh was a punchout master. And he did this while averaging almost a full walk and a full hit fewer per 9 than his contemporaries. Given these statistics, I find it interesting how we can start to imagine the type of pitcher Ed Walsh was: a dominant strikeout specialist with exceptional control. He was a Statcast legend 100 years before Statcast existed.

In the modern era, we create these pitcher typologies all the time, and eventually they come to define the way we think about players. "Kyle Hendricks is a contact-first pitcher with great control" or "Blake Snell is great at striking batters out but has a propensity to allow too many walks". While part of these assessments is gleaned from the eye test, by and large these conclusions are reinforced using stats like BB/9, K/9, WHIP, or FIP. I love that we can start crafting characters and storylines using just a few stats like this. It's one of the beautiful parts of baseball. But I wanted to go even deeper and ask: can we get more specific and paint an even more detailed picture? I wanted to see if we could look beyond outcome-based stats like ERA, FIP, and K/9 to understand differences between pitcher typologies before the outcome. How do pitchers compare during and within a plate appearance? But how? The answer: Markov Chains.

Markov Chains and Pitch Counts

OK, so we want to model more specific tendencies in pitching approach during plate appearances. What does that have to do with the 1906 World Series, Ed Walsh, and the title of this post? I will admit the connection is a bit weak, so bear with me. 1906 is not only the year that Ed Walsh and the White Sox took down the Cubs, but it also happens to be the year that Andrei Markov published his first paper on Markov Processes, a new way to model sequences of stochastically occurring events. While Andrei Markov almost certainly did not know or care about the 1906 World Series, he provided a tool in Markov Processes that we can use to create more specific pitcher profiles. His innovation is captured in an interesting property of Markov Processes, namely that the probability of the next event depends only on the state you're currently in. In other words, it doesn't matter how you got to where you are; what happens next is only a function of where you are now. While it would be easy to get philosophical with this as a maxim for life, this is a baseball blog, so instead we're going to apply it to baseball statistics. A pitch count just so happens to follow this property, with counts (i.e., 0-0, 1-0, etc.) representing states and the pitches representing events that move us between states. If we can agree that it generally doesn't matter whether you got to a 2-2 count by fouling off the first two pitches and then taking two balls, or by taking the two balls first and then swinging through the next two, then we can begin to model this process as a Markov Chain. I will admit this is an assumption that can be challenged: pitch sequencing does matter, and pitchers will change their approach based on how a batter got into a specific count. But if we can accept that this is an approximation, it opens up some very cool modeling opportunities.
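Stated a little more formally (the notation below is generic textbook shorthand that I'm assuming for illustration, with X_n standing for the count after the n-th pitch of a plate appearance), the Markov assumption is:

```latex
P(X_{n+1} = j \mid X_n = i,\, X_{n-1},\, \dots,\, X_0) \;=\; P(X_{n+1} = j \mid X_n = i)
```

Here i and j are count states such as 1-1 or 1-2. The left-hand side conditions on the whole pitch history, the right-hand side only on the current count, and the approximation we're accepting is that the two are equal.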

The State Transition Matrix

We can intuit from Ed Walsh's Baseball Reference page that he probably got into many pitcher-friendly counts given his low walk rate, low number of balls in play, and high strikeout rate, but we can get a bit more specific if we employ Markov Chains. In fact, using the principles that Andrei Markov developed, we can map how pitchers move through counts by treating different counts as states, and the strikes and balls that are thrown as transitions between those states. For instance, if I want to know the rate at which a pitcher moves from a 1-1 count to a 1-2 count, I can find all the 1-1 counts that pitcher found themselves in and then compute the share of them that moved to a 1-2 count. If I repeat this for all possible pitch count states, we can create a matrix of all the possible state transitions for a pitcher. This matrix is referred to as a state transition matrix and is key for any Markov Process. It's a square matrix with the rows and columns representing all the possible states of the system. In our case, each possible count state will represent both a row and a column (i.e., 0-0, 1-0, 0-1, 1-1...). The cells within the matrix tell us the probability of moving between states, so each row sums to 1. For example, the intersection of the 0-0 row with the 1-0 column represents the probability that a pitcher throws a ball on the first pitch. Notice that some transitions are impossible. I can't go from a 1-0 count back to an 0-0 count, thus this probability is 0. We also have terminating states. When we reach these, we leave the system. In the context of a pitch count, these terminating states would be a walk, strikeout, or ball in play. Once you enter one of these states, you can't go anywhere else.
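To make the counting concrete, here is a minimal sketch of turning observed transitions into a matrix. It is not the code behind this post: the `observed` list is made-up toy data, the function names are hypothetical, and the matrix is stored as a nested dict rather than a full grid for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical (from_state, to_state) observations, one per pitch.
# These are toy values for illustration, not real data.
observed = [
    ("0-0", "1-0"), ("0-0", "0-1"), ("0-0", "0-1"), ("0-0", "in play"),
    ("1-0", "1-1"), ("0-1", "0-2"), ("0-1", "1-1"),
    ("0-2", "strikeout"), ("1-1", "in play"),
]

def transition_matrix(transitions):
    """Row-normalize transition counts into probabilities."""
    counts = Counter(transitions)
    row_totals = Counter(src for src, _ in transitions)
    matrix = defaultdict(dict)
    for (src, dst), n in counts.items():
        matrix[src][dst] = n / row_totals[src]
    return dict(matrix)

def prob(matrix, src, dst):
    """Look up one cell; unobserved or impossible transitions are 0."""
    return matrix.get(src, {}).get(dst, 0.0)

matrix = transition_matrix(observed)
print(prob(matrix, "0-0", "0-1"))  # 2 of the 4 toy pitches thrown at 0-0
print(prob(matrix, "1-0", "0-0"))  # counts never move backwards
```

Terminating states (walk, strikeout, in play) only ever appear as destinations here, which is the dict-based equivalent of a row you can never leave.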

In order to create this matrix, we need to do some data engineering. Essentially, we need to look at play-by-play data to see how a pitcher moved from state to state for a game, season, or career, depending on the scale we want to analyze. I did this using play-by-play data from retrosheet.org. They record pitch sequence data using sequences of letters such as BBCSX, which in this case would represent ball, ball, called strike, swinging strike, ball in play. I parsed these Retrosheet sequences and then computed the rates at which a pitcher moved to each state relative to the total number of pitches thrown from the previous state. Unfortunately, we don't have this play-by-play data going back to 1906, so we'll never know what Ed Walsh's championship state transition matrix looked like. So, let's go more contemporary. In 2024, across all pitchers league-wide, the state transition matrix looked like this:

This matrix tells us that in 2024 across the league a ball was thrown 37.4% of the time on the first pitch, a strike 51.1% of the time, and 11.5% of the time it was put into play by the batter. There are some other interesting takeaways to be found here. As one might expect, a 3-0 count is when a strike is thrown at the highest rate, and only 4.2% of pitches are put in play, undoubtedly because batters almost always take in this count. A full count has remarkably even probabilities of any outcome happening: walk, strikeout, foul ball, or a ball put into play. This is the more detailed representation of pitcher profiles we were looking for. With this approach, we can start to create maps of how the league, or specific pitchers, move through pitch counts.
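For reference, the parsing step that feeds a matrix like this can be sketched in a few lines. This is a simplified illustration rather than the parser used for this post: `parse_sequence` is a hypothetical name, and it only handles the codes B (ball), C (called strike), S (swinging strike), F (foul), and X (ball in play). Every other character in a Retrosheet pitch string, such as pickoff markers, foul tips, or hit batters, is skipped here and would need its own handling in a real parser.

```python
def parse_sequence(pitches):
    """Turn one plate appearance's pitch string into (from, to) count transitions."""
    balls, strikes = 0, 0
    transitions = []
    for code in pitches:
        src = f"{balls}-{strikes}"
        if code == "B":
            balls += 1
        elif code in ("C", "S"):
            strikes += 1
        elif code == "F":
            # A foul adds a strike unless the batter already has two.
            strikes = min(strikes + 1, 2)
        elif code == "X":
            transitions.append((src, "in play"))
            break
        else:
            continue  # codes outside this sketch's subset are ignored

        if balls == 4:
            dst = "walk"
        elif strikes == 3:
            dst = "strikeout"
        else:
            dst = f"{balls}-{strikes}"
        transitions.append((src, dst))
        if dst in ("walk", "strikeout"):
            break
    return transitions

print(parse_sequence("BBCSX"))
print(parse_sequence("CSFS"))
```

Note that a foul ball with two strikes produces a transition from a count back to itself (for example 0-2 to 0-2), which is how a foul shows up as its own outcome in two-strike rows. Concatenating the output for every plate appearance gives the list of transitions to tally and normalize.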

And, now that we have the league-wide state transition probabilities, we can see how individual pitchers compare to the average outcomes. We use the same process of parsing play-by-play data, this time filtering for specific pitchers. Here's what Tarik Skubal's 2024 state transition matrix looks like:

When we compare Skubal's state transition matrix to the league-wide matrix, we can see Skubal's colloquial typology reflected in the data. He strikes out a lot of hitters. I mean, he K's batters from an 0-2 count at a 10% higher rate than league average. That is elite. And his strikeout rates from any two-strike count are 5-10% higher than league average. And he accomplishes this while also walking batters at a rate well below league average. Skubal is a precise killer. If we had Ed Walsh's pitch-by-pitch data from 1906, I imagine his profile would look similar.
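The comparison itself is just a cell-by-cell subtraction. Below is a small sketch under stated assumptions: the league row uses the 2024 first-pitch figures quoted earlier in this post, while the `pitcher` row is a made-up placeholder (it is not Skubal's actual data), and `delta_matrix` is a hypothetical helper name.

```python
# League-wide 0-0 row, using the 2024 first-pitch figures quoted in this post.
league = {"0-0": {"1-0": 0.374, "0-1": 0.511, "in play": 0.115}}

# Placeholder pitcher row: invented numbers for illustration only.
pitcher = {"0-0": {"1-0": 0.330, "0-1": 0.560, "in play": 0.110}}

def delta_matrix(pitcher, league):
    """Pitcher rate minus league rate for every transition in either matrix."""
    deltas = {}
    for src in set(pitcher) | set(league):
        p_row = pitcher.get(src, {})
        l_row = league.get(src, {})
        deltas[src] = {
            dst: p_row.get(dst, 0.0) - l_row.get(dst, 0.0)
            for dst in set(p_row) | set(l_row)
        }
    return deltas

deltas = delta_matrix(pitcher, league)
for dst, d in sorted(deltas["0-0"].items()):
    print(f"0-0 -> {dst}: {d:+.3f}")
```

Because both rows sum to 1, the deltas in a row sum to roughly zero: any extra first-pitch strikes have to come out of balls or balls in play. The differences here are in absolute terms (fractions of pitches), not relative percentages.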

Visualizing the Information

Now, while we could stop here, tables aren't necessarily the best representation of this data. We can do better. Let's create a visualization that graphically depicts this "map" as a set of states and pathways between states. For this we'll need a bit of graph theory. All you need to know for the purpose of this article is that states will be represented as "nodes" and the pathways as "edges". It will be a "directed" graph because we can only move in one direction through a count. In other words, we can only add balls and strikes, or stay in the same count in the event of a foul ball in a two-strike count. We cannot remove balls or strikes. The nice thing is that since we already did the difficult work of parsing the play-by-play data and creating our state transition matrices, we can simply use these for the visualization. Some other nice features that I'd like to have in this visualization are as follows:

The size of nodes should be relative to the proportion of pitches thrown in that count vs the total
We should color-code edges based on the degree to which a pitcher created a "good" or "bad" outcome relative to league average. Good will be red, and bad will be blue.
There should be a feature for toggling between absolute rates and the delta between a pitcher's rates and league average

I used plotly for this. It ended up being a more flexible plotting library in this context compared to recharts (my usual go-to).
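The first two features in that list boil down to a couple of small calculations that happen before anything is drawn. The sketch below is library-agnostic and is not the plotting code for the visualization: the function names, pixel range, and the rule for what counts as "good" are my assumptions here. Specifically, it treats more strikes (or strikeouts) than league average, or fewer balls (or walks), as good, and leaves balls in play and two-strike fouls as neutral gray.

```python
from collections import Counter

def node_sizes(transitions, min_px=10.0, max_px=60.0):
    """Scale each count's node by its share of all pitches thrown."""
    thrown = Counter(src for src, _ in transitions)
    total = sum(thrown.values())
    return {s: min_px + (max_px - min_px) * n / total for s, n in thrown.items()}

def edge_kind(src, dst):
    """Classify an edge as adding a strike, adding a ball, or neutral."""
    if dst == "strikeout":
        return "strike"
    if dst == "walk":
        return "ball"
    if dst == "in play" or dst == src:
        return "neutral"  # assumption: no good/bad judgment for these edges
    _, s0 = map(int, src.split("-"))
    _, s1 = map(int, dst.split("-"))
    return "strike" if s1 > s0 else "ball"

def edge_color(src, dst, delta):
    """Red when the pitcher beats league average on this edge, blue when worse."""
    kind = edge_kind(src, dst)
    if kind == "neutral" or delta == 0:
        return "gray"
    good = delta > 0 if kind == "strike" else delta < 0
    return "red" if good else "blue"

# Toy transitions and deltas, invented for illustration.
sample = [("0-0", "0-1"), ("0-0", "1-0"), ("0-1", "0-2"), ("0-2", "strikeout")]
print(node_sizes(sample))
print(edge_color("0-2", "strikeout", 0.10))  # more strikeouts than league
print(edge_color("3-1", "walk", 0.03))       # more walks than league
```

The resulting sizes and colors can then be handed to whatever plotting library draws the nodes and edges.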

Result

Pitch State Visualization

A visualization of how different pitchers in 2024 navigated the count

2024 season data for all qualified pitchers. Data sourced from retrosheet event data

I love this visual because you can immediately identify both positive and negative trends that illuminate pitcher-specific pathways through counts. Two pitchers may have the same peripheral stats, but reach those numbers through completely different approaches, strengths, and weaknesses. Anyway, I had a lot of fun making this visualization and I encourage you to mess around with it yourself.

Thank you for reading!
