What Are Baseball Data Analytics (And Why Should Fans Care?)

Last Updated on December 23, 2021 by Lil Ginge

What are baseball data analytics and why should they matter to baseball fans? Baseball data analytics – also called “sabermetrics” – are an attempt to analyze the true impact of a player’s performance on the baseball field in all aspects of the game – hitting, pitching, base running, and defense. In essence, they are the empirical or scientific analysis of baseball events.

The idea behind data analytics is that traditional baseball statistics such as batting average and pitcher win-loss records are highly flawed ways of evaluating a player. They tell you very little about the actual quality of a player’s performance.

In this article, we’ll take a brief look at the history of baseball data analytics, the key ideas behind sabermetric stats, and some of the most important statistics themselves. We’ll also examine why some of the older stats are so flawed.

History of Baseball Data Analytics

To some degree, baseball analysts have always tried to statistically capture what is actually happening on the field in the game of baseball to be able to measure an individual player’s performance. This has led to the creation of things like the box score in the 1800s and traditional baseball stats like batting average and pitcher wins.

However, some savvy baseball minds in the mid-twentieth century began to realize that the traditional baseball stats did a poor job of evaluating an individual player’s contribution. So they began to seek out new ways to evaluate how a player performed through statistical analysis.

In 1947, the Brooklyn Dodgers – under the guidance of owner Walter O’Malley and President and GM Branch Rickey – hired a statistician named Allan Roth. Roth developed advanced stats that helped to improve the team’s decision-making apparatus. This included how the team went about making trades. Roth would work year-round, both during the season and off-season, to statistically advance the team’s goals.

In the mid-twentieth century, Earnshaw Cook began doing advanced baseball analysis and wrote a book called Percentage Baseball, published in 1964. In the beginning, most teams and baseball pros dismissed the book as baseball quackery.

But along came Bill James and things began to change. SABR – the Society for American Baseball Research – had been founded in 1971, and James later coined the term “sabermetrics” from its acronym. James started publishing his annual Baseball Abstracts in 1977, and by the later 1970s, his baseball ideas began to spread (albeit slowly).

Data Analytics and Moneyball

The concept of baseball data analytics finally went mainstream and blew up with the publication of the now-classic book Moneyball: The Art of Winning an Unfair Game by Michael Lewis. The book tells the story of how the 2002 Oakland Athletics under Billy Beane and Paul DePodesta used the advanced stats and baseball concepts created by Bill James and his sabermetrics systems. Beane and DePodesta used these stats to find ways to replace costly superstars who left for free agency with inexpensive players highly undervalued by the market.

The result of Beane and DePodesta’s experiment was a 103-win season and a playoff berth, all while carrying one of the smallest payrolls in baseball. This changed the way sabermetrics were viewed in baseball, and Moneyball even went on to become a highly successful and celebrated film starring Brad Pitt as Beane and Jonah Hill as a character based on DePodesta. You know you’ve made it big time when you are played by Brad Pitt.

These days, every team in baseball has a data analytics department, and the best teams tend to rely on them heavily. The mainstream public still looks at traditional stats like batting average, RBIs, pitcher wins, and errors, but sabermetric stats have also entered the popular lexicon for both sportswriters and fans. It is just as common to see fans debating Wins Above Replacement (WAR) and Fielding Independent Pitching (FIP) on the internet as batting average and earned run average.

Many people have now come to see the traditional baseball statistics as highly flawed for a wide variety of reasons. One great example is that batting average counts all hits – including singles and home runs – as equally valuable when that’s patently untrue, and excludes events like walks and hit-by-pitches altogether.
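To make that concrete, here is a minimal Python sketch using two made-up stat lines (hypothetical hitters, not real players). Both hit exactly .300, but one collects only singles while the other adds doubles, home runs, and a lot of walks. The batting average, on-base percentage, and slugging percentage formulas are the standard ones; the class and field names are just for illustration.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical season totals for one hitter (made-up numbers)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    def batting_average(self) -> float:
        # Every hit counts the same; walks and hit-by-pitches are ignored.
        return self.hits / self.at_bats

    def on_base_percentage(self) -> float:
        # Counts every way of reaching base, but still treats them all equally.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        return reached / chances

    def slugging_percentage(self) -> float:
        # Total bases per at-bat: rewards extra-base hits, ignores walks.
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats


slap_hitter = BattingLine(at_bats=500, singles=150, doubles=0,
                          triples=0, home_runs=0, walks=20)
slugger = BattingLine(at_bats=500, singles=90, doubles=30,
                      triples=0, home_runs=30, walks=80)

for name, line in [("slap hitter", slap_hitter), ("slugger", slugger)]:
    print(f"{name}: AVG {line.batting_average():.3f}, "
          f"OBP {line.on_base_percentage():.3f}, "
          f"SLG {line.slugging_percentage():.3f}")
```

Batting average alone cannot tell these two hitters apart, even though the second one reaches base far more often and hits for far more power.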

Similarly, RBIs are driven largely by the performance of the hitters around a player and less by how well that player is actually hitting. You’ll have a lot more RBI opportunities if you hit behind Vladimir Guerrero Jr. than if you hit behind a far lesser player. The public and baseball professionals have both become far more aware of the limitations of such stats and have adjusted how they evaluate players accordingly.

Some Important Baseball Data Analytics 

There are many different baseball analytics that you should know if you want a better understanding of the game. Some of the most important ones include the following:

  • wOBA (weighted on-base average)
  • wRC+ (weighted runs created plus)
  • FIP (Fielding Independent Pitching)
  • Exit Velocity
  • WAR (Wins Above Replacement)

Let’s take a look at each of these important baseball statistics:

wOBA (Weighted On-Base Average) – The benefit of weighted on-base average is that each way of reaching base is weighted by how much it is actually worth: extra-base hits get more credit than singles, and walks and hit-by-pitches get slightly less. wOBA does a better job of capturing what you might traditionally try to understand by looking at a combination of batting average, on-base percentage, and slugging percentage. Because the stat is scaled like traditional on-base percentage, a .320 wOBA is about average, whereas .400 is excellent and .290 is bad.
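As a rough sketch of how that weighting works, the Python below computes wOBA as a weighted sum of events divided by a hitter's chances. The weights in `ILLUSTRATIVE_WEIGHTS` are assumptions chosen to be in the general range of published values; the real weights are recalculated every season, so look up the current ones before using this for anything serious. The function name and the sample numbers are hypothetical.

```python
# Assumed, illustrative linear weights. Real weights change every season.
ILLUSTRATIVE_WEIGHTS = {
    "walk": 0.69,
    "hit_by_pitch": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def woba(walks, hit_by_pitch, singles, doubles, triples, home_runs,
         at_bats, sac_flies=0, intentional_walks=0,
         weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average: each event is credited by its weight."""
    # Intentional walks are conventionally excluded from wOBA.
    unintentional_walks = walks - intentional_walks
    numerator = (weights["walk"] * unintentional_walks
                 + weights["hit_by_pitch"] * hit_by_pitch
                 + weights["single"] * singles
                 + weights["double"] * doubles
                 + weights["triple"] * triples
                 + weights["home_run"] * home_runs)
    denominator = at_bats + unintentional_walks + sac_flies + hit_by_pitch
    if denominator <= 0:
        raise ValueError("need at least one qualifying plate appearance")
    return numerator / denominator


# Two made-up hitters who both bat .300 (150 hits in 500 at-bats).
singles_only = woba(walks=20, hit_by_pitch=0, singles=150, doubles=0,
                    triples=0, home_runs=0, at_bats=500)
power_and_walks = woba(walks=80, hit_by_pitch=0, singles=90, doubles=30,
                       triples=0, home_runs=30, at_bats=500)
print(f"singles only:    {singles_only:.3f}")
print(f"power and walks: {power_and_walks:.3f}")
```

With these assumed weights, the singles-only hitter lands below the .290 mark and the power-and-walks hitter lands around .400, even though their batting averages are identical.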

wRC+ (Weighted Runs Created Plus) – wRC+ builds on wOBA, adjusting a hitter’s run creation for ballpark and league context (that’s what the + signals). It is scaled so that 100 equals a league-average performance, and each point above or below 100 is one percent better or worse than average. The higher the number, the better the offensive performance. Because the metric is also adjusted for era, you can compare players from one hundred years ago to players today. The adjustments add some uncertainty, since park effects can only be estimated, but wRC+ still gives you a more context-neutral view of how a player has actually performed than raw wOBA does.

Exit Velocity – Exit velocity measures the speed at which the ball comes off the bat. It essentially measures how hard the baseball was hit, and hitting the baseball hard is generally a good thing. An exit velocity of 95 miles per hour or above is considered a hard-hit ball. Exit velocity can also be combined with launch angle, the vertical angle at which the ball leaves the bat, which indicates whether a ball is a groundball, line drive, or fly ball.
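Here is a small Python sketch of how those two measurements might be used together. The 95 mph hard-hit threshold comes from the description above. The launch-angle cutoffs in `batted_ball_type` are an assumption based on commonly cited ranges (below 10 degrees for ground balls, 10 to 25 for line drives, 25 to 50 for fly balls, above 50 for pop-ups), and the function names and sample readings are made up.

```python
HARD_HIT_MPH = 95.0  # hard-hit threshold quoted in the article


def hard_hit_rate(exit_velocities_mph):
    """Share of batted balls hit at or above the hard-hit threshold."""
    if not exit_velocities_mph:
        raise ValueError("need at least one batted ball")
    hard_hit = sum(1 for mph in exit_velocities_mph if mph >= HARD_HIT_MPH)
    return hard_hit / len(exit_velocities_mph)


def batted_ball_type(launch_angle_deg):
    """Bucket a batted ball by launch angle (assumed, approximate cutoffs)."""
    if launch_angle_deg < 10:
        return "ground ball"
    if launch_angle_deg < 25:
        return "line drive"
    if launch_angle_deg <= 50:
        return "fly ball"
    return "pop-up"


# Made-up (exit velocity in mph, launch angle in degrees) readings.
batted_balls = [(88.4, -3.0), (101.2, 14.5), (95.0, 28.0), (76.9, 62.0),
                (104.7, 31.0), (92.3, 5.5), (98.1, 18.0), (83.5, 40.0)]

velocities = [mph for mph, _ in batted_balls]
print(f"hard-hit rate: {hard_hit_rate(velocities):.1%}")
for mph, angle in batted_balls:
    print(f"{mph:5.1f} mph at {angle:5.1f} degrees: {batted_ball_type(angle)}")
```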

FIP (Fielding Independent Pitching) – FIP is an alternative to earned run average (ERA) that strips out the defense’s contributions. In other words, only the outcomes a pitcher most directly controls as an individual – strikeouts, walks, hit-by-pitches, and home runs – are included in the metric. The stat assumes that a pitcher will get league-average results on balls put into play, regardless of luck or the defense behind them. The metric is scaled to ERA, so a FIP of about 4.20 is roughly league average and a 3.00 FIP is very good.
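The commonly published FIP formula can be sketched in a few lines of Python. It penalizes home runs most heavily, penalizes walks (hit-by-pitches are counted alongside them), rewards strikeouts, and then adds a constant that puts the result on the ERA scale. The constant changes from season to season; the 3.10 used here is an assumed, typical value, and the pitcher's numbers are made up.

```python
ASSUMED_FIP_CONSTANT = 3.10  # varies by season; this is only a typical value


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched,
        constant=ASSUMED_FIP_CONSTANT):
    """Fielding Independent Pitching from the outcomes a pitcher controls."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    raw = (13 * home_runs
           + 3 * (walks + hit_by_pitch)
           - 2 * strikeouts) / innings_pitched
    return raw + constant


# Made-up season: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
example = fip(home_runs=20, walks=50, hit_by_pitch=5,
              strikeouts=200, innings_pitched=200.0)
print(f"FIP: {example:.2f}")
```

Note that innings should be passed as true fractions: box scores write one-third of an inning as ".1", so 200.1 innings in a box score means 200 and one-third, not 200.1.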

WAR (Wins Above Replacement) – WAR is an attempt to measure an individual’s entire contribution with just one stat and can be applied to both position players and pitchers. WAR is an estimation and only gives you a rough idea of a player’s total value. It should be combined with many other stats for a fuller picture of player performance.

Because WAR is meant to be an estimation, there are disputes about the best way to calculate it. Two versions of WAR have become industry standard: FanGraphs WAR (fWAR) and Baseball-Reference WAR (bWAR). They can differ quite a bit for the same player, so it is best to look at both together and in context to better understand a player’s total performance.

Final Thoughts on Baseball Data Analytics

In the contemporary era of baseball, knowing advanced data analytics has really become essential in better understanding what’s actually happening on the baseball field. You can enjoy the game without them, but your assessments of what’s actually happening on the field and why a team is winning or losing may be pretty off.

Because the baseball profession now acknowledges this, data analytics are going to become increasingly important to front offices across the sport, and there will likely be an ever-growing stream of baseball analytics jobs created in the future.

Finally, it might seem like dry math, but having a deeper understanding of the events of a game of baseball truly makes it more interesting and insightful. It’s fun to know what’s actually happening on the field in a way that goes beyond the old calling of balls and strikes.

A note on reading WAR: 1 WAR means that a player was worth one win more than a replacement-level player, such as your average minor league call-up, so this is a positive contribution. 3 WAR is a solid full-time player, while 5 WAR is generally seen as all-star calibre and 7 is MVP-level. Mike Trout is his own thing.
