Friday, March 14, 2014

2014 MLB Preview based on Projected Statistics

In response to the feedback on the MLS projected data, I used the same model to project a sport more closely connected with statistics and analytics: baseball, and the upcoming 2014 Major League season.

The basis of the work is taken from Bill James's projection work, where the data reflects each player's previous results with the most emphasis on the most recent season. The analysis could be taken further; one of the biggest issues with this model is that it doesn't take strength of schedule into account.
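To give a feel for what "most emphasis on the most recent season" can look like, here is a minimal sketch of a recency-weighted rate. It is not Bill James's actual formula; the function name `weighted_rate` and the 3/2/1 weights are assumptions picked purely for illustration.

```python
def weighted_rate(seasons, weights=(3, 2, 1)):
    """Recency-weighted rate stat (for example, batting average).

    seasons: list of (successes, opportunities) tuples, most recent first,
             e.g. (hits, at_bats) for each of the last three seasons.
    weights: hypothetical weights, heaviest on the most recent season.
    """
    numerator = sum(w * s for w, (s, _) in zip(weights, seasons))
    denominator = sum(w * n for w, (_, n) in zip(weights, seasons))
    # Guard against a player with no recorded opportunities.
    return numerator / denominator if denominator else 0.0


# Made-up player: improving over three seasons, most recent season first.
last_three = [(150, 500), (120, 480), (100, 450)]
projected_avg = weighted_rate(last_three)
```

Because the latest season counts three times as much as the oldest one, an improving player projects higher than his plain three-year average would suggest.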

The biggest advantage of baseball is the availability of data. For the soccer work I have to tabulate the information myself, which in many ways restricts the depth of the data. A great amount of baseball data is available, and I got my data for this project from  

There is also a lot of comparable work out there, especially related to fantasy baseball games. One issue I have with that work, which I tried to fix, is that the accumulated projected data for the players on a team doesn't reflect how the team should perform. So I made my projections fit within the expected innings of each team, and made the projected runs scored and runs allowed across all teams equal. That also let me project an expected record for each team based on Bill James's Pythagorean W-L.
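As a rough sketch of the last two steps, the code below balances league-wide runs and then turns each team's runs scored and allowed into an expected record. The function names, the sample teams and their run totals are made up for illustration, and the exponent of 2 is the classic form of the Pythagorean formula, which may not be the exact variant used for these projections.

```python
def balance_runs(teams):
    """Scale runs allowed so league-wide runs scored equals runs allowed.

    teams: dict mapping team name -> (runs_scored, runs_allowed).
    This is one simple way to balance the totals, assumed for illustration.
    """
    total_scored = sum(rs for rs, _ in teams.values())
    total_allowed = sum(ra for _, ra in teams.values())
    factor = total_scored / total_allowed
    return {name: (rs, ra * factor) for name, (rs, ra) in teams.items()}


def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected (wins, losses) from the Pythagorean winning percentage."""
    rs_pow = runs_scored ** exponent
    ra_pow = runs_allowed ** exponent
    win_pct = rs_pow / (rs_pow + ra_pow)
    wins = round(win_pct * games)
    return wins, games - wins


# Hypothetical two-team league, for demonstration only.
league = balance_runs({"Team A": (700, 650), "Team B": (650, 720)})
records = {name: pythagorean_record(rs, ra) for name, (rs, ra) in league.items()}
```

A team that scores exactly as many runs as it allows comes out at 81-81 over 162 games, which is a quick sanity check on any set of projected totals.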

Enjoy the Work.


  1. Has this been backtested? If so, what is the expected accuracy? Or do injuries make this moot? Judging by the predicted games played for each player, it appears that injuries are baked into the numbers on some level.

  2. The data is true to the numbers. It is a historical representation of what the player has done in the past, so, for example, if a player has been injury prone in his career, it will predict a smaller percentage of games played. I did not manipulate the data in any way, except in the case of rookies who are going to get significant playing time, where the stats are a reflection of the player's minor league data, taking into account the jump in quality.

    I have worked in a number of sports-related fields, including the betting industry, where numbers like these are developed when creating over/under prop bets. Twenty years ago I also worked on a video game where a model like this was developed to create simulated play. If you used these stats and simulated the results, you should end up with the same standings I predicted for each team, whereas traditional fantasy stat predictions would not.