Analytics Blog

Dunn Solutions Data Experts Predict Royals Win World Series


Dongyang Li
4 Months Ago

How often do you get to mix work with fun?  Well, if you work at Dunn Solutions with SAP products, all the time.  But when we got the opportunity to enter the “Early Bird Baseball Challenge,” we got a chance to take fun to the next level: predicting who was going to win the 2014 World Series!

Obviously, lots of people try to predict who is going to win, and for many reasons (just go to Las Vegas and you will see).  Our prediction is purely for fun and to show off some cool SAP technology: SAP Predictive Analysis.  We decided to approach this in a different way, so I was elected to do the predictions.  “How is that different?”  Well, I didn’t know anything about baseball!  But I do now, and I will be watching the World Series to see how good we are at predicting baseball outcomes.

If I look at this from the American League side, I nailed all the playoff winners.  From the National League side, San Francisco messed up my predictions… who would think that the weakest wild card team would end up in the World Series?  Regardless, the Royals are too strong (based on my power index) to be taken down by a Cinderella wild card team.  (See, I have learned a little about baseball.)

So, how did we do it?  We first gathered data for all the playoff teams from the last five years, then calculated the average of each team’s batting and pitching statistics.  It’s important to note that we separated a team’s offensive power from its primary defensive power (pitching).  From these we created our power index, a weighted combination of the two facets.
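
For anyone who wants to play with the idea outside of SAP Predictive Analysis, here is a minimal Python sketch of a power index as a weighted combination of an offensive score and a pitching score.  It is an illustration only: the function names, the default weight of 0.5, and the assumption that both scores sit on the same scale (higher is better) are hypothetical, since the actual weights are not given here.

```python
from statistics import mean


def season_average(yearly_values):
    """Average one statistic (e.g. a batting stat) over several seasons."""
    return mean(yearly_values)


def power_index(offense_score, pitching_score, offense_weight=0.5):
    """Weighted combination of an offensive score and a pitching score.

    offense_weight is a hypothetical value chosen for this sketch.
    Both scores are assumed to be on the same scale, higher meaning better.
    """
    if not 0.0 <= offense_weight <= 1.0:
        raise ValueError("offense_weight must be between 0 and 1")
    return offense_weight * offense_score + (1.0 - offense_weight) * pitching_score
```

Calling `power_index(offense, pitching, offense_weight=0.25)` would lean the index toward pitching; in practice the weights would come from the regression step rather than being picked by hand.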

We split the data by league and trained the “Multi-Regression” algorithm in SAP Predictive Analysis on each of the two datasets separately, to determine which statistics influenced the percentage of wins and to what extent.  The regression function generated by the trained model for each league was then applied to this year's data to determine the strength of each team.
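
To make the per-league regression step concrete, here is a rough sketch in plain Python.  It swaps in ordinary least squares solved through the normal equations, which may differ from what the “Multi-Regression” algorithm does internally, and every name in it (`fit_linear_regression`, `fit_per_league`, the `(league, feature_row, win_pct)` sample layout) is an assumption made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the statistics are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_linear_regression(features, targets):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_p]."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, feature_row):
    """Apply a fitted regression function to one team's statistics."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], feature_row))


def fit_per_league(samples):
    """Fit one model per league from (league, feature_row, win_pct) samples."""
    by_league = {}
    for league, row, win_pct in samples:
        xs, ys = by_league.setdefault(league, ([], []))
        xs.append(row)
        ys.append(win_pct)
    return {league: fit_linear_regression(xs, ys) for league, (xs, ys) in by_league.items()}
```

The size and sign of each fitted coefficient indicate how strongly, and in which direction, a statistic moves the predicted percentage of wins.  Passing this year's averages for a team to `predict` with its league's coefficients yields that team's strength score.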

We predicted the outcome of each matchup based on the resulting team-strength metric, and successfully picked 3 of the 4 teams that made it to the League Championships.  According to our prediction, the Kansas City Royals and the St. Louis Cardinals would have been in the World Series.  Moreover, to determine the final champion, we built another model fitted to both leagues, which predicts that the Kansas City Royals take the trophy!
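
Picking matchup winners from a strength score can be sketched in a few lines of Python.  This is a simplified illustration: the function names are hypothetical, the real playoff format is not modeled, and it assumes a bracket whose size is a power of two with the higher-strength team always advancing.

```python
def pick_winner(team_a, team_b, strength):
    """Return the team with the higher predicted strength.

    Ties go to team_a; that tie-break is an arbitrary choice for this sketch.
    """
    return team_a if strength[team_a] >= strength[team_b] else team_b


def play_round(teams, strength):
    """Pair adjacent teams (0 vs 1, 2 vs 3, ...) and return the predicted winners."""
    if len(teams) % 2 != 0:
        raise ValueError("a round needs an even number of teams")
    return [pick_winner(teams[i], teams[i + 1], strength)
            for i in range(0, len(teams), 2)]


def predict_bracket(teams, strength):
    """Repeat rounds until one team remains (bracket size must be a power of two)."""
    if not teams:
        raise ValueError("at least one team is required")
    while len(teams) > 1:
        teams = play_round(teams, strength)
    return teams[0]
```

Here `strength` is a dictionary mapping each team name to its score, for example the output of the per-league regression for the early rounds and of the combined model for the final.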

Turns out the Cardinals choked, and the San Francisco Giants will face off against the Royals.  The Giants did much better than our analysis predicted.  Why is that?  Our analysis shows that they are a very well-rounded team, as opposed to teams like the Cardinals, who have great batting stats but weaker pitching stats.  A temporary slump in one of these areas would therefore have less of an overall effect on the Giants, which may be why they outperformed our prediction.

We still predict the Royals to win because they are also a well-rounded team and they have great pitching stats.  Besides, my boss tells me I should be an American League girl and so the Royals it is… Go Royals.