Best Predictor of Wins? A (simple) Statistical Analysis

    NFL people have always said that "offense wins games, but defense wins championships", but is that really the case? I took a look at the numbers from last season (regular season only) to see which statistics best predict a team's total number of wins. My guess was that the best three would be offensive yards per game (YPG), yards per game allowed, and turnover margin. I also looked at YPG differential (YPG gained minus YPG allowed) to control for different styles of play. Looking at the plots, it appears that all four are good predictors.

    I then ran a regression analysis on these data (basically, I'm just fitting the expected number of wins as a function of offensive YPG, defensive YPG, and turnover margin). What I found was that offense does indeed win games, and turnover margin is HUGE. Here's part of the summary (basically, a smaller p-value means stronger evidence that the predictor is really related to wins):

    Predictor    Estimate     p-value
    Defyds       -0.014258    0.279215
    Offyds        0.042564    0.000201
    TOmargin      0.114381    0.021892

    In statistics, a p-value below 0.05 is the usual cutoff for calling a predictor statistically significant. By that criterion, offensive YPG and turnover margin are significant predictors of wins, while YPG allowed is not (its p-value is about 0.28). Given the current state of the NFL, with the emergence of big-time passing offenses, this shouldn't be that surprising. One other interesting thing to note is the estimate of 0.114 for turnover margin. Essentially, this means that for any given YPG gained and allowed, a one-turnover increase in TO margin amounts to an expected 0.114 additional wins. Put another way, a change of about 9 turnovers either way in a team's season TO margin is worth about one win!
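    For anyone who wants to check the arithmetic, here is a small Python sketch. The estimates and p-values are copied from the summary above; the variable names (`summary`, `ALPHA`, `units_per_win`) are just ones I picked for illustration. It flags the predictors that clear the 0.05 cutoff and converts each estimate into "how many units of this stat is one expected win worth", holding the other predictors fixed.

```python
# Estimates and p-values copied from the regression summary above.
summary = {
    "Defyds":   {"estimate": -0.014258, "p_value": 0.279215},
    "Offyds":   {"estimate": 0.042564, "p_value": 0.000201},
    "TOmargin": {"estimate": 0.114381, "p_value": 0.021892},
}

ALPHA = 0.05  # conventional significance cutoff

# Predictors whose p-value falls below the cutoff.
significant = [name for name, row in summary.items() if row["p_value"] < ALPHA]

# Units of each predictor that correspond to one expected win,
# holding the other predictors fixed (simply 1 / estimate).
units_per_win = {name: 1.0 / row["estimate"] for name, row in summary.items()}

print(significant)                           # ['Offyds', 'TOmargin']
print(round(units_per_win["TOmargin"], 1))   # 8.7 turnovers per win
print(round(units_per_win["Offyds"], 1))     # 23.5 offensive YPG per win
```

    The 8.7 is where the "about 9 turnovers per win" figure comes from.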

    I did one more analysis. This time, instead of using offensive and defensive YPG as separate predictors, I used YPG differential. This lets me control for different styles of play (e.g. the Jets win games a lot differently than the Colts or Saints) as well as for different field and weather conditions, because YPG differential only considers the difference in yardage totals between the two teams in each game. I fit a model to last year's NFL win totals based on YPG differential and TO margin, with the following results:

    Predictor    Estimate     p-value
    ypgdiff       0.031407    0.000153
    TOmargin      0.107999    0.048150

    Looks like YPG differential is a very strong predictor of win totals! For any given season TO margin, we expect roughly one extra win for every 32-yard increase in YPG differential, and for a given YPG differential, a 9-turnover change in TO margin is again worth about one extra win. So there are the numbers: if you can outgain your opponent and win the turnover battle, you have a great chance of winning (but you already knew that!).
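    To make that concrete, here is a rough Python sketch that plugs changes in YPG differential and TO margin into the two estimates from the table above. The function name `expected_win_change` is hypothetical (it's not from R or any library), and it only gives the change in expected wins, since the intercept isn't shown in the summary.

```python
# Estimates copied from the second model's summary above.
YPGDIFF_COEF = 0.031407    # expected wins per 1-yard change in YPG differential
TOMARGIN_COEF = 0.107999   # expected wins per 1-turnover change in season TO margin


def expected_win_change(delta_ypgdiff, delta_to_margin):
    """Change in expected season wins implied by the fitted estimates."""
    return YPGDIFF_COEF * delta_ypgdiff + TOMARGIN_COEF * delta_to_margin


print(round(expected_win_change(32, 0), 2))   # 1.01 -> 32 yards of differential is about one win
print(round(expected_win_change(0, 9), 2))    # 0.97 -> 9 turnovers is about one win
print(round(expected_win_change(20, 5), 2))   # 1.17 -> a bit of both
```

    Keep in mind these are averages from one season's fit, so treat the outputs as ballpark figures.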

    Finally, a note about this analysis: do not be scared, it's really not that complicated. I did it all in about a half hour on my home computer, with some Excel and a free software package called R. I hope I was able to make it as simple as possible, but if you're confused by something just reply and I'll try to explain it further. And as always, remember correlation does not imply causation!
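    If you'd like to see what "fitting a regression" boils down to, here is a minimal least-squares sketch in plain Python. To be clear, this is not the code or the data used above: the eight "teams" and the `TRUE_COEFS` values are made up so the fit can be checked against a known answer, and `fit_ols` is a hypothetical helper name. It also only produces the estimates; the p-values in the tables come from a statistics package such as R.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]   # prepend a column of 1s for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# MADE-UP data: (offensive YPG, YPG allowed, TO margin) for eight imaginary teams.
teams = [
    (400, 310, 10), (380, 350, -3), (350, 330, 5), (340, 300, 8),
    (320, 360, -9), (300, 340, -6), (365, 325, 2), (310, 315, -1),
]

# MADE-UP "true" coefficients: intercept, offense, defense, turnover margin.
TRUE_COEFS = [-1.0, 0.04, -0.015, 0.11]

# Wins generated exactly from those coefficients (no noise), so the fit should recover them.
wins = [TRUE_COEFS[0] + sum(c * v for c, v in zip(TRUE_COEFS[1:], t)) for t in teams]

coefs = fit_ols(teams, wins)
print([round(c, 4) for c in coefs])
```

    With real win totals the points won't fall exactly on a plane, so the estimates come with uncertainty, which is what the p-values are measuring.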

    This article was originally published in forum thread: Best Predictor of Wins? A (simple) Statistical Analysis, started by cobber66.

    Comments (1)
    1. BisonTribe
      Bravo on actually using statistical analysis to try and draw some conclusions. A much better approach than pure and baseless prognostication in my opinion.

      I think you'd have to run a much larger data set to get reliable results, as one season really is insufficient. I think you'd have to go back to the season after they began re-emphasizing the "chuck rule" to get a data set that'd be representative of trends in football today. I also think it might be worthwhile to consider a much wider set of variables (years of experience for different position groups, average salaries, play-calling statistics, etc.).

      If you really enjoy doing this kind of analysis, I'd recommend getting Minitab, which makes it a little easier to crunch regressions than Excel, and it's pretty easy to import data. If you're interested in pursuing this type of analysis a little further, let me know.