Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A, and may the fantasy gods show mercy on my predictions.
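For readers who like to see the procedure spelled out, here is a minimal sketch of the Group A / Group B split. The player names, the `td_rate` metric, the `ppg_before` field, and every number are made up purely for illustration; nothing here is real data or the exact process behind the column.

```python
from statistics import mean

def split_groups(players, metric, size):
    """Rank everyone by `metric`; the top `size` become Group A, the bottom `size` Group B."""
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:size], ranked[-size:]

def group_average(group, field):
    """Average a per-game field across a group."""
    return mean(p[field] for p in group)

# Hypothetical players: td_rate is the metric, ppg_before is points per game so far.
players = [
    {"name": "Player 1", "td_rate": 0.12, "ppg_before": 18.0},
    {"name": "Player 2", "td_rate": 0.10, "ppg_before": 16.0},
    {"name": "Player 3", "td_rate": 0.07, "ppg_before": 14.0},
    {"name": "Player 4", "td_rate": 0.05, "ppg_before": 13.0},
    {"name": "Player 5", "td_rate": 0.03, "ppg_before": 12.0},
    {"name": "Player 6", "td_rate": 0.02, "ppg_before": 11.0},
]

group_a, group_b = split_groups(players, "td_rate", size=2)

# Step one: confirm Group A has outscored Group B so far.
a_before = group_average(group_a, "ppg_before")
b_before = group_average(group_b, "ppg_before")
print(f"Group A: {a_before:.1f} ppg, Group B: {b_before:.1f} ppg")

# Step two (not shown): track both groups going forward and compare again.
```

The key design point is that the split is mechanical: once the metric is chosen, the ranking decides who lands in each group.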
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2019 and their final results, here's the list from 2018, and here's the list from 2017.
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about how the ability to convert yards into touchdowns was most certainly a skill, but it was a skill that operated within a fairly narrow and clearly-defined range, and any values outside of that range were probably just random noise and therefore due to regress. I predicted that high-yardage, low-touchdown receivers would outscore low-yardage, high-touchdown receivers going forward.
In Week 5, I talked about how historical patterns suggested we had just reached the informational tipping point, the time when performance to this point in the season carried as much predictive power as ADP. In general, I predicted that players whose early performance differed substantially from their ADP would tend to move toward a point between their early performance and their draft position, but no specific prediction was made.
In Week 6, I talked about simple ways to tell whether a statistic was especially likely to regress or not. No specific prediction was made.
In Week 7, I speculated that kickers were people, too, and lamented the fact that I'd never discussed them in this column before. To remedy that, I identified teams that were scoring "too many" field goals relative to touchdowns and "too many" touchdowns relative to field goals and predicted that scoring mix would regress and kickers from the latter teams would outperform kickers from the former going forward.
In Week 8, I noted that more-granular measures of performance tended to be more stable than less-granular measures and predicted that teams with a great point differential would win more games going forward than teams with an identical record, but substantially worse point differential.
In Week 9, I talked about the interesting role regression to the mean plays in dynasty, where the mere fact that a player is likely to regress sends signals that that player is probably quite good and worth rostering long-term, anyway. No specific prediction was made.
In Week 10, I explained why Group B's lead in these predictions tended to get smaller the longer each prediction ran and showed how a small edge over a huge sample could easily be more impressive than a huge edge over a small sample. No specific prediction was made.
In Week 11, I wrote that yards per pass attempt was an example of a statistic that was significantly less prone to regression, and for the first time I bet against it regressing.
| Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
|---|---|---|---|
| Yards per Carry | Group A had 3% more rushing yards per game | Group B has 36% more rushing yards per game | Success! |
| Yard to Touchdown Ratio | Group A averaged 2% more fantasy points per game | Group B averages 40% more fantasy points per game | Success! |
| TD to FG ratio | Group A averaged 20% more points per game | Group B averages 36% more points per game | Success! |
| Wins vs. Points | Both groups had an identical win% | Group B has a 4% higher win% | Failure |
| Yards per Attempt | Group B had 14% more yards per game | Group B has 34% more yards per game | 3 |
Group B entered the final week of our team wins prediction in good shape, but things went off the rails at the end. Other than four intra-group games (where it tautologically went 2-2), Group B went 1-3 in its remaining games, while Group A went 4-1. Worst of all, Group B was 0-3 in games against teams from Group A.
Had it won a single one of those games, it would have hit the 10% threshold needed to succeed. Was this a bad beat, then? No; had Group A won one more cross-group matchup, it would have finished with a better winning percentage than Group B entirely. Luck runs both ways.
Did the bias I identified last week (where games between two teams in the same group pulled each group's winning percentage closer together) make a difference? Yeah, but not enough to matter; removing games within the group only boosted Group B's edge to 5%, still well short of the 10% edge I needed to declare victory. Nope, this was just a good old-fashioned loss for the forces of regression. You can't win them all.
Our other prediction had a strong opening week to help ease the sting. At the time of the prediction, Group A was throwing significantly more passes, but Group B was averaging more yards per game thanks to a much higher yards-per-attempt average. Last week... Group A threw significantly more passes, but Group B averaged more yards per game thanks to a much higher yards-per-attempt average. Going forward, I'd expect more of the same.
Setting The Pace
Let's talk about "on pace" stats. For example: through eight games, Russell Wilson had thrown 28 touchdown passes, which left him on pace to throw 56 over a full season, which would have been a new NFL record.
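An "on pace" number is nothing more than a straight-line extrapolation. As a minimal sketch, assuming a 16-game season (the function name `on_pace` and its parameters are my own, chosen for illustration):

```python
def on_pace(total_so_far, games_played, season_games=16):
    """Extrapolate a season total by assuming the per-game rate so far never changes."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total_so_far / games_played * season_games

# Russell Wilson: 28 touchdown passes through eight games.
wilson_pace = on_pace(28, 8)
print(wilson_pace)  # 56.0
```

The whole calculation rests on one assumption, that the rate to date holds for the rest of the year, which is exactly the assumption regression to the mean tends to undermine.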
I love pace stats. They're fun. I love football because it tells a story, and I think record chases are a big part of that narrative fabric. I love the "will he or won't he" aspect, the suspense, the anticipation.
I love looking at young players and imagining that pace stats give us a sneak peek at what we'll be saying about them this offseason. Justin Jefferson is on pace for 1350 receiving yards. At 18.8 yards per reception! What will we say about him this offseason if he finishes with those kinds of numbers?
But as much as I love them because of the stories they tell, I also recognize that they're just that: stories we tell. Regular readers are probably unsurprised that Russell Wilson has fallen off of his record-setting pace. He's thrown just two touchdowns in the last two weeks and is now "only" on pace for 48 on the season. That's a wildly impressive total, but not a record.
In fact, most players who are on pace to set records do not wind up setting records, because regression to the mean is simply delayed, not denied. But an underrated corollary is that many players who wind up setting records were not on pace to set records. Indeed, their regression-busting performances came late in the year, closing out a season that didn't look quite so promising at the midpoint.
I wanted to run through a list of some of the most hallowed seasons in history, looking at both where the player finished the year and where he was on pace to finish after ten games.
The Passing Touchdown Record
- 1984 Dan Marino (48 passing touchdowns): Marino was fairly consistent through the year, but through ten games he was only on pace for 46 passing touchdowns.
- 2004 Peyton Manning (49 passing touchdowns): Manning was on pace for 56 touchdowns through ten games (and 60 touchdowns through eleven!). His falloff was partly due to regression, partly due to sitting more often in the 4th quarter down the stretch, and most especially due to the fact that he essentially played only 15 games, attempting just two passes in the Colts' 16th game as he rested for the playoffs.
- 2007 Tom Brady (50 passing touchdowns): Brady was actually on pace for 61 touchdowns through ten games, but the Patriots offense as a whole cooled down considerably over the second half and he finished with 50, instead.
- 2013 Peyton Manning (55 passing touchdowns): Through ten weeks, Manning was on pace for 54 touchdowns, one shy of his final total.
- 2018 Patrick Mahomes II (50 passing touchdowns): Mahomes didn't break the record, but he made a strong run and finished with the second-most touchdowns of all time. He was on pace for 50 touchdowns through ten weeks, exactly where he finished.
2000 Yard Rushing Seasons
- 1973 O.J. Simpson (2003 rushing yards): Simpson was "only" on pace for 1852 rushing yards through ten weeks, but exploded for 480 yards over his next three games (at 7.9 yards per carry!). He entered the final week needing 197 yards to hit 2000, and his team gave him 34 carries to get there.
- 1984 Eric Dickerson (2105 rushing yards): Through nine games he was on pace for 1712 rushing yards before rushing for 1044 yards in his next six games. That left him over 2,000 before even playing the last game of the year.
- 1997 Barry Sanders (2053 rushing yards): Through ten games he was on pace to finish with 1765 yards before closing the season strong with 950 yards in his last six games.
- 1998 Terrell Davis (2008 rushing yards): Davis is the only player on this list who started stronger than he finished. He was on pace for 2128 yards after ten games before finishing below 100 rushing yards in three of his final six games.
- 2003 Jamal Lewis (2066 rushing yards): Lewis was on pace for 1995 rushing yards through ten games, which makes his season seem more consistent than it was; he had a huge start and a huge end to the year, but a bit of a lull in the middle.
- 2009 Chris Johnson (2006 rushing yards): Johnson was a slow starter. He was held below 100 yards in four of his first five games, though by Game 10 he'd already made up all that lost ground and was on pace for 1987 yards.
- 2012 Adrian Peterson (2097 rushing yards): Peterson is the first guy I always think of when I'm talking about pace stats. Through six weeks, his best game was just 102 yards rushing, and at midseason he was widely considered the second-most valuable offensive player on his own team (after Percy Harvin, who was having a monster first half). Then Harvin got hurt and Peterson went supernova. By his 10th game, he was still only on pace for 1805 rushing yards. Peterson averaged 83 rushing yards per game in his first six games and 160 in his last ten.
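The pace figures in the list above can be run backwards to recover roughly how many yards each back actually had at that point, which makes the late surges easier to see. This is a rough sketch only: the helper name `total_from_pace` is mine, the results are approximate because the paces are rounded, and it relies on the 1973 season having been 14 games long rather than 16.

```python
def total_from_pace(pace, games_played, season_games=16):
    """Invert a pace stat: the implied running total after `games_played` games."""
    return pace * games_played / season_games

# 1973 O.J. Simpson: 14-game season, on pace for 1852 after ten games,
# then 480 yards over the next three games.
simpson_after_13 = total_from_pace(1852, 10, season_games=14) + 480
simpson_needed = 2000 - simpson_after_13

# 1984 Eric Dickerson: on pace for 1712 after nine games, then 1044 in the next six.
dickerson_after_15 = total_from_pace(1712, 9) + 1044

# 1997 Barry Sanders: on pace for 1765 after ten games, then 950 in the last six.
sanders_final = total_from_pace(1765, 10) + 950

# 2012 Adrian Peterson: 83 yards per game for six games, 160 per game for ten.
peterson_total = 6 * 83 + 10 * 160

print(round(simpson_needed))      # 197
print(round(dickerson_after_15))  # 2007
print(round(sanders_final))       # 2053
print(peterson_total)             # 2098
```

Each result lines up with the totals quoted in the list (Peterson's is off by a yard only because his per-game averages are rounded).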
The Receptions Record
- 1984 Art Monk (106 receptions): Today, there have been 117 seasons with 100 or more receptions. At the beginning of 1984, there had been zero. Through ten weeks, it looked like there would still be zero; Monk was on pace for 93 receptions and had been trending down. He averaged 8 receptions per game down the stretch to set the record.
- 1992 Sterling Sharpe (108 receptions): Sharpe had established himself as a good bet to break the record by Week 10, when he was already on pace for 109 receptions.
- 1993 Sterling Sharpe (112 receptions): One of the most consistent seasons in NFL history. 112 receptions works out to 7 receptions per game on average; there wasn't a single 6-week stretch of Sharpe's season where he didn't average at least 7 receptions. In 11 of his 16 games he had either six or seven receptions. (In the other five, he had 10, 10, 10, 5, and 4.) And, of course, through ten games he was on pace for exactly 112 receptions.
- 1994 Cris Carter (122 receptions): Carter had a lot more highs and lows than Sharpe the year prior. He had seven games of 9 or more receptions and six games of 5 or fewer receptions. But he was still on pace for 123 receptions through ten games.
- 1995 Herman Moore (123 receptions): 1995 featured a bevy of receivers posting the kind of numbers the league had never seen. Among the numerous records to fall was the receptions record once again, though Herman Moore didn't look like the most likely candidate to take it down after ten games, when he was "only" on pace for 114 receptions.
- 2002 Marvin Harrison (143 receptions): The receptions record fell four times in four seasons from 1992 to 1995 but then remained relatively unchallenged until 2002. There was little suspense in this one, as Harrison was on pace for 142 receptions by the Game 10 mark.
- 2009 Wes Welker (123 receptions): Welker never really had a shot at the record because he missed Weeks 2 and 3, but by the Patriots' 10th game Welker's 16-game pace would have worked out to a mind-boggling 158 receptions, and his 14-game pace (to account for the missed games) was 138.
- 2015 Julio Jones and Antonio Brown (136 receptions each): Through ten games, Jones was on pace for 142 receptions and Brown was only on pace for 126, but Jones fell off a little and Brown stepped up a lot and both finished at 136.
- 2019 Michael Thomas (149 receptions): This one is probably pretty fresh in everyone's minds, still. Through ten games, Thomas was on pace for 150 receptions and widely expected to set a new record.
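Two of the receptions figures above are simple rescalings, and a quick sketch shows how they fit together. The helper name `rescale_pace` is hypothetical; the inputs come straight from the list.

```python
def rescale_pace(pace, from_games, to_games):
    """Convert a pace quoted over one season length to a different number of games."""
    return pace / from_games * to_games

# 2009 Wes Welker: a 16-game pace of 158, restated over the 14 games he could play.
welker_14_game_pace = rescale_pace(158, 16, 14)

# 1993 Sterling Sharpe: 112 receptions spread over 16 games.
sharpe_per_game = 112 / 16

print(welker_14_game_pace)  # 138.25
print(sharpe_per_game)      # 7.0
```

The per-game rate is the same in both of Welker's numbers; only the number of games it is multiplied by changes.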
What Does It All Mean?
What should the takeaway be from this trip down memory lane? We could try to find patterns in the data. Passing touchdowns seemed more likely to fall off pace over the final six games, rushing yardage tended to increase over pace, while receptions tended to hold pretty steady. Is this a meaningful pattern? Perhaps. (I think a lot of it is probably selection bias, personally.)
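For anyone who wants to eyeball that pattern, here is a small sketch that compares each season's pace with its final total and averages the percentage change by category. The numbers are the ones listed above, with a few caveats: Dickerson's pace is through nine games rather than ten, Welker is left out because he missed games, and Harrison's final total is entered as 143. With so few seasons, all of them picked because they ended up historic, treat the output as a curiosity rather than evidence.

```python
from statistics import mean

# (pace at roughly the ten-game mark, final total) for each season.
SEASONS = {
    "passing touchdowns": [(46, 48), (56, 49), (61, 50), (54, 55), (50, 50)],
    "rushing yards": [(1852, 2003), (1712, 2105), (1765, 2053), (2128, 2008),
                      (1995, 2066), (1987, 2006), (1805, 2097)],
    "receptions": [(93, 106), (109, 108), (112, 112), (123, 122), (114, 123),
                   (142, 143), (142, 136), (126, 136), (150, 149)],
}

def mean_pct_change(pairs):
    """Average percentage gap between the final total and the earlier pace."""
    return mean((final - pace) / pace * 100 for pace, final in pairs)

summary = {stat: mean_pct_change(pairs) for stat, pairs in SEASONS.items()}

for stat, change in summary.items():
    print(f"{stat}: {change:+.1f}% versus pace")
```

On these samples the passing average comes out negative, the rushing average positive, and receptions land closer to zero, which matches the impression described above without making it any less likely to be selection bias.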
Rather than looking for patterns, I look at history like this as a reminder of the range of possibilities. Sometimes players who are blazing hot cool down a little. Sometimes they maintain their level of performance. Sometimes they even kick it up a notch. Sometimes the record is set by the guy who built up a huge buffer early in the year, while other times it's set by the player who exploded down the stretch.
Football is a pretty random sport, and fantasy football is more random, still. This column is dedicated to the idea that past performance is no guarantee of future performance. At best, it is a weak indicator.
As we close out the regular season, we know that championships will be won by the players who get scorching hot in Weeks 14, 15, and 16. And despite our best efforts, we don't have any way of knowing which players those will be. Pace stats are a lot of fun and I love the stories they tell, but I don't always love the illusion of certainty they provide, the false confidence that the last part of the season will closely resemble the first part.
That's not how randomness works. Which is a good thing for this column, because otherwise, we'd have nothing to talk about.