
Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes, I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.
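For readers who like to see the mechanics, the Group A / Group B procedure described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the group size, and every number in the `players` list are made up for the example and are not real player data.

```python
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric` (high to low); return (group_a, group_b).

    Group A is the top `group_size` players, Group B the bottom `group_size`.
    No hand-picking: the metric alone decides who lands where.
    """
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

def average(group, field):
    """Average of `field` across the players in a group."""
    return mean(p[field] for p in group)

# Hypothetical, invented numbers purely to show the shape of the check.
players = [
    {"name": "Player 1", "yards_per_target": 11.0, "points_before": 16.0},
    {"name": "Player 2", "yards_per_target": 10.0, "points_before": 14.0},
    {"name": "Player 3", "yards_per_target": 8.0, "points_before": 12.0},
    {"name": "Player 4", "yards_per_target": 7.5, "points_before": 11.0},
    {"name": "Player 5", "yards_per_target": 6.0, "points_before": 10.0},
    {"name": "Player 6", "yards_per_target": 5.5, "points_before": 9.0},
]

group_a, group_b = split_groups(players, "yards_per_target", group_size=2)

# Step one of any prediction: confirm Group A really has outscored Group B.
a_leads_so_far = average(group_a, "points_before") > average(group_b, "points_before")
print(a_leads_so_far)
```

The same `average` helper could be rerun on later weeks to track whether Group B has pulled ahead, which is the part of the process that keeps the predictions accountable.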
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.
The Scorecard
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back), and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
In Week 5, I talked about how preseason expectations still held as much predictive power as performance through four weeks. No specific prediction was made.
In Week 6, I talked about why quarterbacks tended to regress less than other positions but nevertheless predicted that Patrick Mahomes II would somehow manage to get even better and score ten touchdowns over the next four weeks.
In Week 7, I talked about why watching the game and forming opinions about players makes it harder to trust the cold hard numbers when the time comes to put our chips on the table. (I did not recommend against watching football; football is wonderful and should be enjoyed to its fullest.)
In Week 8, I discussed how yard-to-touchdown ratios can be applied to tight ends but the players most likely to regress positively were already the top performers at the position. I made a novel prediction to try to overcome this quandary.
In Week 9, I discussed several of the challenges in predicting regression for wide receiver "efficiency" stats such as yards per target. No specific prediction was made.
In Week 10, I proposed a "leaderboard test" to quickly tell whether a statistic was noisy (and more prone to regression) or stable (and less prone to regression). I illustrated this test in action and made another prediction that yards per carry would regress.
In Week 11, I mentioned that many unexpected things were at the mercy of regression to the mean, highlighting how the average age of players at a given position tends to regress over time as incoming talent ebbs and flows.
In Week 12, I predicted that because players regress, and units are made up of players, units should regress, too. I identified the top five offenses, bottom five offenses, top five defenses, and bottom five defenses, and predicted that after four weeks those twenty units would collectively be less "extreme" (defined as closer to league average). Because offense tends to be more stable than defense, I added a bonus prediction that the defenses would regress more than the offenses.
In Week 13, I delved into how interceptions were the only quarterback stat that is mostly noise and predicted that the most interception-prone quarterbacks in the league (yes, including Jameis Winston) would start throwing fewer interceptions than the least interception-prone quarterbacks in the league.
In Week 14, I talked about how big of a role schedule luck plays in fantasy football outcomes and how, as luck in its purest form, it regresses mercilessly.
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
| --- | --- | --- | --- |
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 30% more rushing yards per game | Success! |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 47% more points per game | Success! |
|  | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.0 touchdowns per game | Failure |
| Yard:Touchdown Ratios | Group B had 76% more points per game | Group B has 146% more points per game | Success! |
| Mahomes TDs Redux | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.3 touchdowns per game | Failure |
| Yards per Carry Redux | Group A had 22% more rushing yards per game | Group B has 23% more rushing yards per game | Success! |
| "Extreme" performance | "Extreme" units were ~6.4 ppg from average | "Extreme" units are 90% as "extreme" | 1 |
| Defense vs. Offense |  | Defenses regressed 14% more than Offenses | 1 |
| Team Interceptions | Group A had 87% as many interceptions | Group B has 57% as many interceptions | 2 |
There are some nice perks when working with bigger data sets, as we are with our team-level predictions. The nicest is that, since we're looking at a unit's rating over the whole season, single weeks don't produce the sort of crazy swings we're accustomed to. Over the last three weeks, the most extreme offenses and defenses have gone from 100% of their baseline "extremeness" to 95%, then to 93%, then to 90%. Defenses have gone from regressing 4% more than offenses, to 7%, then to 14%. Ordered and implacable progressions like this are fun to see from time to time to remind us just how omnipresent and inevitable regression really is.
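To put those percentages back into points per game, here is a rough sketch. It assumes that "90% as extreme" is a simple fraction of the ~6.4 ppg baseline from the scorecard, and that "extremeness" is the average absolute distance from league average. Both are my reading for illustration, not a stated formula, and the function names and toy inputs are hypothetical.

```python
def extremeness(unit_ppg, league_avg):
    """Average absolute distance from league average, in points per game.

    Assumption: this is one plausible way to score "extremeness".
    """
    return sum(abs(x - league_avg) for x in unit_ppg) / len(unit_ppg)

# Figures quoted in the column: baseline distance and the weekly fractions of it.
BASELINE_PPG_FROM_AVERAGE = 6.4
weekly_fraction = [1.00, 0.95, 0.93, 0.90]

# Implied distance from league average at each checkpoint, under the assumption above.
implied = [round(BASELINE_PPG_FROM_AVERAGE * f, 2) for f in weekly_fraction]
print(implied)  # [6.4, 6.08, 5.95, 5.76]

# Toy check of the helper on invented unit scores (not real teams).
toy = extremeness([30, 20, 16, 26], league_avg=23)
print(toy)  # 5.0
```

Read that way, the extreme units have drifted from roughly 6.4 points per game away from average to a bit under 5.8, a small change each week that only ever moves in one direction.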
As for our interception prediction, our "high-interception" quarterbacks have passed our "low-interception" quarterbacks and are once again throwing more interceptions per game, which is what we always expected. But the difference between the two groups has gone from massive (an extra 0.7 interceptions per game) to negligible (an extra 0.1 interceptions per game), and as a result, our "low-interception" cohort continues to throw more interceptions overall.
Is Anyone Special?
Imagine a spectrum of belief in regression to the mean. Not a spectrum of knowledge, but a spectrum of belief. On one end are the disbelievers; they know what regression is about but simply don't believe it offers actionable insight. On the other end are the true believers; they know what regression is about and easily accept its conclusions as true. In my experience, both of these extremes are rare.
Most people who initially seem like disbelievers just don't know what regression is. I've met a lot of people who think regression to the mean is akin to the old gambler's fallacy, the idea that after a string of extreme results in one direction someone or something becomes "due" for extreme results in the opposite direction to offset them.
That's voodoo science and people are right to be skeptical of it, which is why a lot of what I write for this column focuses on explaining exactly what regression is and what it isn't. And I've yet to find someone who continued disbelieving in regression to the mean entirely once they understood the underlying concepts.
At the same time, in my experience "true believers" are just as rare. Most of the people I talk with typically fall in the middle into a group I'd call the "mostly-believers".
A mostly-believer buys all of the theory behind regression to the mean. She knows that outlier performances are likely a result of both talent and favorable breaks, and that no matter how great the talent is, when the breaks disappear the statistics will regress. Mostly-believers typically believe in regression to the mean for every player in the league... with one or two exceptions.
I've been talking to mostly-believers about regression for more than a decade now. Tatum Bell once went on a hot streak for the 2005 Denver Broncos and a buddy of mine asked me if I thought he was a solid fantasy starter going forward. I said that he couldn't keep averaging eight yards per carry, and my buddy asked why not: Bell was in a unique role in that offense, so what if the rules didn't apply?
In 2011, Calvin Johnson scored two touchdowns in each of Detroit's first four games. His value in dynasty leagues shot up way through the stratosphere, and when I reminded a leaguemate that Johnson couldn't possibly maintain that pace (because we'd never seen anyone approach 32 touchdowns over a full season), my leaguemate informed me that we'd never seen as big of a physical freak as Calvin Johnson before.
I've had far too many conversations to count over the last few years about how no, Julio Jones' touchdown rate was too low and he was going to start scoring again sooner or later. And then after he scored four touchdowns in the first three weeks this year to lead all receivers, I had a surreal conversation where I explained that no, Jones' touchdown rate was far too high and he'd probably score much less going forward. And in all cases, I was met with claims that maybe this just was who Julio Jones was now.
In my first year writing this column, I devoted an entire week to discussing Alvin Kamara after several different people had told me they thought he was "special" and the rules didn't apply to him. From his first year to his second, Kamara's yards per carry and yards per reception both declined substantially but his touchdown rate actually improved (from one touchdown per 15.5 touches to one touchdown per 15.2 touches), and once again people speculated that maybe this just was who he was, maybe he was something entirely unique in NFL history.
Here are the four players with the worst touch-per-touchdown ratio in 2019 (minimum 100 touches):
- 155 touches per touchdown: Kenyan Drake
- 110 touches per touchdown: Alexander Mattison
- 103 touches per touchdown: Alvin Kamara
- 101 touches per touchdown: Leonard Fournette
After Todd Gurley dominated the fantasy playoffs to close out 2017 and then dominated the fantasy regular season to open 2018, I began to hear claims that he, too, was regression-proof. So I devoted a column to him last year, too. We all know how that goes.
This offseason, I had plenty of conversations about Patrick Mahomes II with people who thought he might wind up going down as the best quarterback to ever play. I agreed that he might, but even if he does, his 8.6% touchdown rate from last year was guaranteed to regress dramatically. (And then when I thought it had regressed too far past his true talent level in the other direction, I predicted it'd regress positively again.)
Also this offseason, I was talking with another writer on staff about his projections, and he told me that yes, he believed yards per carry regressed, but also maybe Aaron Jones was just so good that his wouldn't. In college, Jones averaged 5.2, 5.5, 6.5, and 7.7 yards per carry. He averaged 5.5 yards per carry in each of his first two seasons in the NFL.
And when I mentioned that the most historically similar players to Jones had a median yard-per-carry average of 4.3 in their follow-up season and a third were at 4.2 yards per carry or lower, I was told that wasn't a realistic expectation for Jones; it was "too extreme". Jones spent most of this season at around 4.1 yards per carry before a huge game last week finally pulled him up over 4.4.
And then there's me, because I was a mostly-believer too. I first wrote about yard-to-touchdown ratios back in 2015 and my conclusion was that nobody could average fewer than 100 yards per touchdown over a long timeline... except maybe Rob Gronkowski was special and talented and unique enough that he could pull it off. And I was as wrong as everyone else; here's Gronkowski's career yard-to-touchdown ratio after every season of his career:
- 54.6
- 67.0
- 68.3
- 75.7
- 79.7
- 84.2
- 88.4
- 93.3
- 98.3
Yes, he retired with a career average almost exactly where I had predicted the long-run lower limit to be four years earlier, the lower limit that I predicted he'd somehow manage to stay below because he was special.
I've had a lot of different conversations with mostly-believers over the years who told me that regression was obviously a real thing, but maybe just this one player was special and different and unique and would somehow manage to avoid the inevitable. Everyone has a different player, but nearly everyone has a player. And in every case, it has turned out that that one player was not, in fact, special and different and unique enough to avoid regression.
Rob Gronkowski is, for my money, the best tight end to ever play. I say this as someone who spends an inordinate amount of time studying and thinking about the entire 100-year history of the National Football League (though the tight end position has only been around for about 60 of those years). And that's what finally pushed me over the edge into true believerhood. If Rob Gronkowski wasn't special, then nobody is special. When it comes to regression to the mean, no one is special.
The inspiration for the column today is Lamar Jackson. Lamar Jackson is a joy to watch and should be the runaway league MVP this year. I think there's a very real chance that his legacy will include changing the way the game is played entirely. Twenty years from now I think there will be quarterbacks who owe the way they play (or even the fact that they got a chance to play at all) to Jackson.
His production is also unsustainable and is going to regress. Jackson may be a special player, but he is not special. Nobody is special.
So far this season, Jackson averages about 78 yards per game rushing. He averaged 79 yards per game last year as a starter. This is now a 20-game sample. He's probably the best rushing quarterback in league history. But he can be the best rushing quarterback in league history and still regress.
Before Jackson, Michael Vick was the best rushing quarterback in league history and he averaged just 50 yards per game. Before Vick you had Randall Cunningham in the '80s and '90s, before Cunningham there was Bobby Douglass in the '60s and '70s, before Douglass there was Bob Hoernschemeyer in the '50s and '60s, all of whom averaged a bit over 40 yards per game at their sustained best.
There's a good chance Jackson is better than all of these players. But 50% better than Vick? Nearly 100% better than Cunningham? It's more likely by far that his production this season is not entirely indicative of his "true talent level", that he's had some favorable circumstances working to his advantage and those circumstances will wash out over time.
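The arithmetic behind those round numbers is quick to check. The sketch below uses the per-game figures quoted in this column and treats "a bit over 40" as exactly 40, which is a simplifying assumption, so the second result is an upper bound rather than a precise figure. The function name is my own.

```python
def percent_better(value, baseline):
    """How much larger `value` is than `baseline`, as a percentage."""
    return (value / baseline - 1) * 100

jackson_ypg = 78     # Jackson's rushing yards per game this season
vick_ypg = 50        # Vick's career figure as quoted above
cunningham_ypg = 40  # assumption: "a bit over 40" rounded down to 40

vs_vick = round(percent_better(jackson_ypg, vick_ypg))
vs_cunningham = round(percent_better(jackson_ypg, cunningham_ypg))
print(vs_vick)        # 56
print(vs_cunningham)  # 95
```

So "50% better than Vick" is, if anything, generous to the skeptics, and "nearly 100% better than Cunningham" holds up even after rounding Cunningham's average down.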
Prolific rushing quarterbacks have been on the rise over the past decade. Beyond Jackson, we've seen Cam Newton, Russell Wilson, Robert Griffin III, and Colin Kaepernick put up huge rushing totals. Several other young quarterbacks such as Deshaun Watson, Josh Allen, and Kyler Murray are all prolific runners, too.
Indeed, the league appears to be at an inflection point for quarterback rushing. There are plenty of reasons to believe Lamar Jackson is the right quarterback in the right place at the right time to buck history and evade regression like just another would-be tackler.
There are always reasons to believe that this time is different, and yet this time so rarely is. Maybe Lamar Jackson is about to embark on a long string of 1,000-yard rushing campaigns. But this true believer has been burned in the past, so I'll be over here quietly betting the "under".