
Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes, I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
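In code terms, the weekly setup looks roughly like the sketch below. The players, numbers, and group sizes are hypothetical stand-ins, but the procedure (rank by the metric, take the extremes, verify Group A leads, then bet on Group B) is the one described above:

```python
# A minimal sketch of the weekly group-building process described above.
# All data, names, and thresholds here are hypothetical; in practice the metric,
# group sizes, and player pool come from the actual season-to-date stats.

def build_groups(players, metric, top_n=10, bottom_n=10):
    """Rank players by `metric` and return (Group A, Group B).

    Group A = the high outliers in the metric, Group B = the low outliers.
    The prediction is always the same: Group B outscores Group A going forward.
    """
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    group_a = ranked[:top_n]        # high outliers in the chosen metric
    group_b = ranked[-bottom_n:]    # low outliers in the chosen metric
    return group_a, group_b

# Hypothetical example with made-up numbers:
players = [
    {"name": "Back 1", "ypc": 6.3, "ppg": 15.1},
    {"name": "Back 2", "ypc": 5.8, "ppg": 13.4},
    {"name": "Back 3", "ypc": 3.9, "ppg": 12.8},
    {"name": "Back 4", "ypc": 3.6, "ppg": 11.9},
]

group_a, group_b = build_groups(players, metric="ypc", top_n=2, bottom_n=2)

# Sanity check that Group A really has outscored Group B to this point...
ppg_a = sum(p["ppg"] for p in group_a) / len(group_a)
ppg_b = sum(p["ppg"] for p in group_b) / len(group_b)
assert ppg_a > ppg_b
# ...and the standing prediction is that Group B outscores Group A from here on.
```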
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.
The Scorecard
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back) and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
In Week 5, I talked about how preseason expectations still held as much predictive power as performance through four weeks. No specific prediction was made.
In Week 6, I talked about why quarterbacks tended to regress less than other positions but nevertheless predicted that Patrick Mahomes II would somehow manage to get even better and score ten touchdowns over the next four weeks.
In Week 7, I talked about why watching the game and forming opinions about players makes it harder to trust the cold hard numbers when the time comes to put our chips on the table. (I did not recommend against watching football; football is wonderful and should be enjoyed to its fullest.)
In Week 8, I discussed how yard-to-touchdown ratios can be applied to tight ends, but noted that the players most likely to regress positively were already the top performers at the position. I made a novel prediction to try to overcome this quandary.
In Week 9, I discussed several of the challenges in predicting regression for wide receiver "efficiency" stats such as yards per target. No specific prediction was made.
In Week 10, I proposed a "leaderboard test" to quickly tell whether a statistic was noisy (and more prone to regression) or stable (and less prone to regression). I illustrated this test in action and made another prediction that yards per carry would regress.
In Week 11, I mentioned that many unexpected things were at the mercy of regression to the mean, highlighting how the average age of players at a given position tends to regress over time as incoming talent ebbs and flows.
In Week 12, I predicted that because players regress, and units are made up of players, units should regress, too. I identified the top five offenses, bottom five offenses, top five defenses, and bottom five defenses, and predicted that after four weeks those twenty units would collectively be less "extreme" (defined as closer to league average). Because offense tends to be more stable than defense, I added a bonus prediction that the defenses would regress more than the offenses.
In Week 13, I delved into how interceptions are the only quarterback stat that is mostly noise and predicted that the most interception-prone quarterbacks in the league (yes, including Jameis Winston) would start throwing fewer interceptions than the least interception-prone quarterbacks in the league.
In Week 14, I talked about how big of a role schedule luck plays in fantasy football outcomes and how, as luck in its purest form, it regresses mercilessly.
In Week 15, I presented a brief history of players who had once been considered regression-proof, demonstrated how much they'd regressed, and called into question whether any current players were actually regression-proof as a result.
In Week 16, I went over several methods of determining how much luck is required to win a fantasy championship.
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
| --- | --- | --- | --- |
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 30% more rushing yards per game | Success! |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 47% more points per game | Success! |
| Mahomes TDs | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.0 touchdowns per game | Failure |
| Yard:Touchdown Ratios | Group B had 76% more points per game | Group B has 146% more points per game | Success! |
| Mahomes TDs Redux | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.3 touchdowns per game | Failure |
| Yards per Carry Redux | Group A had 22% more rushing yards per game | Group B has 23% more rushing yards per game | Success! |
| "Extreme" performance | "Extreme" units were ~6.4 ppg from average | "Extreme" units are 89% as "extreme" | Success! |
| Defense vs. Offense | | Defenses regressed 12% more than Offenses | Success! |
| Team Interceptions | Group A had 87% as many interceptions | Group B has 86% as many interceptions | Success! |
Our low-interception teams almost staged a Christmas miracle and went the entire week without throwing a single pick. Deshaun Watson's interception for Houston was their only pick of the week until Monday night, when Green Bay and Minnesota (who ranked 1st and 3rd in interceptions heading into the game) threw one each. Still, Group A's three interceptions in Week 16 were ten fewer than their previous weekly best.
The problem? As I noted last week, Group B's lead was simply insurmountable. Group A would have needed not only to outperform their best week by ten interceptions but for Group B to underperform their worst week by an additional seven interceptions. In the end, our low-interception teams maintained a slightly lower interception rate and our high-interception teams maintained a slightly higher interception rate, but both rates regressed so strongly that Group A threw more interceptions just by dint of the number of teams involved.
One Final Scorecard
In an annual end-of-year tradition, I want to take one last look back at all of our predictions this season. I frequently say that regression predictions work best over longer timelines, but I use a 4-week window just to keep myself accountable. The end of the year gives us a chance to look back along a slightly longer timeline and see how things played out. I will focus on evaluating the prediction itself, as well as the underlying mechanism that led to the prediction.
(Evaluating the prediction is about accountability to the fantasy community. Did people who make moves based on my predictions profit? Evaluating the mechanism is about accountability to myself. Is my understanding of the underlying mechanics sound and likely to result in profitable predictions in the future? Both aspects are important. Being right is nice, but being right for the wrong reasons is not sustainable going forward.)
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Prediction | Mechanism |
| --- | --- | --- | --- | --- |
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 19% more rushing yards per game | Success! | Success! |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 44% more points per game | Success! | Success! |
| Mahomes TDs | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.2 touchdowns per game | Failure | Mixed |
| Yard:Touchdown Ratios | Group B had 76% more points per game | Group B has 216% more points per game | Success! | Success! |
| Yards per Carry Redux | Group A had 22% more rushing yards per game | Group B has 20% more rushing yards per game | Success! | Success! |
| "Extreme" performance | "Extreme" units were ~6.4 ppg from average | "Extreme" units are 83% as "extreme" | Success! | Success! |
| Defense vs. Offense | | Defenses regressed 12% more than Offenses | Success! | Success! |
| Team Interceptions | Group A had 87% as many interceptions | Group B has 86% as many interceptions | Success! | Success! |
Week 3: Yards per Carry
Our first yards per carry prediction aimed to pit backs with medium-to-low volume and a high per-carry average against backs with medium-to-high volume and a low per-carry average, under the assumption that the volume would remain the same and any differences in per-carry average would vanish. Sure enough, the top six backs in total carries since Week 3 all came from our Group B (high-volume) backs.
And now for my absolute favorite part of every one of these predictions: at the time of the prediction, Group A backs averaged 6.45 yards per carry and Group B backs averaged 4.14 yards per carry. Since the prediction, Group A backs average 4.41 yards per carry and Group B backs average 4.45 yards per carry. The prediction was right, and it was right for exactly the reason I said it would be. Yards per carry is not a thing, never was a thing, and never will be a thing.
Week 4: Yard-to-Touchdown Ratio
I compared receivers with fewer than 200 yards and 2 or more touchdowns (giving them, by definition, fewer than 100 yards gained per touchdown scored) against receivers with more than 200 yards and 1 or fewer touchdowns (giving them, by definition, more than 200 yards gained per touchdown scored). I based my prediction on three points:
- Yards gained are predictive, so Group B should continue outperforming in that regard.
- Differences in yard to touchdown ratios can be meaningful, so Group A could very well continue to score more touchdowns per yard gained than Group B, but
- any ratios outside of the 100-200 range were unsustainable over a longer timeline, so both groups would regress to within that range.
That's exactly what happened. Group B maintained its yardage lead (increasing slightly from 42% more yards per game to 55% more yards per game), but its yard-to-touchdown ratio fell from 405 down to 179 while Group A's yard-to-touchdown ratio rose from 67 to 136. This prediction was a smashing success on every level.
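For concreteness, here's a minimal sketch of those group definitions in code. The thresholds (200 yards, 2 touchdowns, the 100-200 "sustainable" ratio band) are the ones described above; the receivers and their numbers are made up:

```python
# A sketch of the Week 4 group definitions, with hypothetical receivers.

receivers = [
    {"name": "WR 1", "rec_yards": 150, "rec_tds": 3},   # hot TD rate, modest yardage
    {"name": "WR 2", "rec_yards": 180, "rec_tds": 2},
    {"name": "WR 3", "rec_yards": 340, "rec_tds": 1},   # big yardage, cold TD rate
    {"name": "WR 4", "rec_yards": 410, "rec_tds": 0},
]

# Group A: fewer than 200 yards but 2+ TDs -> under 100 yards per TD by definition.
group_a = [r for r in receivers if r["rec_yards"] < 200 and r["rec_tds"] >= 2]

# Group B: more than 200 yards but 1 or fewer TDs -> over 200 yards per TD by definition.
group_b = [r for r in receivers if r["rec_yards"] > 200 and r["rec_tds"] <= 1]

def yards_per_td(group):
    """Aggregate yard:touchdown ratio for a group (None if no touchdowns yet)."""
    yards = sum(r["rec_yards"] for r in group)
    tds = sum(r["rec_tds"] for r in group)
    return yards / tds if tds else None

# The regression claim is simply that both ratios drift back toward the
# sustainable 100-200 range going forward, while yardage stays predictive.
print("Group A yards/TD:", yards_per_td(group_a))   # well under 100 here
print("Group B yards/TD:", yards_per_td(group_b))   # well over 200 here
```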
Week 6: Patrick Mahomes II's Touchdowns
I went back to yard-to-touchdown ratios, this time positing that Patrick Mahomes II's touchdown rate was too low given his prodigious talent. Again, our prediction was based on the idea that passing yardage was a stable predictor while passing touchdowns would tend to regress until they were more in line with expectations based on passing yardage. Specifically, based on Mahomes' talent, we expected one passing touchdown for every 130-140 passing yards.
At the time of the prediction, Mahomes was averaging 366 yards per game. It turns out those passing yards were not predictive; pro-rating for missed time, Mahomes averaged just 279 yards per game going forward. A lot of this was driven by the fact that Mahomes was playing through injuries, which is why I like using bigger samples (so that the injuries even out a bit), but either way the prediction that passing yardage would remain stable was wrong.
On the other hand, Mahomes went from one touchdown for every 166 yards before the prediction to one touchdown for every 145 yards after the prediction. Given the small sample sizes, I consider that close enough to my predicted 130-140 yard per touchdown range to count it as a hit. So our prediction was wrong, but the underlying mechanism had mixed success.
Week 8: Tight End Yard-to-Touchdown Ratios
I wanted a prediction for the tight end position but was faced with a challenge: the players most likely to regress positively were the players who were already leading the position in scoring. To try to overcome this, I stacked the comparison group with twice as many players, forcing our ideal regression candidates to be twice as good to overcome that numbers advantage.
The mechanism was similar to above: yards per game would remain more or less constant, the high-touchdown cohort would likely continue scoring more touchdowns per yard than the low-touchdown cohort, but all yard-to-touchdown ratios would likely resolve to somewhere in the 100-200 range.
Once again, our prediction went 3-for-3. Both Group A and Group B averaged within 5 yards per game of their level at the time of the prediction, and Group A continued to score touchdowns at a slightly higher rate than Group B, but Group A's scoring rate fell from one touchdown per 65 yards to one per 101 yards while Group B's rose from one touchdown per 361 yards to one per 155 yards. The result: Group B didn't just double Group A's per-game production, it more than tripled it.
Week 10: Yards per Carry Redux
You all know the drill by now. Group A's yards per carry average fell from 5.12 at the time of the prediction to 4.39 since. Group B's yards per carry average rose from 3.61 at the time of the prediction to 4.24 since. Group B didn't quite manage to pass Group A in yards per carry (though if Derrick Henry hadn't missed one game to a bye and another to injury, they very well might have), but that's not the point.
Regression to the mean never predicts that the worst performers will somehow surpass the better performers, just that both the best and worst will move closer to the middle. For a quality statistic that tells us something useful about the players in question (such as yard-to-touchdown ratios), the best should remain ahead of the worst even if the gap narrows dramatically. On the other end of the spectrum, for a statistic that is functionally indistinguishable from a random number generator (such as yards per carry), the worst should wind up ahead of the best roughly 50% of the time. Which is about what we see on these predictions.
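If you want to see where that "roughly 50% of the time" comes from, here's a small simulation sketch. It uses synthetic players rather than actual NFL data, and the signal weights are arbitrary; it just illustrates the mechanism:

```python
# If a stat is pure noise, the "worst" group catches and passes the "best" group
# about half the time going forward; if it carries real signal, the best group
# usually stays ahead (just by a smaller margin).

import random

def simulate(signal_weight, trials=10_000, n_players=40, group_size=10):
    flips = 0
    for _ in range(trials):
        # Each player has a true talent level plus noisy observed performance.
        talent = [random.gauss(0, 1) for _ in range(n_players)]
        before = [signal_weight * t + random.gauss(0, 1) for t in talent]
        after = [signal_weight * t + random.gauss(0, 1) for t in talent]

        ranked = sorted(range(n_players), key=lambda i: before[i], reverse=True)
        group_a = ranked[:group_size]      # best "before" performers
        group_b = ranked[-group_size:]     # worst "before" performers

        after_a = sum(after[i] for i in group_a) / group_size
        after_b = sum(after[i] for i in group_b) / group_size
        if after_b > after_a:
            flips += 1
    return flips / trials

random.seed(0)
print("pure noise  (no signal):", simulate(signal_weight=0.0))   # ~0.50
print("real signal (some skill):", simulate(signal_weight=1.0))  # well below 0.50
```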
Yards per carry is... well, you know how it goes by now.
Week 12: Unit Regression
This prediction just wrapped up last week, so (unsurprisingly) there's little change. As a recap: the best and the worst teams in points scored or points allowed relative to expectation tend to perform less spectacularly going forward. Additionally, defenses tend to regress to the mean more strongly than offenses. Check and check.
There is one unique aspect of this prediction, though. For all of my predictions, I compare performance before the prediction to performance after the prediction. As a result, I wouldn't expect players to regress more strongly over a longer timeline. If I think Group B should average 20% more yards per game based on underlying fundamentals, then my expectation after 2 weeks would be for Group B to be 20% ahead, and my expectation after 8 weeks would also be for Group B to be 20% ahead.
The primary advantage of a longer timeline is that it reduces the variance around this expected mean. Again, if I think Group B should be 20% ahead, then after a week I might expect them to be ahead by 20% +/- 80% (lots of variance), while after ten weeks I might expect them to be 20% +/- 10% (lower variance). But the mean expectation doesn't change.
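Here's a toy illustration of that shrinking variance. The 20% edge and the weekly spread below are made-up numbers, chosen only to show the shape of the effect:

```python
# If each week's edge is an independent draw around a true mean, the average of
# n weeks has the same mean but a spread that shrinks roughly like 1/sqrt(n).

import math
import random

TRUE_EDGE = 0.20      # assume Group B "should" be 20% ahead each week
WEEKLY_SD = 0.80      # assume a single week is extremely noisy

random.seed(0)
for n_weeks in (1, 4, 10):
    samples = []
    for _ in range(20_000):
        weeks = [random.gauss(TRUE_EDGE, WEEKLY_SD) for _ in range(n_weeks)]
        samples.append(sum(weeks) / n_weeks)
    mean = sum(samples) / len(samples)
    sd = math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))
    # The mean expectation stays ~0.20 regardless of window; only the spread shrinks.
    print(f"{n_weeks:2d} weeks: mean edge {mean:+.2f}, spread +/- {sd:.2f}")
```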
Because of the way I structured this particular prediction, however, I'm not comparing purely pre-prediction data against purely post-prediction data. I'm comparing purely pre-prediction data to data that's a mix of pre-prediction and post-prediction data. As a result, as we gain more post-prediction data, I do expect units to regress even more.
After one week, the most extreme units were 95% as extreme. After two weeks, they were 93% as extreme. After three weeks, they were down to 90%. After four weeks, 89%. Finally, with an additional week of observed data (with the post-prediction data making up an even larger share of the total sample), they fell to 83% as extreme. Again, this is entirely consistent with the underlying mechanism and serves as a nice empirical indicator that the prediction was sound.
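To make that mixed-sample effect concrete, here's a toy version of the blend. The 11 pre-prediction weeks and the assumed 60% post-prediction level are hypothetical illustrations, not the actual league data:

```python
# Suppose the extreme units sat E points per game from average over 11
# pre-prediction weeks, and their true post-prediction level is only 60% as
# extreme. The full-season measure is a weighted blend that keeps drifting
# down as the post-prediction weeks pile up.

PRE_WEEKS = 11
POST_LEVEL = 0.60   # assumed true post-prediction extremeness, as a share of E

for post_weeks in range(1, 6):
    blended = (PRE_WEEKS * 1.0 + post_weeks * POST_LEVEL) / (PRE_WEEKS + post_weeks)
    print(f"after {post_weeks} post-prediction week(s): {blended:.0%} as extreme")
```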
Week 13: Interception Rates
This prediction just wrapped up this week and was already covered above.
A Scorecard of Scorecards
It's useful to look back on a season's worth of predictions to see how well our process has fared. But Regression Alert has now run for three years, which gives us a chance to look back on three years' worth of predictions for an even larger sample still.
In total, there have been 23 specific, trackable predictions made in the history of Regression Alert (counting the Patrick Mahomes II prediction twice). 18 of those 23 predictions were successful over the 4-week span, a success rate of 78%. (If we only count the Patrick Mahomes II prediction as wrong once, the success rate is 18 out of 22 or 82%.)
Tracking over the full season (instead of just the four weeks in question), 16 out of 22 predictions have been correct, a 73% success rate. This is lower because two predictions that were successful over the four weeks in question flipped to unsuccessful over a longer timeline. In one of them, several Group B backs in one of our yards per carry predictions were only starting because the back ahead of them was injured or suspended; when those starters returned, the fill-ins' yards per game dropped to near zero and tanked the group average.
The other prediction that flipped from successful to unsuccessful was the result of selection bias and the way we average the data. The groups were small and most of the Group A backs missed time, but the few that stayed healthy were quite productive and therefore disproportionately impacted the average. I wrote more about the odd nature of the flip here.
Despite the slightly lower success rate over the full season, I still believe in the power of regression over longer timelines. Had I built the predictions for the longer spans, I would have removed the elevated backups from the sample to avoid those issues in the first place. Most importantly, of all 22 predictions, all but the Mahomes touchdown prediction regressed in the right direction; a failure, in this case, doesn't mean Groups A and B didn't regress, it merely means they didn't regress quite enough for Group B to overcome Group A's starting advantage.
I appreciate you all joining me for the ride this season. I hope you've become a little bit more of a believer in the power of regression and that you've learned a few things (I certainly have) and maybe even turned a bit of a profit along the way. Hopefully, 2019 was a good year for you and 2020 will be better still. I'll see you back here then.