Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
The Scorecard
Returning readers, you know how this works by now, but for new readers here's the deal. Every week I take a look at a specific statistic that is prone to regression and identify high and low outliers in that statistic, and then I wave my hands in the air and shout “regression!”
But since predictions aren't any fun without someone holding your feet to the fire afterward, I don't stop there. I lump all of the high outliers into Group A. I lump all of the low outliers into Group B. I verify that Group A is outperforming Group B. And then I predict that Group B will outperform Group A over the next four weeks.
I don't get to pick and choose my groups, beyond being free to pick and choose what statistics are especially prone to regression. If I'm tracking yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.
And then, groups chosen and predictions made, I track my progress. That's this.
In Week 2, I outlined what regression was, what it wasn't, and how it worked. No prediction was made.
In Week 3, I listed running backs with exceptionally high and low yards per carry averages and predicted that the low-ypc cohort would outperform the high-ypc cohort over the next four weeks.
In Week 4, I looked at receivers who were overperforming and underperforming in yards per target and predicted that the underperformers would outperform the overperformers over the next four weeks.
In Week 5, I compared the predictive accuracy of in-season results to the predictive accuracy of preseason ADP. Outside of a general prediction that players would tend to regress in the direction of their preseason ADP, no specific prediction was made.
In Week 6, I looked at quarterbacks who were throwing too many or too few touchdowns given the amount of passing yards they were accumulating, then predicted that the underperformers would score more fantasy points than the overperformers going forward.
In Week 7, I looked at receivers who were catching too many or too few touchdowns based on their yardage total, then predicted that the underperformers would score more fantasy points than the overperformers going forward.
In Week 8, I revisited yards per carry, again predicting that the high-carry, low-ypc group would outrush the low-carry, high-ypc group going forward.
In Week 9, I went back to yard-to-touchdown ratios, predicting that the low-touchdown group would close the gap substantially with the high-touchdown group going forward.
In Week 10, I discussed the pitfalls of predicting regression over 4-week windows. No specific prediction was made.
In Week 11, I once more delved into the theory behind regression and highlighted the importance of not cherry-picking which players are “too good” or “not good enough” to regress.
In Week 12, I took one more shot at touchdown regression for quarterbacks, predicting that the low-touchdown cohort would close the gap with the high-touchdown cohort going forward.
In Week 13, I decided to close the predictions out with a bang, sorting the top 100 skill position players in yards per touchdown ratio and predicting that the third with the fewest touchdowns would outperform the third with the most touchdowns going forward.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
yards per carry | Group A had 60% more rushing yards per game | Group B has 16% more rushing yards per game | None (Win!) |
yards per target | Group A had 16% more receiving yards per game | Group B has 11% more receiving yards per game | None (Win!) |
passing yards per touchdown | Group A had 13% more fantasy points per game | Group A has 17% more fantasy points per game | None (Loss) |
receiving yards per touchdown | Group A had 28% more fantasy points per game | Group B has 1% more fantasy points per game | None (Win!) |
yards per carry | Group A had 25% more fantasy points per game | Group B has 16% more fantasy points per game | None (Win!) |
rushing yards per touchdown | Group A had 21% more fantasy points per game | Group B has 8% more fantasy points per game | None (Win!) |
passing yards per touchdown | Group A had 14% more fantasy points per game | Group A has 27% more fantasy points per game | 2 |
yards from scrimmage per touchdown | Group A had 7% more fantasy points per game | Group B has 2% more fantasy points per game | 3 |
There's not a whole lot left to say about the predictions at this point, so here are three quick takeaways.
(A) Quarterbacks continue to defy regression this season (which means this season is weird, not that regression doesn't apply to quarterbacks).
(B) A 9% swing (from 7% in A's favor to 2% in B's favor) is more impressive in a group of nearly 70 total players than it is in a group of 6-10 players.
(C) Again, I'd like to reiterate the importance of not cherry-picking who you think is going to regress and who isn't. Group A's top five performers last week were Nelson Agholor, Marshawn Lynch, Cameron Brate, Evan Engram, and Jermaine Kearse.
Now on to the analysis.
Nobody is Special
A few days ago on Twitter, Rotoworld's Rich Hribar mentioned that Rex Burkhead and Alvin Kamara were currently averaging a touchdown for every 13.2 and 13.3 touches, respectively. I've talked about yard-to-touchdown ratio a lot in this space, but touches-to-touchdown ratio is a very close cousin.
The upshot is that touches are very stable and predictable from week to week, while touchdowns are wildly unpredictable from week to week. Players with extremely high or extremely low ratios are definitely overperforming/underperforming and are classic regression candidates.
Think back to the original Regression Alert column this year when I discussed what regression was and how it works. All players have a “true” touches-to-touchdowns ratio, but because of variance, they'll deviate from that ratio on a play-to-play basis. The smaller the sample size we're looking at, the more it's possible for a player to deviate from their “true mean” production level.
Because we know variance is happening, any time we see extremely high or extremely low values, our default hypothesis should be that those values are the result of variance and not that the player in question just has a freakishly high or freakishly low “true level” of production.
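To see how much sample size alone can do, here is a minimal Python simulation sketch. It is not part of the original analysis: the "true" rate of one touchdown per 25 touches, the sample sizes, the seed, and the function name are all assumptions picked for illustration. Every simulated player has exactly the same true rate, so any spread in the observed rates is pure variance.

```python
import random
import statistics

def simulate_td_rates(n_players, touches, true_rate, rng):
    """Observed touchdowns per touch for players who all share one true rate."""
    rates = []
    for _ in range(n_players):
        # Each touch independently becomes a touchdown with probability true_rate.
        tds = sum(1 for _ in range(touches) if rng.random() < true_rate)
        rates.append(tds / touches)
    return rates

rng = random.Random(2017)   # fixed seed so the run is repeatable
TRUE_RATE = 1 / 25          # assumed "true" rate, illustration only

small = simulate_td_rates(500, 150, TRUE_RATE, rng)    # roughly one part-time season
large = simulate_td_rates(500, 1500, TRUE_RATE, rng)   # roughly a long career

spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)

print(f"true rate:             {TRUE_RATE:.3f}")
print(f"spread, 150 touches:   {spread_small:.4f}")
print(f"spread, 1500 touches:  {spread_large:.4f}")
print(f"best 150-touch rate:   {max(small):.3f}")
print(f"best 1500-touch rate:  {max(large):.3f}")
```

The observed rates in the 150-touch group scatter much more widely around the true rate than those in the 1,500-touch group, even though no simulated player is any better than another. That is why an extreme ratio over a small sample is more likely to be variance than a freakish true level.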
To make this point on Twitter, I pointed out some career touch-to-touchdown ratios.
I don't want to imply this is unsustainable, but since the merger the best RBs with fewer than 20 career touches per TD are maybe Brandon Jacobs, Larry Kinnebrew, and Don McCauley. https://t.co/Tq07hvVvCe
— Adam Harstad (@AdamHarstad) December 5, 2017
Some notable names in terms of career touches per touchdown.
21.45 - Shaun
22.54 - Priest
22.56 - Ezekiel
23.44 - LaDainian
25.44 - Arian
26.49 - Marshall
26.66 - Jamaal
27.19 - Adrian
27.76 - Maurice
— Adam Harstad (@AdamHarstad) December 5, 2017
Those tweets led to an interesting discussion about Alvin Kamara. Kamara has been sensational and watching him enervate opposing defenses has been a true joy, in the same way that watching also-guaranteed-to-regress rookie phenom Deshaun Watson was a true joy.
Kamara is quite obviously special. But is he special? By this, I mean is he somehow good enough and fast enough to outrun the regressive forces that have chased down everyone else on a hot streak?
Or more interestingly, is there meaning in his performance that doesn't exist in Burkhead's?
After all, the beauty of the touch-to-touchdown ratio is that touches hold a lot of predictive power and touchdowns hold very little predictive power, so outliers become easy regression candidates. But what if there exists a special class of players for whom touchdowns do hold strong predictive power?
If such a class exists, and if Kamara truly does belong to that class, then calling for regression is actually wrongheaded. His touch-to-touchdown ratio might decline, but it might not be because his touchdowns decline to fall in line with his touches, (as is their wont)... but instead because his touches rise to fall in line with his touchdowns.
As a potential example of this theory, we have David Johnson. Johnson was similarly electric as a rookie, scoring 12 touchdowns on 161 offensive touches, a 13.4 touch-to-touchdown ratio almost identical to Kamara's. In 2016, Johnson's touch-to-touchdown ratio regressed a bit to 18.7... but not because Johnson was scoring fewer touchdowns.
Instead, Johnson's prolific touchdown scoring as a rookie was a bellwether for his greatness, and Arizona greatly expanded his workload heading into year 2. Could Kamara be in for a similar fate?
It's possible. But at the same time, I've looked at a lot of hypotheses regarding touchdown production and pretty much all of them have come back negative. Perhaps there's a special class of players for whom touchdowns are predictive and I just haven't found them yet, but I think given the evidence to date our default hypothesis should be “no one is special”, with the burden of proof falling on anyone who wishes to assert otherwise.
Consider, for instance, Rob Gronkowski. If anyone is a special touchdown-scorer, it's certainly him; I have never in my life seen a more deadly red-zone weapon. In my informed opinion as an amateur football historian, Gronkowski is on a snap-to-snap basis the most dominant tight end to ever play.
Here is Rob Gronkowski's career yard-to-touchdown ratio after every season of his career:
54.6
67.0
68.3
75.7
79.7
84.2
88.4
91.4
Remember, if you will, that when I first introduced the concept I mentioned that players tended to cluster into a relatively tight band for their careers, with the Hall-of-Fame-caliber touchdown threats settling in right around 100 receiving yards for every receiving touchdown.
And indeed, Rob Gronkowski has been inexorably pulled towards that lodestone as his career has progressed. After those blistering first three seasons, Gronkowski has “only” averaged a touchdown for every 115 yards in the five years since.
And to be clear, the change is not a result of any difference in his yardage, which has remained relatively constant. After a low-usage rookie year that is clearly an outlier, Gronkowski averaged 78.4 yards per game in years 2-3. He has averaged 76.4 yards per game since. Again, his yards stayed constant, as yards tend to do. Instead, his touchdowns fell from 1.07 per game in years 2 and 3 to 0.66 per game in years 4-8.
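The arithmetic behind that comparison can be sketched in a few lines of Python. The per-game figures are the ones quoted above; the function and variable names are my own, and because the inputs are rounded the results are only approximate.

```python
def yards_per_touchdown(yards_per_game, tds_per_game):
    """Yards gained for every touchdown scored, from per-game averages."""
    return yards_per_game / tds_per_game

# Per-game figures quoted in the text (already rounded).
early_ypg, early_tdpg = 78.4, 1.07   # years 2-3
late_ypg, late_tdpg = 76.4, 0.66     # years 4-8

early_ratio = yards_per_touchdown(early_ypg, early_tdpg)
late_ratio = yards_per_touchdown(late_ypg, late_tdpg)

# How much each component changed between the two periods.
yardage_drop = 1 - late_ypg / early_ypg
touchdown_drop = 1 - late_tdpg / early_tdpg

print(f"yards per TD, years 2-3: {early_ratio:.1f}")
print(f"yards per TD, years 4-8: {late_ratio:.1f}")
print(f"drop in yards per game:  {yardage_drop:.1%}")
print(f"drop in TDs per game:    {touchdown_drop:.1%}")
```

With these inputs the ratio moves from roughly 73 to roughly 116 yards per touchdown, and nearly all of that movement comes from the touchdown column: yards per game slip by under 3%, while touchdowns per game fall by close to 40%.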
Okay, so now we have David Johnson as anecdotal evidence that sometimes touchdown production might hold predictive power. We also have Rob Gronkowski as a counter-anecdote that nobody is “special” when it comes to touchdowns and regression comes for us all. Maybe it's time to leave the realm of anecdote and enter the realm of fact.
Footballguys has a suite of data queriers available to subscribers that allow you to perform really powerful historical searches. Using the Historical Data Dominator, I pulled up a list of rookie running backs with at least 100 touches sorted by the fewest touches per touchdown; you can see the full list for yourself here.
Now a lot of these guys were short-yardage backs and fullbacks who clearly aren't good points of comparison for Alvin Kamara. You know I'm not a fan of yards per carry as a statistic, but removing everyone who averaged fewer than 4.0 yards per carry should weed out the guys who just scored a ton of goal-line plunges. This leaves us with 22 names.
At the same time, if we're being fair, several players on that list were unquestioned workhorses and also not really a great point of comparison for Kamara, especially if the hypothesis is that a stellar per-touch production level will lead to an increase in touches going forward; Marcus Allen might have been a per-touch star as a rookie, but it's not like there was much room for him to improve on his 22 touches per game.
Allen, Fred Taylor, Clinton Portis, and Adrian Peterson all averaged at least 18 touches per game as rookies; removing them from the sample leaves us with 17 players plus Kamara as points of comparison.
Charlie Smith is a false positive. The Historical Data Dominator is returning his sophomore season; as a rookie, he had 1 touch for 28 yards. Ickey Woods is a good comparison, but was injured in the second game of his sophomore year, so he's not a useful data point for how these backs do when healthy.
Finally, between conditioning issues and suspensions, Karlos Williams never played another down of football after an exciting rookie season. Removing him from the sample inflates Kamara's comps, but seems fair.
The remaining 14-player sample averaged 13.3 touches per game, 79.4 yards per game, and 13.3 touches per touchdown. Kamara averages 12.2, 101.7, and 13.3, respectively. Obviously the yardage from the comps is a little low (owing to Kamara's insane yard-per-touch average, which is a topic for another column), but all in all, this seems like a very fair set of comparisons.
In short, this is a good list of rookie running backs who were productive and electric despite a part-time role, piling up touchdowns at an unsustainable rate. And yes, the list of comparisons includes David Johnson, as well as Tony Dorsett, Franco Harris, Roger Craig, Herschel Walker, Gale Sayers, Cookie Gilchrist, and Maurice Jones-Drew. It's a very strong list that is littered with Hall of Famers and All-Pros, and it speaks highly of Kamara's value going forward.
But the question isn't “are players who produce in a manner similar to Kamara's good players?” We should have already suspected they would be, and there's little question that Kamara himself is good.
The question should be “are players who produce in a manner similar to Kamara's uniquely resistant to the forces of regression?” For players in this group, do their touchdowns give us more information, or do their touches give us less information, than is typical? So I looked at the fourteen remaining players and compared their sophomore campaign to their rookie campaign in terms of touches per game and touches per touchdown. Here's the raw data:
Player | Rookie T/G | Year 2 T/G | Increase | Year 2 as % of Rookie | Rookie T/TD | Year 2 T/TD |
---|---|---|---|---|---|---|
Cookie Gilchrist | 17.0 | 18.3 | 1.3 | 108% | 15.9 | 18.3 |
Tony Dorsett | 16.9 | 20.4 | 3.5 | 121% | 18.2 | 36.3 |
Abner Haynes | 15.1 | 15.2 | 0.1 | 101% | 17.6 | 17.8 |
Franco Harris | 14.9 | 16.5 | 1.6 | 111% | 19.0 | 66 |
Herschel Walker | 14.2 | 22.4 | 8.2 | 158% | 16.2 | 33.6 |
Roger Craig | 14.0 | 14.1 | 0.1 | 101% | 18.7 | 22.6 |
Gale Sayers | 13.9 | 18.8 | 4.9 | 135% | 9.8 | 26.3 |
Curtis Dickey | 13.4 | 13.4 | 0.0 | 100% | 15.5 | 20.1 |
Maurice Jones-Drew | 13.3 | 13.8 | 0.6 | 104% | 14.1 | 23 |
Jonathan Stewart | 11.9 | 14.9 | 3.0 | 125% | 19.1 | 21.7 |
Marion Butts | 11.8 | 20.1 | 8.3 | 170% | 19.7 | 35.1 |
Paul Lowe | 11.4 | 13.7 | 2.3 | 121% | 15.9 | 21.3 |
David Johnson | 10.1 | 23.3 | 13.2 | 232% | 13.4 | 18.7 |
Isaiah Crowell | 9.8 | 12.8 | 3.0 | 130% | 19.6 | 40.8 |
Average | 13.3 | 17.0 | 3.6 | 123% | 13.3 | 25.1 |
(Note: a simple average is not appropriate for aggregating rate statistics; for the average percentage increase and the average touch-per-touchdown ratio I have used the harmonic mean, an average designed specifically to compare rates.)
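For anyone who wants to see what the harmonic mean does with rates, here is a minimal Python sketch using two columns from the table above: year 2 touches per touchdown, and year 2 touches as a percentage of rookie touches. The variable names are my own, and the inputs are the rounded table values, so treat the results as approximate.

```python
from statistics import harmonic_mean, mean

# Year 2 touches per touchdown, in table order (Gilchrist through Crowell).
year2_touches_per_td = [18.3, 36.3, 17.8, 66, 33.6, 22.6, 26.3,
                        20.1, 23, 21.7, 35.1, 21.3, 18.7, 40.8]

# Year 2 touches per game as a multiple of rookie touches per game.
relative_total = [1.08, 1.21, 1.01, 1.11, 1.58, 1.01, 1.35,
                  1.00, 1.04, 1.25, 1.70, 1.21, 2.32, 1.30]

hm_touches_per_td = harmonic_mean(year2_touches_per_td)
am_touches_per_td = mean(year2_touches_per_td)
hm_relative_total = harmonic_mean(relative_total)

# The harmonic mean of touches-per-TD is the reciprocal of the
# simple average of TDs-per-touch, which is the quantity of interest.
avg_td_per_touch = mean(1 / x for x in year2_touches_per_td)

print(f"harmonic mean T/TD:   {hm_touches_per_td:.1f}")
print(f"arithmetic mean T/TD: {am_touches_per_td:.1f}")
print(f"1 / mean TD-per-touch: {1 / avg_td_per_touch:.1f}")
print(f"harmonic mean relative total: {hm_relative_total:.0%}")
```

The simple average gets dragged upward by one or two very large ratios (a single season at 66 touches per touchdown counts for a lot), while the harmonic mean averages the underlying touchdown-per-touch rates and then flips the result back into touches per touchdown.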
Now, every single one of these backs except Curtis Dickey increased their touches per game average in year 2, (and Dickey at least stayed constant), but that is itself the forces of regression in action; we should expect talented, low-usage backs to become higher usage over time.
Is there evidence that the strong touchdown production predicted a larger jump than we would have naively assumed? Not really. David Johnson is probably the most memorable comparison just because he was the most recent, but this should make clear that he's also probably the least representative; his 13.2 touches per game jump and 132% relative increase in workload were both the largest values by a substantial margin.
There's no reason to believe that Johnson is a better point of comparison than, say, Maurice Jones-Drew or Franco Harris, two sensational backs who remained mired in a timeshare in year 2 and saw their role barely increase.
Collectively, Kamara's comps saw an increase of 3.6 touches per game, or 23% of total touches. If Kamara matches those marks in 2018, he'll average 15.8 or 15.0 touches per game, respectively. That is around 240-250 touches over a full season.
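Those projections are simple enough to check by hand, but here is a short Python sketch of the two ways of applying the comps' increase. The 16-game season length is my assumption for "a full season", and the variable names are illustrative.

```python
KAMARA_TOUCHES_PER_GAME = 12.2   # rookie-year figure quoted earlier
COMP_ABSOLUTE_INCREASE = 3.6     # extra touches per game in year 2
COMP_RELATIVE_TOTAL = 1.23       # year 2 touches as a multiple of rookie touches
GAMES_PER_SEASON = 16            # assumption: a full 16-game schedule

# Option 1: add the comps' average raw increase.
additive_projection = KAMARA_TOUCHES_PER_GAME + COMP_ABSOLUTE_INCREASE

# Option 2: scale by the comps' average relative increase.
relative_projection = KAMARA_TOUCHES_PER_GAME * COMP_RELATIVE_TOTAL

for label, per_game in [("additive", additive_projection),
                        ("relative", relative_projection)]:
    season_total = per_game * GAMES_PER_SEASON
    print(f"{label}: {per_game:.1f} touches per game, "
          f"about {season_total:.0f} over {GAMES_PER_SEASON} games")
```

The two methods land close together because Kamara's rookie workload is near the comps' average; for a back starting from a much lower or higher workload, the additive and relative projections would diverge more.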
(Meanwhile, because of regression to the mean on yards per touch, the comparisons only saw their yards per game increase by 10%; with even more outlandish yard-per-touch averages, it would not be the slightest bit surprising to see Kamara's yards per game average decline even if he does hit those increases in touches per game.)
So, to sum it all up... Alvin Kamara is in many respects having a unique season in NFL history, but there are still some decent points of comparison. Those points of comparison did, indeed, see their touchdown rate fall substantially in their second year, (scoring 53% as many touchdowns per touch), and overall it wasn't the result of a dramatic increase in workload. For cases like Kamara, touchdowns don't gain additional predictive power, nor do touches lose the predictive power they have.
With that said, players who have rookie seasons like Alvin Kamara's are by and large an extremely impressive group, and there's no reason for anything but optimism about his career prospects. Alvin Kamara may not be special in the sense that he's somehow immune to the forces of regression. But that doesn't mean he's not special.