Next Gen Stats Draft Model can predict prospects' pro success
The NFL's Next Gen Stats team set out to add objectivity to evaluating draft prospects, a typically subjective process. The staff recently came up with the Next Gen Stats Draft Model, a refined and more predictive version of the 2019 model. The model aims to answer these key questions about how prospects will fare at the next level:
- How athletic is a player based on measurable drills at the NFL Scouting Combine?
- How productive was a player in college based on on-field performance?
- How big is a player relative to other players in their position group?
- How will a player perform in the NFL based on their athleticism, production and size profiles?
How the draft model works
As NFL.com’s Mike Band explains, "It can be helpful to think of the model as several different position-specific models, tailored specifically to distinguish between the traits needed for success at each position group, with players separated into the following positions: quarterback (QB), running back (RB), wide receiver (WR), tight end (TE), offensive tackle (T), guard (G), center (C), edge (ED), defensive tackle (DT), linebacker (LB), cornerback (CB) and safety (S).
"The models use a decision-tree-based algorithm called XGBoost to predict the likelihood that the player will become an NFL starter or Pro Bowler within the first three seasons of his career. The models are trained on historical data from the NFL combine (since 2003) and on-field college statistics (since 2005), with rigorous feature selection techniques applied to each position-specific model. The resulting probabilities are converted into composite scores for each player -- representing athleticism, production, size and final overall score -- driven by the key traits that best predict NFL success."
Harvard student determines ball carrier's "effective acceleration" was most important for estimating yards gained on a handoff play
The second annual Big Data Bowl focused on predicting the outcomes of rushing plays during the 2019 season. Participants were provided with the NFL's Next Gen Stats, including speed, direction, and location information for all 22 players on the field at the moment a ball carrier receives the ball, and were tasked with predicting where the ball carrier would end up.
This year, six collegiate finalists presented their work to NFL club analytics staff at the NFL Combine in Indianapolis. Three honorable mention papers were also named. Below is a summary of each presentation and a link to the complete entry.
GRAND FINALIST: Matt Ploenzke (Harvard)
Ploenzke used Next Gen Stats data to build interpretable model inputs based upon football-specific domain knowledge, ultimately highlighting the importance of ball carrier downfield acceleration and unblocked tackler distance and spacing.
Key stat: Among roughly 40 input variables, a ball carrier's "effective acceleration" was the most important for estimating yards gained on a handoff play.
Kellin Rumsey, Brandon DeFlon (University of New Mexico)
The battle between blocker and defender is often decided by leverage. In this paper, Rumsey and DeFlon define offensive and defensive leverage, and study the statistical properties of these metrics.
Key stat: In the first six weeks of the 2017 season, Blake Martinez (Green Bay Packers) was among the league's best at generating defensive leverage. Martinez finished the season with the third-most solo tackles (96).
Graham Pash, Walker Powell (NC State)
Pash and Powell used kinematic data such as player positions and velocity to determine zones of control for both the offensive and defensive teams at the time of the handoff. These zones of control predict the probabilities of yards lost or gained and quantify the risk involved with plays.
Key stat: Robert Woods (Los Angeles Rams) and Raheem Mostert (San Francisco 49ers) outperformed the model predictions the most, averaging nearly three more yards than predicted over the 2017 and 2018 seasons.
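The idea of a zone of control built from positions and velocities can be sketched in a few lines. This is only an illustration, not Pash and Powell's method: it assumes each player keeps a constant velocity for a short lookahead and that a spot on the field belongs to whichever player's projected position is closest. The lookahead value, the grid, the coordinates and the function names are all hypothetical.

```python
import math

def projected(player, lookahead):
    """Where the player would be after `lookahead` seconds at constant velocity."""
    return (player["x"] + player["vx"] * lookahead,
            player["y"] + player["vy"] * lookahead)

def owner(point, players, lookahead=0.5):
    """Team of the player whose projected position is closest to `point`."""
    def dist(p):
        px, py = projected(p, lookahead)
        return math.hypot(point[0] - px, point[1] - py)
    return min(players, key=dist)["team"]

def offense_share(players, x_range, y_range, lookahead=0.5):
    """Fraction of one-yard grid cells controlled by the offense."""
    cells = [(x, y) for x in range(*x_range) for y in range(*y_range)]
    owned = sum(owner(c, players, lookahead) == "offense" for c in cells)
    return owned / len(cells)

# Invented snapshot at the handoff: positions in yards, velocities in yards per second.
blocker = {"team": "offense", "x": 10.0, "y": 20.0, "vx": 4.0, "vy": 0.0}
tackler = {"team": "defense", "x": 20.0, "y": 20.0, "vx": 0.0, "vy": 0.0}
static_blocker = dict(blocker, vx=0.0)  # same player, standing still

print(offense_share([blocker, tackler], (0, 31), (0, 41)))
print(offense_share([static_blocker, tackler], (0, 31), (0, 41)))
```

Using velocity as well as position is what separates this from a plain nearest-player map: a blocker moving toward a defender claims ground that a stationary blocker at the same spot would not.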
Namrata Ray, Jugal Marfatia (Washington State University)
Ray and Marfatia measured the open space of the rusher at three time intervals — handoff, after a half-second, and after one second — to understand the association between open space and yards gained. Results indicated that the difference in the open space between the time of handoff and after a half-second or full second was a strong predictor of the number of yards gained.
Key stat: Yards gained by the rusher increase by four on average for every one percent increase in the additional open area created within a half-second of the handoff.
Alex Stern (University of Virginia)
Using an advanced machine learning algorithm, Stern assessed the value of initial space created for the ball carrier by the offensive line. That space was then linked to linemen grades, and standardized by accounting for the number of defensive backs, linebackers and defensive linemen on the play, the defensive strength of the opposing team, and the running direction of the running back.
Key stat: In 2018, New Orleans Saints center Max Unger received a top-five grade according to Stern's space grade rank for centers, despite being the 31st-graded run blocker by Pro Football Focus.
Caio Brighenti (Colgate University)
Brighenti computed each team's control of the field at the moment of the handoff to predict the outcome of rushes. Brighenti found that offensive control at the running back's expected point of intersection with the line of scrimmage was the most important predictor of run yardage.
Key stat: The critical factor separating successful and unsuccessful plays is ownership of the run gap at the line of scrimmage. Even on plays gaining more than 10 yards, the difference in field control past the line of scrimmage was almost negligible.
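To make "offensive control at the expected point of intersection" concrete, the sketch below works through one hypothetical play. It is an illustration under stated assumptions, not Brighenti's model: the crossing point is found by extending the ball carrier's current velocity in a straight line to the line of scrimmage, and control is a simple share of distance-decaying (Gaussian) influence. The `scale` parameter, the coordinates and the function names are invented.

```python
import math

def expected_crossing(carrier, los_x):
    """Extend the carrier's velocity in a straight line to the line of scrimmage."""
    if carrier["vx"] <= 0:
        return None  # not moving toward the line, so no crossing is projected
    t = (los_x - carrier["x"]) / carrier["vx"]
    return (los_x, carrier["y"] + carrier["vy"] * t)

def offensive_control(point, offense, defense, scale=2.0):
    """Offense's share (0 to 1) of distance-decaying influence at `point`."""
    def influence(players):
        return sum(
            math.exp(-((px - point[0]) ** 2 + (py - point[1]) ** 2) / (2 * scale ** 2))
            for px, py in players
        )
    off, dfn = influence(offense), influence(defense)
    return off / (off + dfn)

# Invented snapshot at the handoff, in yards; the offense moves toward larger x.
carrier = {"x": 20.0, "y": 26.0, "vx": 5.0, "vy": -1.0}
blockers = [(24.5, 24.0), (24.5, 26.0)]
defenders = [(27.0, 22.0), (27.0, 29.0)]

gap = expected_crossing(carrier, los_x=25)
print(gap, round(offensive_control(gap, blockers, defenders), 2))
```

A value near 1 means the blockers own the gap the runner is headed for, and a value near 0 means defenders do. Evaluating control at that single point, rather than across the whole field, matches the finding that what happens at the line of scrimmage matters most.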
HONORABLE MENTION PAPERS
Bryant Davis (University of Florida)
Aaron Kruchten (Carnegie Mellon)
Lucas Wu, Dani Chu, Matthew Reyers (Simon Fraser)