VAEP Football: 5 Things Every Analyst Needs to Know

Expected goals has become the most referenced metric in public football analysis, yet it assigns a value to only a fraction of the on-ball actions that occur during a match. VAEP was built specifically to close that gap, valuing every pass, carry, tackle, and shot in a single unified framework.

By David Findlay, Founder of KiqIQ.

Quick Answer: VAEP (Valuing Actions by Estimating Probabilities) is a machine learning metric in football analytics that assigns a numerical value to every on-ball action in a match by estimating how much that action changed the probability of scoring or conceding.

Definition: VAEP is a football analytics model developed by researchers at KU Leuven’s DTAI Sports Analytics Lab. It uses a machine learning classifier to estimate the change in scoring and conceding probability caused by each individual on-ball action, producing a single value for any pass, dribble, shot, tackle, or other action recorded in a match.

Key point: VAEP values actions by measuring probability change, not by counting outcomes. A pass that raises the team’s scoring probability by 0.03 while leaving its conceding probability unchanged is valued at +0.03 VAEP. A poor touch that increases the opponent’s chance of scoring is assigned a negative value, regardless of whether a goal follows.

Origins and Research Background

VAEP was developed by Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis at KU Leuven’s DTAI Sports Analytics Lab in Belgium. The metric was formalised in the paper “Actions Speak Louder than Goals: Valuing Player Actions in Soccer,” published on arXiv (1802.07127) and presented at KDD 2019. An extended abstract titled “VAEP: An Objective Approach to Valuing On-the-Ball Actions in Soccer” followed at IJCAI 2020, the International Joint Conference on Artificial Intelligence.

The research team’s central argument was that football analytics had become disproportionately focused on shots and goals, which account for a small fraction of all on-ball events in a match. Their work proposed a framework that could quantify the contribution of all recorded actions using probability estimation rather than outcome counting.

How VAEP Is Calculated

The VAEP formula, as documented in the socceraction library published by KU Leuven, operates on two probability components:

For any action a:

VAEP(a) = ΔP_score(a) − ΔP_concede(a)

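Here ΔP_score(a) is the change, from the game state before action a to the game state after it, in the estimated probability that the acting team scores within a fixed window of subsequent actions (ten in the original paper). ΔP_concede(a) is the corresponding change in the probability that the team concedes in the same window.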
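The arithmetic behind the formula can be sketched in a few lines of Python. This is a minimal illustration, assuming the four probabilities have already been estimated by some model; the function name `vaep_value` and the numbers are hypothetical and are not taken from socceraction.

```python
def vaep_value(p_score_before, p_score_after, p_concede_before, p_concede_after):
    """Return (offensive, defensive, total) value for one action.

    Hypothetical helper: the four inputs are assumed to be model estimates
    of scoring and conceding probability before and after the action.
    """
    delta_score = p_score_after - p_score_before        # change in P(score)
    delta_concede = p_concede_after - p_concede_before  # change in P(concede)
    # A rise in conceding probability counts against the action.
    return delta_score, -delta_concede, delta_score - delta_concede


# Illustrative numbers only: a forward pass that lifts scoring probability
# from 0.02 to 0.05 while conceding probability rises from 0.010 to 0.015.
offensive, defensive, total = vaep_value(0.02, 0.05, 0.010, 0.015)
print(round(offensive, 3), round(defensive, 3), round(total, 3))
```

Splitting the result into an offensive and a defensive part makes it easier to see whether a player’s value comes from creating danger or from reducing it.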

A positive VAEP value means the action’s net effect favoured the acting team: it raised the team’s scoring probability, lowered its conceding probability, or both. A negative value means the action moved probability in the opponent’s favour. The model estimates these probabilities with gradient boosted binary classifiers, one for scoring and one for conceding, trained on historical match data.
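Training on historical data requires a pair of binary labels for every action: did the acting team score soon afterwards, and did it concede? The sketch below shows one way such labels could be built. It is a simplification under stated assumptions: each action is a dict with the hypothetical keys "team" and "is_goal", the window of 10 actions is passed as a parameter rather than fixed, and own goals are ignored.

```python
def label_actions(actions, window=10):
    """Build (scores, concedes) training labels for each action.

    Assumptions: `actions` is a time-ordered list of dicts with hypothetical
    keys "team" and "is_goal"; own goals are ignored for simplicity.
    """
    labels = []
    for i, action in enumerate(actions):
        upcoming = actions[i:i + window]  # current action plus the ones after it
        scores = any(a["is_goal"] and a["team"] == action["team"] for a in upcoming)
        concedes = any(a["is_goal"] and a["team"] != action["team"] for a in upcoming)
        labels.append((int(scores), int(concedes)))
    return labels


match = [
    {"team": "A", "is_goal": False},  # pass by team A
    {"team": "B", "is_goal": False},  # interception by team B
    {"team": "B", "is_goal": True},   # shot by team B that scores
]
print(label_actions(match))  # [(0, 1), (1, 0), (1, 0)]
```

Team A’s pass is labelled as preceding a conceded goal, while both of team B’s actions are labelled as preceding a goal scored.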

Probability estimates use the game state before and after the action, where game state is represented by the three most recent on-ball actions. Features fed into the classifier include action type, outcome, spatial coordinates on the pitch, time elapsed, and match context including scoreline and home or away status.
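To make the idea of a three-action game state concrete, the following sketch flattens the current action and its two predecessors into a single feature dictionary. The field names, the padding rule for the first actions of a match, and the `time_delta` feature are illustrative choices made here, not a description of the library’s exact feature set.

```python
def game_state_features(actions, i, n_prev=3):
    """Flatten action i and its predecessors into one feature dict.

    Assumptions: `actions` is a time-ordered list of dicts with hypothetical
    keys "type", "result", "x", "y", "seconds". When fewer than n_prev
    actions exist, the earliest available action is repeated as padding.
    """
    features = {}
    for k in range(n_prev):
        a = actions[max(i - k, 0)]  # k = 0 is the current action
        for key in ("type", "result", "x", "y", "seconds"):
            features[f"{key}_a{k}"] = a[key]
    # Time elapsed between the oldest action in the state and the current one
    features["time_delta"] = features["seconds_a0"] - features[f"seconds_a{n_prev - 1}"]
    return features


actions = [
    {"type": "pass", "result": "success", "x": 30.0, "y": 34.0, "seconds": 10.0},
    {"type": "dribble", "result": "success", "x": 45.0, "y": 30.0, "seconds": 12.5},
    {"type": "cross", "result": "fail", "x": 90.0, "y": 10.0, "seconds": 16.0},
]
state = game_state_features(actions, 2)
print(state["type_a0"], state["type_a1"], state["type_a2"], state["time_delta"])
```

A classifier would receive one such dictionary per action, with categorical fields such as the action type encoded numerically first.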

What Is SPADL?

VAEP operates on top of SPADL, the Spatiotemporal Action Description Language. SPADL is a standardised data format, also developed at KU Leuven, that represents individual player actions in a consistent structure regardless of the original data provider. It records the action type, location, time, and outcome for each on-ball event in a match.

The standardisation is significant for analytical work. Raw event data from different providers uses different schemas and action taxonomies. SPADL normalises these into a single format, making it possible to train VAEP models across data sets from multiple suppliers without custom preprocessing for each source.
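The kind of normalisation SPADL performs can be illustrated with a much simplified record type and two converters. Everything in this sketch is an assumption made for illustration: the `Action` class is not the full SPADL schema, the two provider formats are invented, and the 105 x 68 metre pitch is used as the common coordinate system.

```python
from dataclasses import dataclass

PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # metres; assumed common coordinate system


@dataclass
class Action:
    """Simplified, SPADL-like record (not the full SPADL schema)."""
    action_type: str
    x: float
    y: float
    seconds: float
    success: bool


def from_provider_a(event):
    # Hypothetical provider A: coordinates as percentages of the pitch (0-100)
    return Action(
        action_type=event["event_name"].lower(),
        x=event["x_pct"] / 100 * PITCH_LENGTH,
        y=event["y_pct"] / 100 * PITCH_WIDTH,
        seconds=event["minute"] * 60 + event["second"],
        success=event["outcome"] == 1,
    )


def from_provider_b(event):
    # Hypothetical provider B: a 120 x 80 coordinate grid and nested fields
    x, y = event["location"]
    return Action(
        action_type=event["type"]["name"].lower(),
        x=x / 120 * PITCH_LENGTH,
        y=y / 80 * PITCH_WIDTH,
        seconds=event["timestamp_seconds"],
        success=event["result"] == "complete",
    )


# The same centre-circle pass, described in two different schemas
a = from_provider_a({"event_name": "Pass", "x_pct": 50, "y_pct": 50,
                     "minute": 12, "second": 30, "outcome": 1})
b = from_provider_b({"type": {"name": "Pass"}, "location": [60, 40],
                     "timestamp_seconds": 750.0, "result": "complete"})
print(a == b)  # True
```

Once both feeds produce identical records for the same event, everything downstream (features, labels, model training) can be written once.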

The socceraction Python library, released publicly by KU Leuven’s ML research group on GitHub and PyPI, implements both SPADL conversion and VAEP calculation. It accepts event data from multiple providers and outputs per-action VAEP values that can be aggregated at player or team level.

VAEP Versus Expected Goals

The comparison with xG is built into VAEP’s research framing. Expected goals measures the quality of shot attempts based on their location and characteristics. It assigns value only at the moment of the shot, leaving the passes, runs, and combinations that created the opportunity without any individual valuation.

VAEP extends the valuation to the full action sequence. The pass that switched play to create space, the dribble that broke a defensive line, and the tackle that recovered possession are all assigned a value in the same unit. This makes VAEP applicable for evaluating players whose on-ball contribution mostly happens well before the shot: the holding midfielder who intercepts, the centre-back who carries the ball forward to trigger an attack.

The limitation that VAEP shares with xG is that both work only on recorded, on-ball events. Actions without ball contact, such as positioning runs, pressing without winning the ball, and maintaining defensive shape, are not captured by either framework. Metrics based on tracking data address some of these gaps, but VAEP operates on event data, which remains the more widely available data type.

5 Things Analysts Use VAEP For

1. Full-Match Player Contribution Scoring

Aggregating per-action VAEP values across all of a player’s actions in a match, or across a season, produces a single contribution score that covers all on-ball involvement. This is more complete than shot-based metrics for players who rarely appear in the box.
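A minimal sketch of this aggregation, assuming per-action values are already available as (player, value) pairs; the player names and numbers are invented for illustration.

```python
from collections import defaultdict


def total_vaep_by_player(rated_actions):
    """Sum per-action values for each player.

    Assumption: `rated_actions` is an iterable of (player, vaep_value) pairs
    produced by some upstream model.
    """
    totals = defaultdict(float)
    for player, value in rated_actions:
        totals[player] += value
    return dict(totals)


rated = [
    ("Midfielder 6", 0.012), ("Striker 9", 0.150), ("Midfielder 6", -0.004),
    ("Midfielder 6", 0.021), ("Striker 9", -0.030),
]
totals = total_vaep_by_player(rated)
for player, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{player}: {total:+.3f}")
```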

2. Action-Level Performance Auditing

Individual action VAEP values flag specific decisions within a match. A sequence of passes that each carry negative VAEP values can identify a player’s tendency to move the ball into lower-probability positions, even when those passes are technically completed.
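One way to run this kind of audit is to filter for passes that were completed yet still carry a negative value. The sketch below assumes each action is a dict with the hypothetical keys "player", "type", "success" and "vaep"; the threshold of zero is an arbitrary starting point.

```python
def flag_negative_completed_passes(actions, threshold=0.0):
    """Return completed passes whose value is below `threshold`.

    Assumption: each action is a dict with hypothetical keys "player",
    "type", "success" and "vaep".
    """
    return [
        a for a in actions
        if a["type"] == "pass" and a["success"] and a["vaep"] < threshold
    ]


sample = [
    {"player": "CB 4", "type": "pass", "success": True, "vaep": -0.006},
    {"player": "CB 4", "type": "pass", "success": True, "vaep": 0.010},
    {"player": "CB 4", "type": "pass", "success": False, "vaep": -0.020},
    {"player": "CB 4", "type": "tackle", "success": True, "vaep": -0.001},
]
flagged = flag_negative_completed_passes(sample)
print(len(flagged), flagged[0]["vaep"])  # 1 -0.006
```

Failed passes and other action types are excluded on purpose, so the result isolates decisions that look fine in a completion-rate table but cost the team probability.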

3. Comparing Players Across Positions

Because VAEP uses the same value unit for all action types, it can be used to compare the aggregate contribution of a winger and a defensive midfielder in a way that shot-based metrics cannot. The comparison is imperfect, but the shared currency of probability change provides a common basis.

4. Scouting and Transfer Profiling

VAEP per 90 minutes puts players with different amounts of playing time on a common scale, which allows comparison across competitions provided that comparable event data is available for both. It is particularly useful for identifying players in less-covered leagues whose per-90 values rank highly among their peers.
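The per-90 normalisation itself is simple arithmetic. The sketch below adds a minimum-minutes filter, which is an assumption introduced here (the 900-minute cut-off is arbitrary) to avoid ranking players on very small samples.

```python
def vaep_per_90(total_vaep, minutes_played, min_minutes=900):
    """Normalise a total VAEP value to a per-90-minute rate.

    Returns None below `min_minutes`, an arbitrary cut-off chosen here to
    avoid ranking players on very small samples.
    """
    if minutes_played < min_minutes:
        return None
    return total_vaep * 90 / minutes_played


# Illustrative numbers only
print(vaep_per_90(12.4, 2480))
print(vaep_per_90(1.0, 300))   # None: too few minutes to rate
```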

5. Team-Level Action Analysis

Summing VAEP values by team rather than by player shows where on the pitch a team shifted probability during a match. A team that consistently generates positive VAEP in wide zones and negative VAEP through central positions is showing a clear spatial pattern in where its effective actions are concentrated.
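A simple version of this spatial breakdown bins each action into a wide or central lane by its y-coordinate and sums values per team and lane. The sketch assumes coordinates on a 68-metre-wide pitch and a three-lane split; both are illustrative choices, as are the dict keys "team", "y" and "vaep".

```python
from collections import defaultdict

PITCH_WIDTH = 68.0  # metres; assumes y-coordinates run from 0 to 68


def lane(y):
    """Classify a y-coordinate as the central third or one of the wide thirds."""
    third = PITCH_WIDTH / 3
    return "central" if third <= y <= 2 * third else "wide"


def team_vaep_by_lane(actions):
    """Sum action values per (team, lane) pair."""
    totals = defaultdict(float)
    for a in actions:
        totals[(a["team"], lane(a["y"]))] += a["vaep"]
    return dict(totals)


sample = [
    {"team": "Home", "y": 5.0, "vaep": 0.020},    # wide
    {"team": "Home", "y": 62.0, "vaep": 0.015},   # wide, opposite flank
    {"team": "Home", "y": 34.0, "vaep": -0.010},  # central
    {"team": "Away", "y": 30.0, "vaep": 0.005},   # central
]
by_lane = team_vaep_by_lane(sample)
print(by_lane)
```

A finer grid, for example splitting by pitch third along the x-axis as well, follows the same pattern with a different binning function.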

Limitations of VAEP

The metric carries documented limitations that analysts working with it should understand:

On-ball only: VAEP values recorded actions. Off-ball positioning, pressing without ball contact, and tactical shape are not captured. Defenders and holding midfielders who contribute primarily through positioning are underrepresented.
Data quality dependency: The model’s outputs are constrained by the accuracy of the underlying event data. Errors or inconsistencies in action coding by data providers propagate directly into VAEP values.
Context compression: Representing game state from the three preceding actions compresses significant tactical context. The same pass in minute 10 of a 0-0 match and minute 85 of a 2-0 lead carries very different meaning that the model only partially captures through match context features.
Training data scope: The model’s probability estimates are trained on historical match data. Performance in novel tactical situations, or in competitions with limited training data, may be less reliable.

The KiqIQ Angle

VAEP is one of the more rigorous frameworks available to analysts working with event data, precisely because it is rooted in peer-reviewed research rather than proprietary methodology. The formula is public. The library is open-source. The paper is accessible. That transparency is rare in football analytics, where most commercially deployed models operate as black boxes. The practical limitation is the same as for any event-data metric: it tells you what happened on the ball, in sequence, across 90 minutes. What happened off the ball, which is where most of football’s value creation and suppression occurs, is still largely invisible to VAEP. Analysts who use VAEP alongside tracking-based metrics and video context are working with it correctly. Those using it as a standalone player rating are extracting a fraction of what the data can actually support.

Frequently Asked Questions

What does VAEP stand for in football?

VAEP stands for Valuing Actions by Estimating Probabilities. It is a machine learning metric developed at KU Leuven’s DTAI Sports Analytics Lab that assigns a value to every on-ball action in a football match based on how much that action changed the probability of scoring or conceding.

Who created VAEP?

VAEP was developed by Tom Decroos, Lotte Bransen, Jan Van Haaren, and Jesse Davis at KU Leuven’s DTAI Sports Analytics Lab. The research was formalised in “Actions Speak Louder than Goals: Valuing Player Actions in Soccer” and presented as an extended abstract at IJCAI 2020.

How is VAEP different from xG?

Expected goals values only shot attempts. VAEP values every on-ball action in a match, including passes, dribbles, tackles, and carries, using the same unit of measurement. This makes VAEP applicable for evaluating contributions from all positions, not only attacking players who take shots.

What is SPADL and how does it relate to VAEP?

SPADL (Spatiotemporal Action Description Language) is a standardised data format that represents individual player actions in a consistent structure. VAEP is calculated on top of SPADL-formatted data. Both are implemented in the open-source socceraction Python library, published by KU Leuven’s ML research group.

Where can I access the VAEP model?

The socceraction library, which implements VAEP and SPADL, is available as an open-source Python package on GitHub (ML-KULeuven/socceraction) and on PyPI. The library accepts event data from multiple providers and outputs per-action VAEP values.