How Are Football Match Stats Collected? From the Stadium to Your Screen
Football match stats reach your screen through a chain of human and AI data-gathering. We explain the Opta Forza process, the role of human analysts, automated tracking, and ethical questions.
Football match stats reach your screen through a hybrid pipeline of human analysts and automated tracking. The dominant industry process is Opta Forza: three trained analysts in front of monitors at every Premier League match, hand-coding every on-ball action in real time. Tracking systems (StatsBomb 360, Second Spectrum, SkillCorner) add the off-ball data via computer vision. The combined data flows to a central hub and out to broadcasters, apps, and analytics platforms in under 30 seconds.
The human side: Opta Forza
For every Premier League match, three Opta-trained coders work the live data feed, backed by a fourth analyst for quality assurance:
- Coder 1: possession and passing. Codes every pass: who, to whom, success/failure, body part, pass length category.
- Coder 2: defensive actions. Codes every tackle, interception, clearance, foul.
- Coder 3: shots and complex events. Codes shots (with location coordinates, body part, xG inputs), set-piece deliveries, and edge cases.
- Quality assurance. A fourth analyst reviews edge cases in real time and corrects any miscodes flagged by automated checks.
A 90-minute Premier League match generates 2,000-3,000 hand-coded events. Splitting the work across three trained coders in parallel is what lets them keep up with the live feed.
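The fields a coder records for each pass can be pictured as a simple event record. The sketch below is illustrative only: the class name, field names, and category labels are assumptions made for this example, not Opta's actual schema. It also works out the average coding rate implied by 2,000-3,000 events in 90 minutes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PassEvent:
    """Hypothetical record for one hand-coded pass (not Opta's real schema)."""
    minute: int
    passer: str
    recipient: Optional[str]  # None when the pass does not reach a teammate
    successful: bool
    body_part: str            # assumed labels, e.g. "right_foot", "head"
    length_category: str      # assumed labels, e.g. "short", "medium", "long"


def events_per_minute(total_events: int, match_minutes: int = 90) -> float:
    """Average number of events coded per minute of play."""
    return total_events / match_minutes


# Placeholder names, not real match data
example = PassEvent(
    minute=12,
    passer="Player A",
    recipient="Player B",
    successful=True,
    body_part="right_foot",
    length_category="short",
)

low = events_per_minute(2000)
high = events_per_minute(3000)
print(f"{low:.1f} to {high:.1f} events per minute across the coding team")
```

At roughly 22 to 33 events per minute for the whole team, the load per coder depends on how play splits between passing, defending, and shooting, which is one reason the roles are divided by event type.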
The automated side: tracking systems
Two major tracking technologies provide the off-ball data, and a third product combines tracking with event data:
- Camera-based tracking (Second Spectrum, Hawk-Eye). Multiple high-resolution cameras around the stadium track every player and the ball at 25 frames per second. Computer vision converts the video into x,y coordinates.
- Sensor-based tracking (Catapult, STATSports). GPS pods worn by players record position. More common in training; less in matches.
- StatsBomb 360. Combines camera tracking with StatsBomb's event data, providing player positions at the moment of every key on-ball event.
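To give a sense of scale, the 25 frames-per-second figure can be turned into a rough sample count. The sketch below assumes 22 players plus the ball and exactly 90 minutes of play with no stoppage time. The function names are hypothetical, and the distance helper simply sums straight-line steps between consecutive x,y positions, which is one basic way such coordinates could be used rather than any provider's actual method.

```python
import math
from typing import List, Tuple

FPS = 25                 # frame rate quoted for camera-based tracking
TRACKED_OBJECTS = 23     # assumption: 22 players plus the ball
MATCH_SECONDS = 90 * 60  # assumption: no stoppage time


def position_samples(fps: int = FPS, objects: int = TRACKED_OBJECTS,
                     seconds: int = MATCH_SECONDS) -> int:
    """Number of (x, y) samples produced over a match."""
    return fps * objects * seconds


def distance_covered(track: List[Tuple[float, float]]) -> float:
    """Sum of straight-line distances between consecutive positions."""
    return sum(math.dist(a, b) for a, b in zip(track, track[1:]))


print(position_samples())  # 3105000

# Hypothetical four-frame track, coordinates in metres
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0)]
print(distance_covered(track))  # 10.0
```

Under these assumptions a single match yields about 3.1 million position samples, compared with 2,000-3,000 hand-coded events.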
How the data is verified
Three layers of verification:
- Real-time peer review. The fourth analyst on each match catches flagged issues during play.
- Post-match scrub. Within 1-2 hours of full-time, a senior analyst reviews the entire match and corrects any errors.
- Cross-comparison. Different data providers (Opta, StatsBomb, Wyscout) collect overlapping events. When their numbers disagree, both flag it for review.
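The cross-comparison layer can be pictured as a diff between two providers' counts for the same match. This is a minimal sketch under stated assumptions: the function name, the `tolerance` parameter, and the sample numbers are all hypothetical, and real providers may compare at the level of individual events rather than totals.

```python
from typing import Dict, List


def flag_disagreements(provider_a: Dict[str, int],
                       provider_b: Dict[str, int],
                       tolerance: int = 0) -> List[str]:
    """Return stat names whose counts differ by more than `tolerance`.

    Only stats reported by both providers are compared.
    """
    shared = provider_a.keys() & provider_b.keys()
    return sorted(
        stat for stat in shared
        if abs(provider_a[stat] - provider_b[stat]) > tolerance
    )


# Hypothetical counts for one team in one match (not real data)
feed_a = {"shots": 14, "tackles": 18, "interceptions": 9}
feed_b = {"shots": 14, "tackles": 21, "interceptions": 10, "pressures": 150}

print(flag_disagreements(feed_a, feed_b))               # ['interceptions', 'tackles']
print(flag_disagreements(feed_a, feed_b, tolerance=1))  # ['tackles']
```

A non-zero tolerance is one way to separate small definitional differences from gaps large enough to deserve a manual review.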
Where the data goes
Match data flows to multiple endpoints:
- Broadcasters. Sky Sports, TNT Sports (formerly BT Sport), and others receive live event data for graphics overlays during matches.
- Football apps. Sofascore, Fotmob, and FBref all license event data from Opta or its competitors.
- Clubs. Use the data for opposition analysis, recruitment, training planning.
- Bookmakers. Live in-play odds adjust based on event data within seconds of each event.
- Public + paid analytics. FBref publishes free; StatsBomb and Opta Pro charge for deeper access.
How long does it take?
Latency between match action and data on your screen is typically:
- Live broadcast graphics. ~1-2 seconds.
- Sofascore / Fotmob match views. ~5-15 seconds depending on event type.
- Bookmaker odds updates. ~3-10 seconds depending on the bet type.
- FBref deep stats. ~30 minutes after full-time (real-time isn't their priority).
- Cross-checked / archived data. 24-48 hours for fully reviewed canonical figures.
Cost and ethical considerations
Three industry topics:
- Cost. A Premier League match's data collection costs ~£10,000-30,000 in human-analyst labour and tracking equipment. The total Premier League match-coverage market is worth £100m+ annually.
- Player consent. Players are tracked extensively; data is used in performance reviews, contract negotiations, and recruitment models. The Professional Footballers' Association (PFA) has been pushing for clearer player consent and data-use rules.
- Lower-tier coverage gaps. Premier League and Championship are fully covered. Lower English tiers, women's football below WSL, and most non-European leagues have partial or no coverage. This affects scouting equity.
How to use the data well
Three principles for fans and amateur analysts:
- Trust archived numbers more than live numbers. Live event data has higher error rates; the canonical figures (24-48 hours post-match) are more reliable.
- Cross-reference between providers. If an Opta number and a StatsBomb number disagree, the disagreement is usually a methodology difference, not an error.
- Check the context. A "20 shots" stat is meaningless without xG context; a "70% possession" stat is meaningless without progression context.
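The third principle can be made concrete with a small calculation. The sketch below uses invented xG values, not real match data, and a hypothetical helper name; it shows how a team with far fewer shots can still have created the better chances.

```python
from typing import Dict, List


def shot_summary(xg_values: List[float]) -> Dict[str, float]:
    """Summarise a team's shots: count, total xG, and average xG per shot."""
    shots = len(xg_values)
    total_xg = sum(xg_values)
    per_shot = total_xg / shots if shots else 0.0
    return {
        "shots": shots,
        "xg": round(total_xg, 2),
        "xg_per_shot": round(per_shot, 3),
    }


# Hypothetical xG values (not real match data)
team_a = [0.03] * 20                            # 20 speculative efforts
team_b = [0.35, 0.40, 0.25, 0.30, 0.20, 0.30]   # 6 clear chances

summary_a = shot_summary(team_a)
summary_b = shot_summary(team_b)
print(summary_a)
print(summary_b)
```

In this made-up case the "20 shots" side totals about 0.6 xG, while the side with 6 shots totals about 1.8 xG, so the raw shot count points the wrong way.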
Frequently asked questions
- How are football match stats collected?
- Through a hybrid pipeline. Opta Forza is the dominant process: three trained human analysts at every Premier League match hand-code on-ball events in real time. Tracking systems (StatsBomb 360, Second Spectrum, Hawk-Eye) add off-ball data via computer vision. The combined data flows to a central hub and out to broadcasters, apps, and analytics platforms in under 30 seconds.
- Are football stats collected by humans or AI?
- Both. Humans hand-code on-ball events (passes, tackles, shots), with three trained Opta coders working each match in parallel. Automated computer-vision tracking systems collect off-ball player positions at 25 frames per second. The combination produces a complete dataset. Pure AI event-coding is being trialled but isn't yet reliable enough at the elite level.
- How long until match stats appear in apps?
- Live broadcast graphics: 1-2 seconds. Sofascore / Fotmob match views: 5-15 seconds. Bookmaker odds updates: 3-10 seconds. FBref full deep stats: ~30 minutes after full-time. Fully reviewed canonical figures: 24-48 hours. Live data has slightly higher error rates; archived data is the most reliable.
- Who pays for football match data collection?
- Match-data collection is funded by the Premier League and other major leagues, who own the broadcast rights and license event data to broadcasters, apps, and analytics platforms. Cost per Premier League match is ~£10,000-30,000 in analyst labour and equipment. The total Premier League match-coverage market is worth £100m+ annually.
References
- Opta Sports: How We Collect Data (Opta)
- StatsBomb 360: Methodology (StatsBomb)
- Second Spectrum: Computer Vision Tracking (Second Spectrum)
- Professional Footballers' Association: Data Use Position (PFA)
Reviewed by a KiqIQ editor before publication. Spotted an error? Email editor@kiqiq.com. We follow our Corrections Policy.