Creating a Flowchart of a Goal in Soccer/Football
By Aaron Nielsen
After the recent OPTA Pro forum, I posted some comments, including my proposed paper - Opta Pro Research Paper. I got a decent amount of back and forth on my Twitter account @ENBSports, including links to other work. What I noticed from others is that the MCFC data released in 2011-2012 by OPTA is having an influence, with writers trying to take advantage of its 211 columns of data. Personally, the only thing I really looked at when it came out was free kick shot opportunities - Direct Free Kicks, as data for La Liga was also available through Marca magazine, so I thought it made for a good comparison piece of analysis.
My main concern with such a detailed data set is the cost and time needed to tabulate this depth of data, weighed against what can be gained from it. In tabulating my own data, I narrowed down over time what I thought were the most valuable statistics, so I'm now able to cover over 60 leagues, having tabulated data for over 20 years. Now that more detailed statistics are becoming a bigger part of the everyday conversation, including most recently Statsbomb.com - Expected Goals, I decided to reopen my MCFC Excel sheet and reexamine the data from the 2011-2012 English Premier League season, to see if I could find specific information in it that could be used to understand the game better through statistical data.
For my first attempt, I wanted to see if I could use the data to work out a proper flowchart to a goal. One reason for this is that one of my biggest issues in soccer is that most people, including statisticians, see shots (including missed and blocked shots) as a positive stat. It's true that almost all goals come from a shot via a foot or head, but they come specifically from a shot on target (on average around 30% of all shots), while most shots off target lead to a restart from the goalkeeper and an end to the offensive possession. So when recording less detailed data, I've decided to only count shots on target as shots and ignore other shot attempts, as I feel total shots give a poor representation of the success this stat is assumed to show for a player. However, since OPTA makes the full shot data available, I've decided to use it in this analysis to help break down each offensive possession.
I start the flowchart of a goal with an offensive possession, which can come from Open Play, a Penalty, a Free Kick (Direct or Indirect), a Corner, or a Throw-In. All goals, including own goals, fit under the applicable category, and an opponent’s defensive turnover leads to an open play offensive possession even if that possession only includes a shot. OPTA doesn't record each individual offensive possession in the stats, but it does show 1025 goals and 10891 total shots, together with 10496 Goalkeeper Distributions and 22555 clearances. Those stats do not include other ways in which we can assume possession was lost, but one could estimate that in total there were about 40,000 offensive possessions, or a goal in every 40 or so possessions.
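
The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the 40,000 figure is the article's own estimate, and treating goalkeeper distributions plus clearances as the "recorded" possession endings is my reading of the paragraph, not something OPTA defines.

```python
# Season totals quoted in the text (MCFC/OPTA data, 2011-2012 Premier League).
goals = 1025
goalkeeper_distributions = 10496
clearances = 22555

# Possession endings that the data set records directly (assumed reading).
recorded_endings = goalkeeper_distributions + clearances

# The article's rough estimate once unrecorded turnovers are included.
estimated_possessions = 40_000

possessions_per_goal = estimated_possessions / goals

print(recorded_endings)                # 33051
print(round(possessions_per_goal, 1))  # 39.0
```

The gap between 33,051 recorded endings and the 40,000 estimate is the allowance for fouls, wayward passes and other turnovers that the data does not capture.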
OPTA does a great job of analyzing penalties, free kicks and corners, so I will start there. First, in terms of free kicks, OPTA judged that there were 1884 Free Kicks in a dangerous area, not including penalties. Of these free kick opportunities, 854 led to a shot, including 553 direct attempts on net that produced 29 direct goals, or one in every 19.1 attempts. OPTA did record shot information, but in terms of a flow chart it’s difficult to analyze the play because we don't know how the play concluded on a direct attempt that did not lead to a goal. Of the 1291 free kicks that were passed instead of shot directly, 301 led to shots via a key pass and 50 to goals, or a goal in every 25.8 attempts, although again we have no information on plays that didn't lead to a goal. Meanwhile, there were 100 penalties, with 72 converted, 23 saved and 5 off target.
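
For readers who want to reproduce the ratios above, here is a small sketch using only the figures quoted in this paragraph. The helper name `attempts_per_goal` is my own and not an OPTA term.

```python
def attempts_per_goal(attempts: int, goals: int) -> float:
    """Average number of attempts needed to produce one goal."""
    return attempts / goals

# Direct free kick shots and free kicks that were passed (figures from the text).
direct_rate = attempts_per_goal(553, 29)
passed_rate = attempts_per_goal(1291, 50)

# Penalties: the three outcomes should account for every penalty taken.
penalties, converted, saved, off_target = 100, 72, 23, 5
penalty_conversion = converted / penalties

print(round(direct_rate, 1))   # 19.1
print(round(passed_rate, 1))   # 25.8
print(penalty_conversion)      # 0.72
print(converted + saved + off_target == penalties)  # True
```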
In terms of Corners, OPTA has a total of 4321 corners that led to 129 goals inside the box, or a goal in every 33.5 attempts. Of these Corners, 3496 were played into the box, with 1163 being "successful", which I assume means they led to a shot, and 377 of those shots were on target. There were 663 short corners, although OPTA gives no detail on any further actions from these short corners. Meanwhile, throw-ins led to 20 goals from inside the box, 67 shots on target and 184 shots off target, although we don't have information on how many throw-ins were attempted in that zone or, again, what happened to the play on corners and throw-ins other than the ones that led to goals and shots.
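
The corner numbers can be laid out as a simple funnel. This is a sketch only, and it carries over the assumption made above that a "successful" corner is one that led to a shot.

```python
# Corner figures quoted in the text.
corners = 4321
corner_goals = 129
into_box = 3496
successful = 1163        # assumed to mean "led to a shot"
shots_on_target = 377
short_corners = 663

corners_per_goal = corners / corner_goals
success_share = successful / into_box
on_target_share = shots_on_target / successful

# Corners not described by either the "into the box" or "short" figure.
unclassified = corners - into_box - short_corners

print(round(corners_per_goal, 1))  # 33.5
print(f"{success_share:.1%}")      # 33.3%
print(f"{on_target_share:.1%}")    # 32.4%
print(unclassified)                # 162
```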
That leads to open play opportunities. Unfortunately, with some of the data missing above, we can't work out all of the open play data. The data shows there were 725 goals not counted above, so we can assume they were from open play; of these, 577 goals were scored from inside the box and 148 from outside. We can't break down the inside shots due to missing data, but of the shots from outside the box, minus direct free kicks, 1164 were on target, 1395 were blocked and 1819 missed the net. Overall, 3522 shots were on target, 2902 were blocked and 4467 missed the target, and we can also assume most of the shots that missed the target led to an opposition restart.
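
The 725 open play goals are a residual: total goals minus every set play category counted earlier. A short sketch of that subtraction, along with the overall shot split, using only numbers from the article (the dictionary labels are mine):

```python
total_goals = 1025

# Goals already attributed to set plays earlier in the article.
set_play_goals = {
    "direct free kick": 29,
    "passed free kick": 50,
    "penalty": 72,
    "corner": 129,
    "throw-in": 20,
}
open_play_goals = total_goals - sum(set_play_goals.values())

# Open play goals split by location; the two parts should sum to the residual.
inside_box, outside_box = 577, 148

# Overall shot outcomes for the season.
shots = {"on target": 3522, "blocked": 2902, "missed": 4467}
total_shots = sum(shots.values())
on_target_pct = round(100 * shots["on target"] / total_shots, 1)

print(open_play_goals)                               # 725
print(inside_box + outside_box == open_play_goals)   # True
print(total_shots)                                   # 10891
print(on_target_pct)                                 # 32.3
```

The on-target share of about 32% lines up with the "around 30% of all shots" figure used earlier.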
The problem with the flow chart concept is that we’re missing key stats to go along with the data that OPTA does record. For example, of the Shots on Target, we don't know how many were controlled by the goalkeeper, changing over possession. So I would suggest that OPTA start recording rebounds and saves leading to a corner, so we have a better sense of what happens to shots on target that do not count as a goal. We also don't have any information regarding an offensive player's mistake in the offensive zone, for example a foul, a wayward pass, or a loss of possession out of bounds. So we can only infer when the offensive team loses possession by looking at clearances and goalkeeper distribution, where we assume the offensive possession is over since the keeper has control of the ball.
There is also an issue, in my view, with the stat clearances. As with shots, the assumption is that this defensive statistic records a positive play, yet we have no further information regarding the clearances, which affects the flow chart concept and, like total shots, makes it a misleading stat. I'm assuming a clearance includes kicking the ball out of bounds and giving the opponent the ball for a throw-in or corner. If so, looking at the goal per attempt stat, this could be problematic, because with only the data in front of us the number of attempts per goal is lower for a corner or a throw-in than in open play. So analytically a clearance could actually increase the opponent’s chance of scoring, depending on the clearance.
Ultimately, for a flow chart you would want each play to lead in to one of three results: a goal; the offense getting another offensive possession via a rebound, clearance, or set play opportunity; or the defense controlling the ball and putting an end to the offensive possession. Based on this, we can start looking at general views of possession but also at specifics. For example, if a particular player tends to kick the ball out for a corner while under pressure, a team can use that to its advantage in opposition analysis. Or alternatively, if a team likes to shoot, maybe allow it, because you know the chance of scoring is poor and a shot off target leads to a loss of possession.
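
If possession-level data were available, the three results described above could be tallied along these lines. This is a hypothetical sketch: the `Start` and `Result` names, the `goal_rate_by_start` helper and the sample records are all invented for illustration and do not come from OPTA.

```python
from collections import Counter
from enum import Enum

class Start(Enum):
    OPEN_PLAY = "open play"
    PENALTY = "penalty"
    FREE_KICK = "free kick"
    CORNER = "corner"
    THROW_IN = "throw-in"

class Result(Enum):
    GOAL = "goal"
    RETAINED = "offense gets another possession"  # rebound, clearance, set play
    LOST = "defense controls the ball"

def goal_rate_by_start(possessions):
    """Share of possessions ending in a goal, per way the possession started."""
    starts = Counter(start for start, _ in possessions)
    goals = Counter(start for start, result in possessions if result is Result.GOAL)
    return {start: goals[start] / count for start, count in starts.items()}

# Hypothetical records, invented purely to show the shape of the data.
sample = [
    (Start.CORNER, Result.RETAINED),
    (Start.CORNER, Result.GOAL),
    (Start.OPEN_PLAY, Result.LOST),
    (Start.OPEN_PLAY, Result.LOST),
]

rates = goal_rate_by_start(sample)
print(rates[Start.CORNER])     # 0.5
print(rates[Start.OPEN_PLAY])  # 0.0
```

Filtering the same records by player or team would give the kind of opposition analysis mentioned above, such as which defender most often concedes a corner under pressure.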
I'm impressed with OPTA's work and, with a background in most sports, it amazes me how much effort they put into collecting it, especially when the long history of stats recorded for North American sports is traditionally in far less detail. Although I must say that when breaking down a sport I find traditional North American sports data much more useful. Doing my own soccer stats is the same, because I know what I want to get out of the data before I start to tabulate it. I fight with teams and leagues all the time over the value of stats in soccer, and to dismiss my work they show me huge books of animated drawings that they believe break down every play in soccer. The reality is that what breaks down every play is what really happens over a good number of samples, which statistics can capture. All we need to do now is prove it through our work.
Nice flow chart. One small thing: other players than the GK can block an on target shot.
I agree with that assessment, although in my analysis the offensive possession ends when the defense controls the ball, so in the case of a player making a goal line clearance, the result is that either the offensive team retains possession or another defender clears the ball. This analysis is also only evaluating offensive efficiency. I might do a counter article regarding defense, although I don't know if there is enough data to draw the same conclusions.
Personally, in my own data I don't count goal line clearances or a player blocking a shot on target, as so far in my analysis they don't give any clear understanding of the player's ability other than being in the way of the shot, and goal line clearances are also quite random. The goal of my work in general is evaluating consistent performance through statistics, using either basic or detailed data, although so far I'm only happy with the consistent, projectable numbers for a few stats.