Monday, March 5, 2012

My definition of Soccer Analytics

This past weekend was the 2012 MIT Sloan Sports Analytics Conference. Thanks to the Moneyball phenomenon the conference has been very successful, and although I have hoped to attend I have been unable to do so. During the conference an hour is dedicated to soccer. Originally soccer was part of the “other sports” section, but there has been enough interest that in 2012 it actually took place in the main conference room.

During this conference the people who have an interest in soccer also plan a meet & greet beforehand. Many of the people I read and follow on social networks such as Twitter attend it, and from what I heard the conversation during this get-together is better than the conference itself.

In general I don’t intend this article to be critical of anybody who is currently working on Soccer Analytics, but I do feel that there is a disconnect between the effort behind soccer analytics and the game itself. That is why in this article I want to give what is, in my opinion, a basic definition of Soccer Analytics and what I perceive to be its potential in the game.

I’ve been working with soccer statistics for over 20 years, and in the process I’ve tabulated millions of lines of data. I currently cover 60-plus leagues, 1,000-plus teams and close to 70,000 players. The reason I started this work is that I was a huge North American stat geek who was getting bored by those sports, so I started following soccer. What I then realized was that, in comparison to all North American sports, there were no statistics for soccer.

I created a box score/template system to collect what I regarded as essential data, using much of the same format used in a game with many similarities of play: ice hockey. I tested my system on local youth leagues and then started recording data for the English Premier League in 1992-1993 and the 1994 World Cup in the United States. Ever since I started there has been interest in the work, and for the last 15 or so years I have worked professionally as a sports statistician concentrating on soccer.

That being said, statistics in soccer are still not common. Much of my own work is fairly unique compared with anything else out there, and I’m probably the only person in the world who is familiar with these 70,000 players :). And unlike in North American sports, the general public who follow the game have no encyclopedic knowledge of statistics. Even in leagues like the English Premier League, people only know who scored the most goals; they don’t know who had the most assists or shots on target.

So first up, we’re dealing with a very naïve marketplace with no history or culture of statistics. In comparison, in baseball Bill James started his work 80 years after the data he used started being collected. For example, there was more statistical data in relation to baseball in the 1928 New York Times than there is in the current New York Times. I experience this daily when people unfamiliar with my work come across it. I would argue that I produce the least complex type of work, yet many readers are overwhelmed by the amount of data and what it means.

On the positive side, it’s an open and severely underdeveloped market with a potential audience of billions who could begin to get it. I feel soccer statistics alone have been poorly marketed as a commercial entity, with misrepresentation in broadcast and media, a limited amount of information in books, magazines and newspapers, and no properly developed fantasy game or devices that would use statistics.

Even more advantageous, this lack of general knowledge means Soccer Analytics can have a greater effect than analytics in any competing sport if used in a real-life scenario. The other reason why Soccer Analytics has so much potential is the business of the game itself. The money in soccer worldwide is greater than in all other sports combined, and the game uses a basically unregulated business system in which, instead of dealing purely with commodities, you can actually generate real money through analytics. This season there has been over $US 4,000,000,000 in player sales alone, which with inflation has been the norm for over a decade.

In 2009 Manchester United sold Cristiano Ronaldo to Real Madrid for close to $US 200,000,000. A few years before, they had bought him from Sporting Lisbon for around $US 40,000,000, which means Manchester United generated roughly $US 160,000,000. In comparison, the Milwaukee Brewers lost their player Prince Fielder and in return got the 27th pick in the draft.
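
The arithmetic behind that Ronaldo example can be sketched in a few lines. This is only a rough illustration using the approximate figures quoted above; the function names are my own, and wages, agent fees and inflation are all ignored.

```python
def transfer_profit(purchase_fee: float, sale_fee: float) -> float:
    """Gross profit on a player: sale fee minus purchase fee.

    Assumption: wages, agent fees and inflation are ignored.
    """
    return sale_fee - purchase_fee


def return_multiple(purchase_fee: float, sale_fee: float) -> float:
    """How many times over the purchase fee was recovered by the sale."""
    return sale_fee / purchase_fee


# Approximate figures quoted in this post, in US dollars.
bought_for = 40_000_000
sold_for = 200_000_000

profit = transfer_profit(bought_for, sold_for)
multiple = return_multiple(bought_for, sold_for)

print(f"Gross profit: ${profit:,.0f}")
print(f"Return multiple: {multiple:.1f}x")
```

On those rounded numbers the sale brought back about five times what was paid for the player.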

So now, understanding the importance of Soccer Analytics, we then need to ask ourselves what we can do with it. I think what we need to understand is the root of Bill James’s work: through his observation of the game he was confused about why certain things were happening and started to write them down as questions, and originally his books (90% of which I have read, starting from the age of 11) tried to answer these questions.

So let’s start asking questions about soccer.

First up, I would suggest ignoring all in-game analytics; I think the smartest thing Billy Beane has chosen to do is not to watch the game. By comparison, if you owned a store you would not watch on a daily basis what people were buying. Instead, after a set period of time you would analyze the data and draw conclusions on how you are doing and what you need to improve.

To me the most important questions in Soccer Analytics are:

How to evaluate the transfer value/salary of a player?

What is the difference in quality between leagues (including as part of youth development)?

In the process of my work, though, I’ve come up with hundreds of questions in relation to Soccer Analytics. Examples include:

Are away wins more valuable than home wins?
At what age should a player be bought?
At what age should a player be sold?
Can a team develop a goal scoring style?
Can you sell a player you picked up on a free transfer?
Do corners lead to more goals on average than crosses?
Do fouls lead to yellow and red cards?
Does a club perform better after a close loss or a large loss? Is there a difference?
Does a team’s discipline hurt its chances for success?
Does ball possession lead to success?
Does having a defender who can score benefit a team?
Does having a defender who creates assists benefit a team?
Does loaning a player out benefit his career?
Do shots missing the net lead to goals?
Is the stadium a factor in how an individual game is played?
How beneficial is promotion to the development of a club?
How big is the jump from division to division?
How difficult is relegation to the development of a club?
How many days between games does a team need to properly recover?
How much consistency is there in a player’s career?
How much of a detriment is a red card?
How much does the size of a goalkeeper matter?
How much influence does a manager have on a player?
How much influence does a manager have on how many goals are allowed?
How much influence does a manager have on how many goals are scored?
How much of a role do injuries play in a club’s failure?
How much of a role do a referee’s tendencies have in a game?
How much value is a draw over a loss in a playoff battle?
How much value is a draw over a loss in a relegation battle?
How to properly evaluate talent?
How valuable is a defensive midfielder?
How valuable is an assist in soccer?
How valuable is save percentage in soccer?
Is a specialist (long throw, free kick) on the pitch worth it?
Is it important that your central defenders are taller than other players on the pitch?
Is it better to have one primary goal scorer or scoring shared amongst the team?
Is it easier to score with your head or outside the box?
Is it more beneficial to have a good offense or defense?
Is it worth taking a direct free kick on net?
Is playing the offside trap worth the risk?
Should a club change its formation after receiving a red card?
Should you never sell a player?
Should you pay a transfer fee for a player?
Should you recruit a player for a particular role or just for talent?
What characteristics does a top class defender have?
What constitutes a successful season?
What is the best format for a fantasy soccer game?
What is the home field advantage in soccer?
What style of play raises a player’s transfer value the most?
When does a player reach his prime?
Which plays a bigger role, the system or the player’s ability?
Which position should you put the most resources into?
Who should take corners/free kicks?
Who should take penalties?

I do feel we have the material and the process to answer all of these questions in an analytical way, and that doing so would provide interesting reading but, more importantly, a benefit to a club and an influence on how the game is played.

My emphasis is on my two core questions, although I feel I am incorporating a lot of the other questions in the process, and if I find answers I will post my results. That being said, the work takes a huge amount of time and effort, and at this point, other than the reward of accomplishment and smugness, the return is underwhelming because, as in Bill James’s story, the work is primarily dismissed by everybody who has any connection to the game.

Will this change? I assume it will, but to be honest with you I’m very skeptical, because the sports industry is a very egocentric and impractical business where facts and stats get in the way of a person thinking they know better. On the positive side, unlike in many fields, in sports you can be judged by the success of your actions, so hopefully someone will be given the opportunity to change the game for the better.

