Henshaw Analysis player ratings — methodology, discussion & examples
Recruitment in football is difficult. There are hundreds of matches every week. How do scouts know which games to go to? Well, that all depends on the club they’re working for. Some will decide where they want to go because they fancy it, whereas scouts working at smarter clubs will be directed and informed by recruitment analysts. That’s where I come in.
Player ratings were something I started to develop during my previous role at a football club during the summer of 2021. Ratings weren’t new to me, but they were something I’d never really developed before myself. I quickly realised the advantage and need for ratings within a club which has a relatively small scouting department. After all, being able to decide where your scouts are most likely to watch a player suitable for your first team is invaluable.
Anyway, after I left that role I put the ratings to one side and focused on learning Python properly. But after I watched Andy Watson’s video discussing his Role Scouting System, and read his blog post about it, I was inspired to pick up where I had left off. Andy described how he looks at positions and breaks them down into different technical aspects using various metrics, which I thought was a great idea. So after some discussions with him I set about using a similar basis to improve my ratings system. If you don’t already, please follow Andy on Twitter. He’s a great person who puts out some great content suitable for every football fan.
I’d also like to say thank you to Jay Socik and George Syrianos for some early ideas and help around weightings and mathematical distributions. They’re both great, intelligent people, who are willing to help and share their knowledge.
Methodology
The first thing I did was isolate the positions I wanted to look at. These are fullback / wingback, centre back, centre midfielder, attacking midfielder / winger and forward.
Similar to Andy Watson’s work, I wanted to break those down further for some of the positions, to find different types of players who can play in specific roles. Centre midfield is broken down into defensive midfield and rounded midfielder (more of a number “8” type). Attacking midfielder / winger is broken down into ball carrying and creative players, while forwards are broken down into target man and complete forward.
I’m not saying the above is perfect, or the correct way to break the positions down, but for me it works and it allows me to identify the different types of player I want to see. If I was working for a club, these positions and roles would be specific to what the club wants to see. In fact, you’d probably want to break down positions further still. For example, different types of centre backs for ball carrying, ball playing and aerial dominance.
Example — Centre Backs
As mentioned, you could break this position down further, but I’ve decided to encompass the key aspects of a modern day centre back into these ratings. Here’s what I’m looking for in this position:
- Defensive ability
- Aerial ability
- Positioning
- Ball carrying
- Ball playing
- Attacking threat
Each of these attributes is broken down into different metrics which are available on Wyscout. These can be seen in the image below.
All the metrics above are exported and loaded into Python. Here I use a z-score function to get a score for each player on each metric. A z-score is a standardised score which describes the position of a raw value in terms of its distance from the mean, measured in standard deviation units. Essentially, it standardises the values for each metric using that metric’s mean and standard deviation. Unlike percentiles, this lets us better reflect the size of the difference between players.
For example, let’s say Rob Dickie completes 6 progressive runs per game, and he’s the best in the league at that metric. He would be in the 100th or 99th percentile. But if the next best centre back is only completing 2 progressive runs per game, it’s not really fair for them to be in the 98th percentile, because the percentile hides how far behind they are. Z-scores better illustrate the difference between players.
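To make the percentile problem concrete, here is a minimal Python sketch using only the standard library. The 6 and 2 progressive runs per game come from the example above; the other values and the player labels are made up for illustration, and using the population standard deviation is an assumption rather than a description of my actual code.

```python
from statistics import mean, pstdev

# Progressive runs per game. The 6.0 and 2.0 mirror the example in the text;
# the remaining values and all player labels are invented for illustration.
runs_per_game = {"CB 1": 6.0, "CB 2": 2.0, "CB 3": 1.8, "CB 4": 1.5, "CB 5": 1.2}


def percentile_ranks(values):
    """Share of players (as a percentage) at or below each player's value."""
    n = len(values)
    return {
        name: 100 * sum(other <= value for other in values.values()) / n
        for name, value in values.items()
    }


def z_scores(values):
    """Distance of each value from the mean, in standard deviation units."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation (an assumption)
    return {name: (value - mu) / sigma for name, value in values.items()}


percentiles = percentile_ranks(runs_per_game)
z = z_scores(runs_per_game)

for name in runs_per_game:
    print(f"{name}: percentile {percentiles[name]:.0f}, z-score {z[name]:+.2f}")
```

The percentile ranks drop in even steps however big the gap in the raw numbers is, whereas the z-scores keep the outlier well clear of everyone else.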
Z-scores are on a scale, with a score of 0 being the mean (average), a positive value being above average and a negative value being below average. The score will tell you how many standard deviations above or below the mean you are.
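As a rough sketch of how this could look across several metrics at once, the snippet below standardises each metric column separately. The metric names and values are hypothetical (they are not the exact Wyscout export columns), and the choice of population standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical values for four centre backs; names and numbers are illustrative.
players = {
    "CB 1": {"aerial_duels_won_pct": 68.0, "progressive_runs": 1.9, "interceptions": 5.1},
    "CB 2": {"aerial_duels_won_pct": 55.0, "progressive_runs": 0.8, "interceptions": 6.3},
    "CB 3": {"aerial_duels_won_pct": 61.0, "progressive_runs": 1.2, "interceptions": 4.4},
    "CB 4": {"aerial_duels_won_pct": 72.0, "progressive_runs": 0.5, "interceptions": 5.8},
}


def standardise(table):
    """Return a z-score for every player and metric, computed metric by metric."""
    metrics = list(next(iter(table.values())))
    result = {name: {} for name in table}
    for metric in metrics:
        column = [row[metric] for row in table.values()]
        mu = mean(column)
        sigma = pstdev(column)  # population standard deviation (an assumption)
        for name, row in table.items():
            result[name][metric] = (row[metric] - mu) / sigma
    return result


z_table = standardise(players)

for name, scores in z_table.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

Each metric ends up centred on 0, so a positive number always means above the average of the players in the data set, whatever the original unit was.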
I take these values and scale them to 100; this allows the numbers to be more informative and easier to understand. Anything with a score above 50 would be considered above average.
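One possible way to do this rescaling is sketched below. It is an assumption for illustration, not necessarily the exact formula behind my ratings: z-scores are capped at three standard deviations either side of the mean and mapped linearly so that 0 becomes 50. The function name `scale_to_100` and the `z_cap` parameter are hypothetical.

```python
def scale_to_100(z, z_cap=3.0):
    """Map a z-score onto a 0 to 100 scale where 50 is the average.

    Assumed approach: clip the z-score to [-z_cap, +z_cap], then rescale
    linearly so -z_cap -> 0, 0 -> 50 and +z_cap -> 100.
    """
    clipped = max(-z_cap, min(z_cap, z))
    return 50.0 + 50.0 * clipped / z_cap


for z in (-5.0, -1.5, 0.0, 1.5, 3.0):
    print(f"z = {z:+.1f} -> score {scale_to_100(z):.1f}")
```

A linear mapping like this keeps the relative distances between players, which is the reason for using z-scores in the first place.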
So, we’ve got all the scores for each player and metric. Now we need to weight the metrics so they create the overall attribute scores. Below you can see how I have weighted each metric that makes up an attribute score for a centre back. Just to reiterate, these weightings are very much what I deem appropriate and what is important to me. They might look different to someone else.
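The weighting step can be sketched as a weighted average of the scaled metric scores. The metric names, scores and weights below are hypothetical and only show the mechanics for a single attribute (ball carrying). Dividing by the total weight is an assumption that keeps the result on the same 0 to 100 scale even if the weights don’t sum to exactly 1.

```python
def weighted_score(scores, weights):
    """Weighted average of scores, using only the metrics named in weights."""
    total_weight = sum(weights.values())
    return sum(scores[metric] * weight for metric, weight in weights.items()) / total_weight


# Hypothetical scaled scores (0 to 100) for one centre back.
metric_scores = {"progressive_runs": 70.0, "dribbles": 60.0, "accelerations": 50.0}

# Hypothetical weights for the ball carrying attribute.
ball_carrying_weights = {"progressive_runs": 0.5, "dribbles": 0.3, "accelerations": 0.2}

ball_carrying = weighted_score(metric_scores, ball_carrying_weights)
print(f"Ball carrying: {ball_carrying:.1f}")
```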
One metric which is included across all positions for player ratings is minutes played. Availability is possibly one of the most overlooked metrics in football. It seems harsh to rate a player with 800 minutes better than a player with 2,500 minutes when their numbers are similar. Being first choice isn’t the be all and end all, but it should be something which is considered.
Here’s how we’re looking with six different values which are derived from the attributes we identified at the start of the process, along with the minutes played value.
We now have to do some weighting again. Which of the attributes is more important to a centre back? Again, this is down to your interpretation of the game and what you want from your defenders. Personally, I want it to be tilted towards those more progressive defenders who can carry and pass the ball well. You can see my weightings below:
Others may think the defensive and aerial ability is more important. You could quite easily run these figures with different weightings and have them saved as different templates.
The values that come from the weightings above are then added together to create the overall centre back rating. Here’s an example of the top 15 rated centre backs in League One so far this season (2021–22).
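Here is a minimal sketch of that final step, including the idea of saving different weightings as templates. Everything in it is hypothetical: the template names, the weights (chosen to sum to 1), the players and their attribute scores are invented to show the mechanics, not taken from my actual ratings.

```python
# Hypothetical attribute weightings. Each template sums to 1.
TEMPLATES = {
    "progressive": {
        "defensive": 0.15, "aerial": 0.10, "positioning": 0.15,
        "ball_carrying": 0.20, "ball_playing": 0.20,
        "attacking_threat": 0.10, "minutes": 0.10,
    },
    "traditional": {
        "defensive": 0.25, "aerial": 0.25, "positioning": 0.20,
        "ball_carrying": 0.05, "ball_playing": 0.10,
        "attacking_threat": 0.05, "minutes": 0.10,
    },
}

# Hypothetical attribute scores on the 0 to 100 scale.
attribute_scores = {
    "CB A": {"defensive": 55, "aerial": 50, "positioning": 52, "ball_carrying": 70,
             "ball_playing": 68, "attacking_threat": 55, "minutes": 60},
    "CB B": {"defensive": 66, "aerial": 72, "positioning": 60, "ball_carrying": 42,
             "ball_playing": 48, "attacking_threat": 50, "minutes": 65},
}


def overall_rating(attributes, weights):
    """Add together each attribute score multiplied by its weight."""
    return sum(attributes[name] * weight for name, weight in weights.items())


def rank_players(table, weights, top_n=15):
    """Return the top_n (player, rating) pairs, highest rating first."""
    ratings = {player: overall_rating(attrs, weights) for player, attrs in table.items()}
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)[:top_n]


for template_name, weights in TEMPLATES.items():
    print(template_name, [(p, round(r, 2)) for p, r in rank_players(attribute_scores, weights)])
```

Swapping the template changes who comes out on top, which is why the weightings need to reflect what the club actually wants from the position.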
It’s worth remembering a rating of 50 is considered average, so for Accrington’s Ross Sykes to have a rating of 61.40 is impressive. Notice there are only a couple of points between him and Bolton’s Ricardo Santos, which is again very impressive.
For context, I have Dominic Hyam as the best rated centre back in the EFL with a rating of 64.41, while Sykes ranks 4th when compared to the whole Football League, which is very good for a 22-year-old. I have no doubt we’ll be seeing him in the Championship soon. You can see why he has flagged up: looking at his pizza chart below, he’s very well rounded. He offers an attacking threat and progressive ability through both passes and runs, along with some solid defensive numbers.
Limitations
I’m very aware that these ratings are not perfect, but I do feel they’re a good starting point. So I thought I would acknowledge some of the limitations of the current system. Going forward I’ll be looking to address these limitations and will hopefully get some good feedback from this blog post to help.
Data quality
This is a topic I’ve spoken about before, but Wyscout doesn’t have the best data quality. Some of its values, especially the accuracy/success percentages, can be inaccurate at times. It would be best to use data from Statsbomb or OPTA, for example. On the whole Wyscout data is okay; you just need to sense check some of the metrics now and again.
To be fair to Wyscout, they cover a vast number of leagues, and have a lot of metrics to play with. Also, as I’m mainly interested in the EFL, there aren’t any other places to get this amount of data. So Wyscout it is.
Restricted to positions
Positions in the modern game are fluid. We often see players in different roles throughout the game regardless of what their position was at the start of a game. An example of this is those two forward players behind the striker in a 3–4–2–1. Are they forwards? Are they attacking midfielders? It all depends on their role within the team. This is where I could expand to create more ratings for different roles within positions.
Furthermore, I’m at the mercy of Wyscout as to what position they list a player as. If it has a player’s minutes split across attacking midfield and forward, their main position will be the one in which they have played the largest share of their minutes. An example of this would be Chris Willock. Wyscout has 42% of his minutes as a forward, but also 32% of his minutes as an attacking midfielder. I’d argue he’s probably more of an attacking midfielder than a forward. As a result, some of these players with split minutes might not appear in the correct data set when exporting data from Wyscout.
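The position rule can be sketched as picking the role with the largest share of minutes. The snippet below uses the two Willock shares quoted above (the rest of his minutes aren’t listed here, so they are left out). The `is_split_role` check and its 15 point threshold are a hypothetical addition of mine for flagging players worth a manual look, not something Wyscout provides.

```python
def main_position(minute_shares):
    """Return the position with the largest share of minutes."""
    return max(minute_shares, key=minute_shares.get)


def is_split_role(minute_shares, max_gap=15.0):
    """Flag players whose top two positions are within max_gap percentage points.

    Hypothetical helper: the threshold is arbitrary and only for illustration.
    """
    top_two = sorted(minute_shares.values(), reverse=True)[:2]
    return len(top_two) == 2 and (top_two[0] - top_two[1]) <= max_gap


# Shares of minutes (percent) quoted in the text for Chris Willock.
willock = {"forward": 42.0, "attacking_midfielder": 32.0}

print(main_position(willock))  # position he would be exported under
print(is_split_role(willock))  # whether he is worth checking in another data set
```

A flag like this would at least tell me which players to look for in a second position’s export.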
Team strength / league ratings
Currently my ratings don’t take into account team or league strength. I feel the former is more of an issue though as I would normally assess the ratings league by league.
Adding team strength and an appropriate weighting would somewhat negate the natural bias towards those players who play for the stronger teams in the league.
Interpretation of what is relevant
What metrics I use and how I’ve weighted them are my interpretation of what I think is right. This might not be what you think is important and it might not be what the head of recruitment thinks is important.
I’ve not seen any work out there which tells us what metrics are more important for certain positions. For example, is progressive ability for a centre back worth more than their aerial ability? I don’t know. This leaves the weightings open to interpretation, and it would be interesting to see if there is any future work on this.
Despite this being somewhat of a limitation, it can also be viewed as a positive in that the ratings can quickly be tailored towards what a director of football or head of recruitment deems important for their team.
Future developments
I’m really happy with how the ratings have turned out. Using z-scores has helped make them feel more representative, as my old ratings were based around percentiles. That being said, there are always improvements which can be made. Here are a few ideas:
- Team strength to be added which will apply an appropriate weighting onto the overall rating.
- Breaking down positions further into roles to allow more granular ratings on different roles.
- Potential to add in more advanced metrics such as expected threat (xT) for appropriate roles.
- Automate visualisations from the ratings generated within Python, for both the overall rating and attribute ratings, to see where players stand out in a granular sense.
Final thoughts
I’m hoping this blog post will provide some insight and context around how my ratings are derived. This tool is mainly used to focus in on potential players of interest from a wider pool — which I feel it does pretty well. There is still room for improvement and I’m hoping to continue with that over the coming weeks and months.
I’d very much love to get some feedback, and open up some discussions on weightings, metrics used or anything to do with the ratings. You can chat to me on twitter (@henshawanalysis), or feel free to drop me an email using henshawanalytics@gmail.com.
If you’re wanting to take a look at some ratings for another position, I recently posted my EFL fullback / wingback ratings on twitter.
If you did enjoy this post, it would be great if you dropped me a follow on Medium and twitter.