It’s been a long wait, but the fixture list is now out (hurray). This means we can start planning for the upcoming fantasy football season by seeing when teams have an easy or hard run of fixtures.
Last season, I split teams into categories (e.g. expected top 4 finish) and estimated expected scores from how these types of teams had scored against each other historically. One flaw with this method is that it doesn’t take into account style of play, so Man U v Burnley is assumed to have the same score as Man City v Burnley.
This season I’m going to use a rating system for each team to determine expected goals for a match and hence when a team is likely to have a run of more high scoring games etc. The rating system was inspired by Professor David Spiegelhalter @d_spiegel and details are given at the end of the article.
Fixtures and initial ratings
The link to the spreadsheet is:
The first page “Ratings” has the current attack and defensive ratings per team for home and away. These update during the season and are in columns D:G. The efficiency ratings are in columns AB:AE and stay constant for the season.
The expected number of goals in each match is given in columns W:X.
For each team I have rated their matches by difficulty for scoring goals (see “Attack” page) and conceding goals (see “Defence” page). From row 29 downwards, each team’s expected goals scored/conceded are shown by game week. The table starting in row 6 shows the opponent, and the colour indicates difficulty: red (hardest), light red (hard), yellow (average), light green (easy) and green (easiest).
In cell C3 we can choose how difficulty is measured. Selecting Team splits the five categories relative to all of that team’s own matches during the season. Selecting League measures difficulty relative to all teams in the league.
I haven’t yet looked to see which teams have an easy/hard run of fixtures. I do plan on incorporating this into the forecasting model that I used last season.
Fixtures and pairs
In addition to the rating system, I have produced a fixtures and pairs spreadsheet linked here:
For the tables page, you can type in the scores during the season and the league table will update accordingly. Please note scores are entered in columns E and F.
In the page called pairs, you can select a team in cells C3 and E3 and it’ll give you a list of their fixtures that are colour coded. You can decide the colours a team has in columns P to T (note that if a team appears twice the code won’t work).
Finally, I have listed which teams are paired in columns M and N. I have separated Arsenal, Spurs, Cardiff and Watford as they don’t pair in one week due to Spurs’ new stadium. By pairs, I mean that when one team is at home the other team will be playing away. This is useful to know if you wish to rotate between two cheap defenders, keepers, etc.
The rating system
Most thoughts aren’t original and my rating system is no exception. It is based on an article written by Professor David Spiegelhalter who also did predictions on the BBC site a few years ago (http://news.bbc.co.uk/1/hi/programmes/more_or_less/8062277.stm).
Instead of using a system based on goals, I’ve created a rating for each team based on xG. A home attack rating is the average xG a team scores at home divided by the average xG all teams score at home. This is based on previous seasons’ data from https://understat.com/ (newly promoted teams’ ratings are based on the historic performance of promoted teams). I’ve taken a weighted average over the last two seasons, so begin this season with:
The average number of goals for a home team is 1.5 and 1.1 for the away team, so to estimate a home score we have:
Home score = average goals per home team x home team attack rating x away team defence rating.
Away score = average goals per away team x away team attack rating x home team defence rating.
At the beginning of the 17/18 season Arsenal had a home attack rating of 1.28 and a home defence rating of 0.93, while Leicester had an away attack rating of 0.90 and an away defence rating of 1.14. So when Arsenal played Leicester on the opening day of that season, I’d estimate the xG as:
Arsenal xG = 1.5 x 1.28 x 1.14 = 2.19
Leicester xG = 1.1 x 0.90 x 0.93 = 0.92
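The same calculation can be written as a few lines of Python. This is only a sketch of the two formulas above: the spreadsheet does this in cells, and the function and variable names here are made up for illustration.

```python
# Sketch of the expected-goals formulas; names are illustrative,
# not taken from the spreadsheet.
AVG_HOME_GOALS = 1.5  # average goals for a home team (from the article)
AVG_AWAY_GOALS = 1.1  # average goals for an away team (from the article)


def expected_goals(home_attack, home_defence, away_attack, away_defence):
    """Return (home xG, away xG) for one match from the four ratings."""
    home_xg = AVG_HOME_GOALS * home_attack * away_defence
    away_xg = AVG_AWAY_GOALS * away_attack * home_defence
    return home_xg, away_xg


# Arsenal (home) v Leicester (away), start-of-17/18 ratings from the text.
arsenal_xg, leicester_xg = expected_goals(
    home_attack=1.28, home_defence=0.93, away_attack=0.90, away_defence=1.14
)
print(f"Arsenal xG = {arsenal_xg:.2f}")      # 2.19
print(f"Leicester xG = {leicester_xg:.2f}")  # 0.92
```

Note that each side’s xG uses its own attack rating and the opponent’s defence rating, so a defence rating above 1 means a team concedes more than average.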
The xG for the match was 2.54 – 1.47 (the match score was 4-3). The estimate for xG seems reasonable, though a bit on the low side. This could be due to random variation, game flow, or our original ratings being incorrect. Given the xG in the match, a 4-3 scoreline has a low probability of occurring (this will be discussed later).
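To get a feel for how unlikely a 4-3 is, one common approach is to treat each team’s goals as an independent Poisson count with mean equal to its xG. That is an assumption made here for illustration, not something this article has set out, and the function names are hypothetical.

```python
import math


def poisson_pmf(goals, mean):
    """Probability of exactly `goals` under a Poisson with the given mean."""
    return math.exp(-mean) * mean**goals / math.factorial(goals)


def scoreline_probability(home_goals, away_goals, home_xg, away_xg):
    """Probability of an exact scoreline, ASSUMING the two teams' goals
    are independent Poisson counts (an illustrative assumption)."""
    return poisson_pmf(home_goals, home_xg) * poisson_pmf(away_goals, away_xg)


# Match xG from the text: Arsenal 2.54, Leicester 1.47; final score 4-3.
p = scoreline_probability(4, 3, home_xg=2.54, away_xg=1.47)
print(f"P(4-3) = {p:.3f}")
```

Under that assumption the exact scoreline comes out as a small probability, which is in line with the point above.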
Teams’ ratings should be adjusted throughout the season, either because of form or because they were incorrectly estimated initially. For this I look at the ratio of the actual xG to the forecast xG. Because the forecast multiplies an attack rating by a defence rating, the square root of this ratio is the uplift each of the two ratings would need to match the actual xG. I found that adjusting the ratings by only 10% of this uplift gives better overall estimates.
So in the above example, Leicester were forecast to have 0.92 xG whereas they actually had 1.47 xG, giving a ratio of 1.6 (1.47/0.92). The square root of 1.6 is 1.27, meaning the ratings would need to be 27% higher than originally estimated to match this game. So we adjust the ratings upwards by 2.7% (10% of 27%): Leicester’s new away attack rating is 0.90 x 1.027 = 0.92 and Arsenal’s new home defence rating is 0.93 x 1.027 = 0.96.
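The update step can be sketched in Python as follows. The 10% figure and the square root come from the text; the function name and the `rate` parameter are illustrative only.

```python
import math

UPDATE_RATE = 0.10  # the article applies 10% of the uplift


def update_factor(actual_xg, forecast_xg, rate=UPDATE_RATE):
    """Multiplier applied to both the attack rating of the scoring team
    and the defence rating of the conceding team after a match."""
    ratio = actual_xg / forecast_xg   # e.g. 1.47 / 0.92, about 1.6
    uplift = math.sqrt(ratio) - 1.0   # about 0.27 in the Leicester example
    return 1.0 + rate * uplift        # only a fraction of the uplift is applied


# Leicester's away attack and Arsenal's home defence after the 4-3 game.
factor = update_factor(actual_xg=1.47, forecast_xg=0.92)
leicester_away_attack = 0.90 * factor
arsenal_home_defence = 0.93 * factor
print(f"factor = {factor:.4f}")
print(f"Leicester away attack = {leicester_away_attack:.3f}")
print(f"Arsenal home defence = {arsenal_home_defence:.3f}")
```

Because the code doesn’t round at each intermediate step, the factor is about 1.026 rather than 1.027, so the last digit of the new ratings can differ slightly from the hand calculation.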
I have always thought that xG doesn’t give the full picture and some teams are more efficient in converting chances. Not because the players are better at shooting (which they may be) but because they create better opportunities. This is because data collected for xG doesn’t account for many variables like speed of the ball in a cross, its height, pressure by defenders, defenders being able to block the shot etc. There is more random variability in this factor so I have taken an average over the last four seasons to give:
During the 17/18 season, I collected expected goals from a spread betting company and measured the average squared error per prediction to be 1.27. The method above was slightly worse for the same games, but also came to 1.27 (rounded to two dp). This indicates the method isn’t too different from the bookie’s, which suggests it gives a reasonable estimate.
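For anyone wanting to run the same comparison, the average squared error is simple to compute. The sketch below assumes the error is measured between predicted goals and actual goals for each team in each match; the numbers are made up purely to show the calculation and are not the data used above.

```python
def mean_squared_error(predicted, actual):
    """Average of (prediction - outcome)^2 over all predictions."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical example values, one entry per team per match.
predicted_goals = [2.19, 0.92, 1.40]
actual_goals = [4, 3, 1]

mse = mean_squared_error(predicted_goals, actual_goals)
print(f"Average squared error = {mse:.2f}")
```

Running it on both the bookie’s numbers and the rating system’s numbers for the same games gives two figures that can be compared directly; lower is better.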