This post walks through the process we adapted to create the Top 100 Soccer Clubs in the United States and Canada and the Elo rating that drives the ranking. The rating can also be used to predict the outcome of games; the formula for doing so is laid out below.
The Top 100 Soccer Clubs in the United States and Canada uses a modified Elo rating system to rank the teams and also provides a score to compute the probability of outcomes should any two rated teams face each other. This rating includes the results of over 4,600 competitive games going back to the 2011 season.
The Elo rating system, developed by Hungarian-American physics professor Dr. Árpád Élő, is used by FIDE, the international chess federation, to rate chess players. In 1997 Bob Runyan adapted the Elo rating system to international football and posted the results on the Internet. Given how the soccer intelligentsia romanticizes a connection between soccer and chess, the use of this formula makes sense.
Footballdatabase.com uses Bob Runyan's methodology to rank club teams around the world. FIFA uses a modified version of Bob Runyan's adaptation to rank Women's International Soccer teams.
The Top 100 Soccer Clubs in the United States and Canada uses a modified version of Bob Runyan's methodology that accounts for unique factors of the league structures and tournaments in the United States and Canada.
The Elo system was adapted for soccer by adding a weighting for the kind of match, an adjustment for the home team advantage, and an adjustment for goal difference in the match result.
After each game a team's score is adjusted based on the following factors:
- The team's old rating
- The importance or weight of the match
- The goal difference of the match
- The result of the match including home field advantage
- The expected result of the match
Strengths of the Rating System
The advantage of the Elo system is that it is a thorough assessment of a team's historical results. The system uses the results of every match, including the goal difference, and weights the outcome based on the importance and the expected outcome.
Weaknesses of the Rating System
The importance of each match is pre-determined by the rating designer. Therefore the weighting of each match has a subjective element embedded in the score.
In addition, each team must start with a score. That score is pre-determined by the rating designer based on the perceived strength of the team's schedule: teams with a stronger schedule start with a higher score, and teams with a weaker schedule start with a lower one. The difference between these starting scores is an important but partially subjective choice by the designer.
The goal difference factor as determined in the original adaptation was very strong. A team that wins a game by two goals will get 50% more points for that outcome than if they had won by 1 goal. A three goal difference gets 75% more than 1 goal difference.
Changes to the Elo System to Address Weaknesses
To minimize the subjectivity, the United States and Canada ratings use actual match results to determine the starting ratings and match weights. Clubs in the United States and Canada primarily play in a four-tier system made up of MLS, NASL, USL PRO and PDL. Results from U.S. Open Cup matches starting in 2010, which include teams from all four tiers, were used to determine the weighting of matches played in each league as well as the starting score for each team. The analysis looked at the goal difference of interleague games in the U.S. Open Cup, adjusted for home advantage. The results are listed below.
Future U.S. Open Cup matches will influence the scores going forward. If a league performs better in the future, the weighting of its games and the starting score for its new teams will be adjusted accordingly.
The goal difference factor was cut to roughly half of the impact through changes to the formula. FIFA performed a similar adjustment when using the Elo system to rank International Women's Soccer teams.
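As a rough check on the "roughly half" claim, the sketch below compares the bonus over a one-goal win under the original adaptation (the 50% and 75% figures quoted above) with the bonus under the modified index G defined later in this post (1.25 for a two-goal win, (11+N)/10 for three or more). The variable and function names are illustrative only and are not part of the rating system.

```python
# Goal difference multipliers relative to a one-goal win (multiplier 1.0).
# "original" holds the 50% and 75% bonuses quoted in this post;
# "modified" holds the values from the index G defined later in the post.
original = {2: 1.50, 3: 1.75}
modified = {2: 1.25, 3: (11 + 3) / 10}


def bonus_ratio(goal_diff: int) -> float:
    """Modified bonus as a fraction of the original bonus for a given margin."""
    return (modified[goal_diff] - 1.0) / (original[goal_diff] - 1.0)


for margin in (2, 3):
    print(f"{margin}-goal win: bonus is {bonus_ratio(margin):.0%} of the original")
```

For both margins the modified bonus works out to about half of the original one.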
The different weights of the various competitions are as follows:
| Game | Weight (K) |
| --- | --- |
| MLS Cup Final | 50 |
| MLS Cup Semifinal | 50 |
| US Open Cup Final | 40 |
| US Open Cup Semifinal | 40 |
| Canadian Championship Final | 40 |
| MLS Playoffs | 40 |
| Canadian Championship | 30 |
| US Open Cup | 30 |
| MLS | 30 |
| NASL Playoffs | 30 |
| USL PRO Playoffs | 25 |
| NASL | 21 |
| US Open Cup Preliminary | 20 |
| USL PRO | 15 |
| PDL Playoffs | 15 |
| PDL | 5 |
The different starting scores of the various leagues are as follows:
| League | Starting Rating |
| --- | --- |
| MLS | 1400 |
| NASL | 1200 |
| USL PRO | 1100 |
| MLS Reserve | 1100 |
| Other | 1000 |
| PDL | 1000 |
This is the value for R_o (below) for the team's first match in the database.
The Basic Calculation
The Elo system uses a single formula that takes into account the factors mentioned above:
R_n = R_o + K*G (W - W_e)
Where:
R_n = The new team rating
R_o = The old team rating
K = Weight index regarding the tournament of the match
G = A number from the index of goal differences
W = The result of the match
W_e = The expected result
Goal Differential = G
The margin of victory is taken into account through a goal difference index, G. G is 1 for a draw or a one-goal win, is increased by 25% for a two-goal win, and for a win by three or more goals is calculated from the goal difference as shown below:
If the game is a draw or is won by one goal
G = 1
If the game is won by two goals
G = 1.25
If the game is won by three or more goals
G = (11+N)/10
Where N is the goal difference
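The three cases above can be sketched as a small function. The name `goal_difference_index` is an illustrative choice, and taking the absolute value (so the same G applies to winner and loser) is an assumption based on G being a property of the match result.

```python
def goal_difference_index(goal_diff: int) -> float:
    """Return G for a match, given the goal difference N (sign is ignored)."""
    n = abs(goal_diff)
    if n <= 1:
        # Draw or one-goal win
        return 1.0
    if n == 2:
        # Two-goal win
        return 1.25
    # Win by three or more goals
    return (11 + n) / 10


for n in range(6):
    print(n, goal_difference_index(n))
```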
Result of the Match = W
W is the result of the game (1 for a win, 0.5 for a draw, and 0 for a loss).
Expected Result of Match = W_e
W_e is the expected result (win expectancy with a draw counting as 0.5) from the following formula:
W_e = 1 / (10^(-dr/400) + 1)
Where dr is the team's rating minus its opponent's rating, with 100 points added to the rating of the team playing at home. A dr of 0 gives 0.5; a dr of 120 gives 0.666 to the higher rated team and 0.334 to the lower; and a dr of 800 gives 0.99 to the higher rated team and 0.01 to the lower.
This formula is calculated for each team in each game and the resulting score is carried forward to set the expectation for each team's next match.
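Putting the pieces together, a single match update might be sketched as follows. This is an illustration under stated assumptions, not the code behind the published ratings: the function names, the `HOME_ADVANTAGE` constant, and the sample match are hypothetical, and neutral-venue games are not handled.

```python
HOME_ADVANTAGE = 100  # points added to the home team's rating when computing dr


def expected_result(dr: float) -> float:
    """W_e = 1 / (10^(-dr/400) + 1)."""
    return 1.0 / (10 ** (-dr / 400) + 1.0)


def goal_difference_index(goal_diff: int) -> float:
    """G: 1 for a draw or one-goal win, 1.25 for two goals, (11+N)/10 beyond."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.25
    return (11 + n) / 10


def match_result(goals_for: int, goals_against: int) -> float:
    """W: 1 for a win, 0.5 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 1.0
    if goals_for == goals_against:
        return 0.5
    return 0.0


def update_ratings(home_rating, away_rating, home_goals, away_goals, k):
    """Apply R_n = R_o + K*G*(W - W_e) to both teams; return (home, away)."""
    g = goal_difference_index(home_goals - away_goals)
    dr_home = (home_rating + HOME_ADVANTAGE) - away_rating
    new_home = home_rating + k * g * (
        match_result(home_goals, away_goals) - expected_result(dr_home)
    )
    new_away = away_rating + k * g * (
        match_result(away_goals, home_goals) - expected_result(-dr_home)
    )
    return new_home, new_away


# Spot-check the expected-result values quoted above for dr = 0, 120, 800.
for dr in (0, 120, 800):
    print(dr, round(expected_result(dr), 3))

# Hypothetical MLS regular-season match (K = 30): both teams rated 1400,
# home side wins 2-0.
home, away = update_ratings(1400, 1400, 2, 0, k=30)
print(round(home, 1), round(away, 1))
```

In this hypothetical match the home side gains about 13.5 points and the away side loses the same amount, because the two teams' values of W and of W_e each sum to 1.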
Calculating the Probability of Game Outcomes
The expected result formula can be used to predict the outcome of games between two scored teams.
Take an average USL PRO team with 1100 points at home against an average MLS team with 1400 points. Add 100 points to the USL PRO team's rating for home field advantage, giving 1200.
To calculate the USL PRO team's odds of winning (with a draw being worth 0.5 points), use the formula above:
W_e = 1 / (10^(-dr/400) + 1)
The dr for the USL PRO team is 1200 - 1400 = -200. Plugging -200 into the equation yields 0.24, or 24%. Plugging in 200 yields 76%, the odds of winning for the MLS side.
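The worked example can be reproduced in a few lines. The helper name `expected_result` and its `home_advantage` parameter are illustrative; the 1100 and 1400 ratings are the league starting values used above.

```python
def expected_result(rating, opponent_rating, at_home, home_advantage=100):
    """Expected result W_e for a team, with the home bonus applied to the host."""
    dr = rating - opponent_rating
    # The home bonus raises dr for the host and lowers it for the visitor.
    dr += home_advantage if at_home else -home_advantage
    return 1.0 / (10 ** (-dr / 400) + 1.0)


usl_pro = expected_result(1100, 1400, at_home=True)
mls = expected_result(1400, 1100, at_home=False)
print(f"USL PRO at home: {usl_pro:.0%}, MLS away: {mls:.0%}")
```

The two figures always sum to 100%, since a draw is split evenly between the sides rather than counted as a separate outcome.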