Back2Warcraft's Elo Rating

Back2Warcraft's Elo Rating is a rating list of professional Warcraft 3 players curated by Back2Warcraft. The ratings are calculated based solely on map scores in high-level tournaments. They do not take the eventual tournament placement into account, nor any other factors such as prize money or ladder standings.
The ratings are calculated monthly and released at the beginning of the month. They are based on more than 2500 high-level games since December 2015.
Rules
The ratings use the following standards:
 The first games entered into the system are from WCA 2015, where all players were given an initial rating of 2000
 The rating period is one month
 The ratings prior to November 2016 are considered part of the transient response and not trustworthy
 All games must come from tournaments where at least half the players already have a rating (preferably) or a well established provisional rating
 The minimum number of games to get a rating is 12; before that, players have a provisional rating only
 The minimum number of games to be included in the rankings is 15
 There is no rating decay; the rating will not decrease if a player is inactive
 There is a decay on each player's number of games played, so inactive players are dropped from the rankings after a period without games
Calculations
The calculations are based on Arpad Elo’s rating system, first implemented in chess in 1960. Back2Warcraft’s Elo Rating follows the basic principles of this system, but is implemented using the slightly improved and more modern version currently in use by the USCF. A straightforward explanation of the mathematics can be found in a paper by Mark E. Glickman, a rating system expert, in Chapter 3 and beyond.
The main differences between this variant and the more traditional Elo system are the variable K-factor and the anti-inflation system described by the special rating formula; see the section below.
The Elo rating score is calculated in three basic steps: predict, compare and adjust.
Predict
The first thing the system does when considering a new result is to predict what the outcome should have been. So before looking at the actual result, the system calculates the likelihood of each player winning the game based on their rating. This likelihood is called the expected score of each player, where the score is simply defined by win = 1, loss = 0. Since only one player can win, the sum of the expected scores is 1.
So in a game between two players rated P1=2000 and P2=1900, the expected scores are 0.64 for P1 and 0.36 for P2. In other words, the higher rated player is expected to win 64% of the time if the ratings are accurate. Since P2 is expected to score 0.36 points each time these players compete, if they play each other five times P2’s total expected score is 0.36 x 5 = 1.8. Similarly, P1 has an expected score of 0.64 x 5 = 3.2.
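The article does not spell out the formula behind these numbers. As a minimal sketch, assuming the standard logistic Elo curve with a 400-point scale (an assumption, though it does reproduce the 0.64 and 0.36 quoted above), the prediction step could look like this. The function name expected_score is illustrative only.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Expected score (win = 1, loss = 0) for one game.

    Assumes the standard logistic Elo curve with a 400-point scale;
    the source does not state which curve Back2Warcraft uses.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


p1 = expected_score(2000, 1900)
p2 = expected_score(1900, 2000)

print(round(p1, 2), round(p2, 2))          # 0.64 0.36
print(round(5 * p1, 1), round(5 * p2, 1))  # 3.2 1.8
```

Because the two expected scores are computed from mirrored rating differences, they always sum to 1, matching the statement that only one player can win.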
Compare
The predicted score is then compared to the actual score. For each player the difference between these is called the delta. This value is a measure of how the player is performing compared to what could be expected given their current rating. A positive delta means the player is overperforming their rating; a negative delta means underperforming.
Continuing with the example above, given a 1-0 victory for P1, the delta is set to (Actual score – Expected score) = (1 – 0.64) = +0.36. For P2 the delta is (0 – 0.36) = -0.36. P1 did slightly better than predicted, P2 slightly worse. If we change the score and let the weaker player win, with a 1-0 victory for P2, the deltas become -0.64 for P1 and +0.64 for P2. So, not surprisingly, when an upset occurs, the deltas are larger.
Now, given a five game series between P1 and P2 that ends in a 3-2 victory for P1, the delta of P1 would be (3 – 3.2) = -0.2. The delta of P2 would be (2 – 1.8) = +0.2. The relatively small deltas indicate that the ratings of the players are accurate, and that little adjustment is necessary. A 3-2 score is right in line with what we expect when a player rated 2000 plays against a player rated 1900.
Taking the sum of all the deltas of a player over the rating period then gives one number indicating whether the rating should increase, decrease or stay as is.
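The summation can be sketched as follows. This is an illustration only: it assumes the standard logistic Elo expected score (not confirmed by the source), and the (opponent rating, wins, losses) record layout and the name period_delta are hypothetical.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    # Assumed standard logistic Elo curve with a 400-point scale.
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def period_delta(rating: float, results: list[tuple[float, int, int]]) -> float:
    """Sum of (actual score - expected score) over one rating period.

    results holds one (opponent_rating, wins, losses) tuple per series;
    this layout is hypothetical and chosen only for the example.
    """
    total = 0.0
    for opponent_rating, wins, losses in results:
        games = wins + losses
        total += wins - games * expected_score(rating, opponent_rating)
    return total


# The 3-2 series from the text: P1 (2000) beats P2 (1900).
print(round(period_delta(2000, [(1900, 3, 2)]), 1))  # -0.2
print(round(period_delta(1900, [(2000, 2, 3)]), 1))  # 0.2
```

A player with several series in the month simply passes more tuples; the sign of the returned total says whether the rating should go up or down.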
Adjust
The final step is then to create a new rating for the player, more accurate to how the player is actually performing. The new rating is given by the old rating plus the adjustment based on the delta described above, or mathematically: Rating = OldRating + K*delta, where K > 0.
If the player has overperformed, the delta is a positive number and the new rating will be higher than the old. If the player has underperformed, the delta will be negative and the rating will decrease. The conversion from a delta to an actual adjustment is done by multiplying the delta by the factor K, called the K-factor.
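As a small worked example of the adjustment formula, the sketch below applies Rating = OldRating + K*delta to the 3-2 series discussed earlier. The value K = 32 is purely hypothetical, picked for illustration; the system described here computes K per player and per rating period.

```python
def adjust(old_rating: float, k: float, delta: float) -> float:
    """Rating = OldRating + K * delta, with K > 0."""
    if k <= 0:
        raise ValueError("K must be positive")
    return old_rating + k * delta


K = 32  # hypothetical value, not taken from the source

print(round(adjust(2000, K, -0.2), 1))  # 1993.6
print(round(adjust(1900, K, +0.2), 1))  # 1906.4
```

With the same K on both sides the points lost by one player equal the points gained by the other; once K varies per player, as described in the next section, that symmetry no longer holds.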
The K-factor
One of the key characteristics of a rating system is how much weight to put on any one result. In an Elo system this weight is called the K-factor. A high K-factor puts a lot of weight on new results, while a low value does the opposite.
With a low K-factor, the system rewards strong performances sustained over time, and players climb or fall only slowly on the list. Such a list can feel too slow and will lag behind reality.
If the K-factor is high, the system is very quick to adjust. This means that the most recent tournament results are treated as the most important. This can be problematic, as the list becomes too volatile, with players rising and falling many places every rating period and the rankings becoming merely a copy of the latest tournament placements.
The solution is to let the K-factor vary. For each player, every rating period, the system calculates a unique K-factor based on the level of the player, the number of games played previously and the number of games in the current rating period. Thus, for established pros with many registered games the ratings put relatively little stock in a single result, but for a newcomer with almost no registered games the system quickly adjusts the rating based on fresh results.
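The article does not give the exact K-factor formula. As a rough sketch, the version below follows the form published for the USCF system, K = 800 / (N' + m), where N' is an "effective" number of prior games capped by a rating-dependent limit and m is the number of games in the current period. The constants are quoted from the USCF description on the assumption that they apply; whether Back2Warcraft uses these constants unchanged is not confirmed by the source, and the function names are illustrative.

```python
import math


def effective_games(rating: float, prior_games: int) -> float:
    """Effective number of prior games N' (USCF-style, assumed).

    The cap grows with the player's rating, so stronger players end up
    with a larger N' and therefore a smaller K.
    """
    if rating > 2355:
        cap = 50.0
    else:
        cap = 50.0 / math.sqrt(0.662 + 0.00000739 * (2569 - rating) ** 2)
    return min(float(prior_games), cap)


def k_factor(rating: float, prior_games: int, games_this_period: int) -> float:
    """K = 800 / (N' + m); assumed form, constants not confirmed by the source."""
    denominator = effective_games(rating, prior_games) + games_this_period
    if denominator == 0:
        raise ValueError("no games to base a K-factor on")
    return 800.0 / denominator


# An established pro versus a newcomer, both rated 2000 with 10 games this month.
print(round(k_factor(2000, 200, 10), 1))
print(round(k_factor(2000, 3, 10), 1))
```

Under these assumptions the newcomer's K comes out roughly three times the established player's, which matches the behaviour described above: few registered games means fast adjustment, many registered games means a single result moves the rating only a little.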
Rating inflation
Prediction
Rating difference  Bo1  Bo3  Bo5  Bo7
+400               82%  92%  96%  98%
+300               76%  86%  90%  94%
+200               68%  76%  81%  85%
+150               64%  71%  75%  78%
+100               60%  64%  67%  70%
+50                55%  57%  59%  60%
+25                52%  54%  55%  55%
0                  50%  50%  50%  50%
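The article does not say how the Bo3, Bo5 and Bo7 columns are derived. One plausible reading, offered here only as a sketch, is that each game is treated as an independent trial with the Bo1 win probability, so the series probability is the chance of reaching the required number of wins before the opponent does. The function name series_win_probability is illustrative.

```python
from math import comb


def series_win_probability(p: float, best_of: int) -> float:
    """Chance of winning a best-of-N series, given per-game win chance p.

    Assumes independent games with a constant p (an assumption, not
    stated in the source). Sums over the number of losses suffered
    before the final, series-clinching win.
    """
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    wins_needed = best_of // 2 + 1
    return sum(
        comb(wins_needed - 1 + losses, losses) * p ** wins_needed * (1 - p) ** losses
        for losses in range(wins_needed)
    )


# Per-game probability taken from the +100 row of the table.
for n in (1, 3, 5, 7):
    print(f"Bo{n}: {series_win_probability(0.60, n):.1%}")
```

For the rows spot-checked, this lands within a percentage point or two of the table, which is consistent with the independence assumption given that the Bo1 column is itself rounded. It also shows why longer series favour the higher rated player: the more games are played, the less room there is for a single upset to decide the outcome.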
Current rating
The current ratings were released on 1 January 2018.
Rank  Δ      Player       Rating  Δ      Played  Exp. Score  Score
1     (-)    Happy        2229    (-6)   62      44.8        44
2     (-)    TH000        2176    (+26)  13      7.9         10
3     (+2)   120          2151    (+50)  14      9.0         13
4     (+2)   FoCuS        2117    (+18)  81      49.0        52
5     (-2)   Moon         2105    (-30)  4       2.0         0
6     (-2)   Lyn          2102    (+0)   0       0.0         0
7     (+3)   Life         2094    (+18)  37      23.1        25
8     (-1)   Foggy        2090    (-7)   50      30.9        30
9     (-1)   Infi         2075    (-15)  23      13.3        12
10    (-1)   LawLiet      2054    (-22)  22      14.0        12
11    (+4)   Romantic     2019    (+47)  31      15.3        20
12    (-)    WFZ          2016    (+0)   0       0.0         0
13    (-2)   Fly100%      1968    (-60)  14      7.8         3
14    (-1)   Check        1959    (-21)  32      17.1        15
15    (-1)   ReMinD       1959    (-16)  3       1.0         0
16    (+6)   Lucifer      1907    (+34)  53      26.0        30
17    (-1)   ReprisaL     1903    (-9)   2       1.3         1
18    (-1)   Yumiko       1892    (+0)   0       0.0         0
19    (+8)   So.in        1887    (+69)  20      9.0         14
20    (+5)   OrcWorker    1875    (+51)  39      15.1        20
21    (-)    Zhou_Xixi    1874    (+0)   0       0.0         0
22    (-4)   Cash         1873    (-10)  12      4.7         4
23    (-3)   XiaoKK       1867    (-8)   8       3.5         3
24    (-5)   Colorful     1857    (-22)  11      5.5         4
25    (-2)   Sok          1850    (-14)  10      4.9         4
26    (-2)   Chaemiko     1841    (+1)   11      5.0         5
27    (-1)   Sini         1830    (+7)   25      10.4        11
28    (-)    Yange        1813    (+0)   0       0.0         0
29    (-)    HawK         1799    (+0)   0       0.0         0
30    (-)    Rudan        1790    (+0)   0       0.0         0
31    (-)    tbc_bm       1788    (+3)   22      7.8         8
32    (+1)   Sonik        1767    (-8)   24      9.6         9
33    (-1)   Blade        1766    (-16)  16      6.0         5
34    (-)    Sheik        1730    (+4)   9       1.8         2
35    (+3)   Fast         1726    (+20)  11      2.9         4
36    (New)  福建男人     1721    (+36)  14      4.9         6
37    (-1)   XeLSinG      1714    (+0)   0       0.0         0
38    (-3)   无道oc       1708    (-11)  3       1.3         1
39    (-)    Anima        1694    (-4)   3       1.2         1
40    (+1)   EleGaNt      1691    (+0)   0       0.0         0
41    (-4)   HurricaneBo  1683    (-29)  2       0.7         0
42    (-)    EnChant      1671    (-2)   13      4.1         4
43    (-)    pingge       1637    (-16)  2       0.4         0
44    (-)    RaZZoRMaN    1632    (-7)   4       1.3         1
45    (+1)   Starshaped   1592    (-6)   5       1.2         1
46    (+1)   TGW          1569    (-22)  5       0.8         0
47    (+3)   Edo          1558    (-4)   4       1.1         1
48    (-)    Imperius     1550    (-29)  5       1.1         0
49    (-)    14sui        1530    (-47)  6       2.4         1
50    (+1)   Deathnote    1478    (-17)  8       1.7         1