Dear @ELOn addicts,

today I'm not presenting you the next step in the evolution of rating systems. No, I'll jump straight over the dark ages of TrueSkill and Glicko to the best of the best. La crème de la crème des classements.

= Whole History Rating =
== Advantages ==
* Smurfs are quickly put in their appropriate skill group without negatively affecting the rank of people they play against at the beginning <br />An Elo reset would have less negative impact
* Pluks are recognized
* Ladders are more accurate and less affected by the outcome of single games
* Skill development during inactivity is simulated
== Uncertainty ==
WHR keeps track of skill variations and assigns a confidence interval to every value, effectively showing how accurate the value is. This uncertainty increases when no games are played for some time.

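As a rough illustration of why the interval widens during inactivity: the WHR paper models skill as a Wiener process, so the variance of a rating grows linearly with the time since the last game. The sketch below only assumes that model. The function names, the `w2` value of 300 per day and the 1.96 factor for a 95% interval are illustrative assumptions, not values taken from my implementation.

```python
import math


def projected_std(std_at_last_game: float, days_inactive: float, w2: float = 300.0) -> float:
    """Standard deviation of a rating after `days_inactive` days without games.

    Assumes a Wiener-process prior: the variance grows by `w2` per day.
    `w2` = 300 is an illustrative value, not a tuned one.
    """
    if days_inactive < 0:
        raise ValueError("days_inactive must be non-negative")
    return math.sqrt(std_at_last_game ** 2 + w2 * days_inactive)


def confidence_interval(rating: float, std: float, z: float = 1.96):
    """Symmetric interval around `rating`; z = 1.96 assumes a 95% normal interval."""
    return rating - z * std, rating + z * std


# Hypothetical player: rating 200 with a standard deviation of 40 at their last game.
for days in (0, 30, 365):
    std = projected_std(40.0, days)
    low, high = confidence_interval(200.0, std)
    print(f"{days:>3} days inactive: [{low:.0f}, {high:.0f}]")
```

The center of the interval stays where it was; only the width grows, which is what the graphs further down show for players who stopped playing for a while.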
== Rating History ==
A "rating" is always the whole history of a player's skill, from when they started playing ZK to their latest game.

== Time travel ==
Ratings are always adjusted as a whole. This means that past skill values are changing all the time. For example, naturally good players will end up with a high starting value.

So if you lost vs Godde in one of his first games, the rating system will discover that Godde is actually a very good player and make sure your rating isn't negatively affected by losing to him while he wasn't properly rated yet.

Another example would be a group of friends always playing with each other. With the current system, their ratings would form a local system that is not affected by how the group compares to outsiders. With WHR, if a single member of the group went and played vs an outsider, the Elo of all group members would be adjusted.

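The group-of-friends effect can be sketched with a much simpler model than WHR itself: a static Bradley-Terry fit with a Gaussian prior centered on zero, where every rating is re-estimated jointly from all games. WHR additionally lets each rating vary over time, so this is only a toy version of the idea. The player names, the game list, the prior variance and the gradient-ascent settings are all made up for illustration.

```python
import math


def fit_ratings(games, prior_var=1.0, steps=2000, lr=0.1):
    """Jointly fit one rating per player by gradient ascent on the log-posterior.

    games: list of (winner, loser) tuples.
    Model: P(i beats j) = 1 / (1 + exp(r_j - r_i)), with a N(0, prior_var) prior
    on every rating. Ratings are in natural-log units, not Elo points.
    """
    players = sorted({p for game in games for p in game})
    r = {p: 0.0 for p in players}
    for _ in range(steps):
        # Gradient of the Gaussian prior pulls every rating towards zero.
        grad = {p: -r[p] / prior_var for p in players}
        for winner, loser in games:
            p_win = 1.0 / (1.0 + math.exp(r[loser] - r[winner]))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for p in players:
            r[p] += lr * grad[p]
    return r


# Hypothetical data: A, B, C only play each other; X is a strong outsider.
friends = [("A", "B"), ("B", "C"), ("A", "C"), ("B", "A")]
outsiders = [("X", "Y")] * 5

before = fit_ratings(friends + outsiders)
after = fit_ratings(friends + outsiders + [("A", "X")])  # A beats the outsider once

for name in "ABC":
    print(name, round(before[name], 3), "->", round(after[name], 3))
```

Only A played the outsider, but B and C move as well, because their results against A are now measured against a re-evaluated A.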
== Examples ==
Enough talking, let's see some graphs!
(all time 1v1 ratings)

These ratings are centered around zero, so the average nub will have 0 rating.

https://i.imgur.com/HXJuDZx.png
https://i.imgur.com/AaoCWnV.png
https://i.imgur.com/F5HTOnl.png
https://i.imgur.com/2vJqyMl.png
https://i.imgur.com/UcsxKV1.png

Randy started higher than Firepluk ever went :/

And to demonstrate that the algorithm doesn't just boost everybody's ego by starting above zero:
https://i.imgur.com/OCrc1yn.png
But he quickly made his way up ;)

If you're wondering what the x-axis means: it's the battleID divided by 200 ("Days"). The system assumes that skill stays constant within one day. The y-axis shows the minimum and maximum of the confidence interval, with the center marked as well.
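A minimal sketch of that x-axis bucketing, assuming the division is an integer division so that all battles in the same block of 200 IDs share one "day" (the helper names are hypothetical):

```python
from collections import defaultdict

BATTLES_PER_DAY = 200  # from the post: battleID divided by 200


def battle_day(battle_id: int) -> int:
    """Map a battleID to its 'day' index (integer division is an assumption)."""
    return battle_id // BATTLES_PER_DAY


def group_by_day(battle_ids):
    """Group battleIDs by day; skill is treated as constant within each group."""
    days = defaultdict(list)
    for battle_id in battle_ids:
        days[battle_day(battle_id)].append(battle_id)
    return dict(days)


print(group_by_day([399, 400, 401, 1000]))
# prints {1: [399], 2: [400, 401], 5: [1000]}
```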

Original publication: https://www.remi-coulom.fr/WHR/WHR.pdf
My implementation is based on [url=https://github.com/goshrine/whole_history_rating]an existing Ruby implementation[/url]