
Post edit history

Evaluating rating systems

Date Editor
7/21/2022 10:14:17 AM fiendicus_prime
7/20/2022 8:38:37 PM fiendicus_prime
7/20/2022 8:37:57 PM fiendicus_prime
7/20/2022 8:37:36 PM fiendicus_prime
7/20/2022 8:37:00 PM fiendicus_prime
Finally rewrote things so I can quickly try different fudges to the win probability function from here: https://trueskill.org/#win-probability.

Anyway, I figured I'd concentrate on the way the team skill delta (delta_mu) is calculated, as that felt the most tangible to me and is also logically something that will vary between different kinds of games (chess, football, Zero-K etc). After trying various fudges, I found that a simple mean of the individual player ratings got a far better score than the sum. Even a small factor (0.9x or 1.1x) away from the mean reduced the score.

Example, 2v2-4v4 games, ranking from [s]all games[/s] (edit: oops, I meant 2v2-4v4 games):

||win_probability function||score||
||base||-0.0181||
||my fudge (`(p + 0.5)/2`)||0.0297||
||@Brackman's suggestion from https://zero-k.info/Forum/Post/250763#250763||-0.0862 (1)||
||delta mu from mean||0.0315||
||0.9x delta mu from mean||0.0312||
||1.1x delta mu from mean||0.0313||

(1) should be *4? that gives 0.0222

The delta mu from mean peak is so tightly centered on the exact mean that it can't be a coincidence. It seems likely to be a property of either the win_probability function or the scoring function? This seems to work very well for 1v1 games too, increasing the score from about 0.1 to 0.14. I haven't checked anything else yet.

None of these changes affect the success rate of the function.

For clarity, my delta mu from mean function is below:

23 ``` 23 ```
24 delta_mu = mean(r.mu for r in team1) - mean(r.mu for r in team2) 24 delta_mu = mean(r.mu for r in team1) - mean(r.mu for r in team2)
25 sum_sigma = sum(r.sigma ** 2 for r in itertools.chain(team1, team2)) 25 sum_sigma = sum(r.sigma ** 2 for r in itertools.chain(team1, team2))
26 size = len(team1) + len(team2) 26 size = len(team1) + len(team2)
27 denom = math.sqrt(size * (ts.beta ** 2) + sum_sigma) 27 denom = math.sqrt(size * (ts.beta ** 2) + sum_sigma)
28 return ts.cdf(delta_mu / denom) 28 return ts.cdf(delta_mu / denom)
29 ``` 29 ```