I have read the paper that explains the Whole-History Rating system that Zero-K has adopted. While I have my own complaints about the system, my major annoyance is that nothing tracks how much rating you gain or lose after each game. To my knowledge, the only way to do this is to keep track of your MMR before and after each game. I am wondering what other people's opinions on the current MMR system are, and whether there is a way to see how much MMR is gained and lost after each game.
+0 / -1
|
It would not make much sense to show rating change per game, since rating can change when you don't play. My rating recently went up by maybe 50-100 points when I played no MM games for about a week, presumably because some of my opponents (or their opponents, etc.) increased their ratings by winning a lot during that time.
+2 / -0
|
The critiques from myself and others generally boil down to the MMR system being written for mathematical accuracy instead of in a way that makes sense for ordinary humans. We can apply this generalization here. You don't see meaningful changes after individual games because changes on a short timescale tend to be meaningless. The MMR system doesn't care that humans can derive some enjoyment or meaning by, essentially, indulging in the gambler's fallacy or Pavlovian conditioning. I consider attempts to add such feedback to have failed because the underlying system does not contain any human-meaningful concept of a post-game MMR adjustment. You can see rank changes under "Show Winners", but they don't say much. For example, some of your opponents in this game http://zero-k.info/Battles/Detail/538481?ShowWinners=True gained rank while others lost it.
+2 / -0
|
You don't directly gain/lose rating depending on games. As cathartes says, it's pointless to show absolute changes, as rating is simply a function that is fit to best describe your past games. Your fit can be completely different once your previous enemies play more games.

What exactly do you want the system to show you? For example, you could have been losing against a smurf, and using your way of tracking rating before/after each match, you would see a huge rating loss. Then later, while you don't even play any games, you would suddenly gain a lot of rating, only because the system realized the smurf is actually a good player. The games of your enemies impact your rating just as much as your own games. The history shown on the Rating statistics page is automatically updated to reflect these adjustments. So while it would correctly display the fact that you didn't actually lose much rating to the smurf, it would also neglect the fact that you had temporarily lost some rating.

I've implemented a way to track rating changes after each match, like you do by manually checking ratings. That's what GoogleFrog is talking about. This is really only for debugging, and I'd rather remove it completely (right now it's only visible to admins), because it gets misinterpreted all the time. The changes you see from one game to the next are often heavily skewed by the actions of your enemies. If it turns out some of your previous victories were easier than expected, the next victory can bring a relative rating loss, even though the game itself obviously couldn't lower your rating. This is especially visible in team games, which by themselves have nearly no instant impact on your rating. [Spoiler] Hardly anyone's rating change here was caused by the game itself. Otherwise there would be no way for people to lose rating over a victory or gain on a loss.
+1 / -0
|
There's the graphs which show rating change over time which is probably more useful than per-game changes: https://zero-k.info/Charts/Ratings?RatingCategory=2&UserId=37733&From=04%2F01%2F2008&To=05%2F22%2F2018
+0 / -0
|
What you'll see on these graphs, though, is that your rating is pretty much a constant value. This is due to your short playing time. The system intentionally doesn't model skill changes in periods as short as the time since the Steam release. The performance of most Steam players will still vary wildly between games, and trying to model this noise would make the system much less accurate. So instead of modeling growth from day 1 until now, the system mostly just averages all your games to get a more reliable rating. For an example of how this can look after a few years of play, when the system has enough data to properly model growth, try this.
+2 / -0
|
|
quote: What you'll see on these graphs though, is that your rating is pretty much a constant value. This is due to your short playing time. |
Unless you've been here for way too long. Then you can see important trends.
+8 / -0
|
Do older games get less weight over time or is your first game still as relevant as your last one?
+0 / -0
|
You could probably imagine the rating curve as a moving average, where games on the evaluated point are weighted fully and the ones further away are weighted with 1/sqrt(days).
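That analogy can be sketched in a few lines of Python. This is only an illustration of the moving-average picture described above, not the actual WHR implementation; the exact kernel (full weight for games on the evaluated day, 1/sqrt(distance in days) for games further away) is taken directly from the sentence above.

```python
import math

def smoothed_rating(samples, eval_day):
    """Weighted average of per-game rating samples around eval_day.

    samples: list of (day, rating) pairs.
    Games on the evaluated day get full weight (1.0); games d days
    away get weight 1/sqrt(d), so older games still count, just less.
    """
    num = 0.0
    den = 0.0
    for day, rating in samples:
        dist = abs(eval_day - day)
        weight = 1.0 if dist == 0 else 1.0 / math.sqrt(dist)
        num += weight * rating
        den += weight
    return num / den

# A game today and one 4 days ago: the old game gets weight 1/2,
# so the estimate sits closer to today's result.
print(smoothed_rating([(4, 1700), (0, 1500)], eval_day=4))
```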
+0 / -0
|
Can we compare the old and new rating systems? For example, I expect to see an increase in the duration of balanced games with the new system ("balanced games" being games within a defined win/loss rate range, e.g. [45%-55%]).
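As a concrete version of that metric (purely a sketch: the 45-55% band comes from the post, but using each game's predicted win probability as the balance measure, and the tuple format, are my assumptions, since no standard data export is mentioned):

```python
def balanced_game_durations(games, lo=0.45, hi=0.55):
    """Return the durations of 'balanced' games, i.e. games whose
    predicted win probability falls in [lo, hi]. Comparing the mean of
    this list under the old vs new system would test the expectation.

    games: list of (predicted_win_probability, duration_minutes).
    """
    return [dur for prob, dur in games if lo <= prob <= hi]

games = [(0.50, 30), (0.70, 12), (0.46, 25), (0.30, 10)]
durations = balanced_game_durations(games)
print(sum(durations) / len(durations))  # mean duration of balanced games
```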
+1 / -0
|
It's pointless for teams with trolls, Slaab: one decides to do trollshit and the whole team is screwed xD, and rating has nothing to do with it...
+2 / -0
|
GoogleFrog quote:
The critiques from myself and others generally boil down to the MMR system being written for mathematical accuracy instead of in a way that makes sense for ordinary humans. We can apply this generalization here.
You don't see meaningful changes after individual games because changes on a short timescale tend to be meaningless. The MMR system doesn't care that humans can derive some enjoyment or meaning by, essentially, indulging in the gambler's fallacy or Pavlovian conditioning. I consider attempts to add such feedback to have failed because the underlying system does not contain any human-meaningful concept of a post-game MMR adjustment.
|
Seems fixable. Possible solution:
- keep the current system as a hidden Elo, or "true Elo"
- make up a fake Elo which is visible instead of the hidden Elo, and which has a few simple rules:
  - never lose virtual skill number on a victory
  - never gain virtual skill number on a loss
  - adjust the size of the virtual skill number change according to its position relative to the true Elo, so that they converge over time; e.g. smallest possible change: +/-5, greatest possible change: +/-50

This way we lie to ourselves with the new system so we keep our sanity and everyone sees an intuitive number. Examples:
- your true Elo: 1800, your virtual skill number: 1600, you win: you gain around 40-50 virtual skill points
- your true Elo: 1600, your virtual skill number: 1800, you win: you gain 5 virtual skill points
- your true Elo: 1700, your virtual skill number: 1700, you win: you gain the true Elo difference, with a minimum of 5
- your true Elo: 1600, your virtual skill number: 1800, you lose: you lose 40-50 virtual skill points
- your true Elo: 1800, your virtual skill number: 1600, you lose: you lose 5 virtual skill points

You could adjust the limits and quantities to make them converge more slowly. There's a math formula which describes the relative change left to make up, depending on how you would like it to converge; a limit of base (25?) + sqrt(difference) seems like one way to go. I'm not sure how dense the WHR system is; possibly changes should be around +/-1/10, not 5/50, and scaled up with the size of the teams in the game. Btw, [GBC]1v0ry_k1ng, it's cool you are still with ZK after so many years and little time to play :)
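The per-game update rules above can be written as a small function. This is only a sketch of the proposal as stated: the 5/50 bounds come from the post, while reading the "adjust according to relative position" rule as clamping the true-vs-visible gap into [5, 50] is my interpretation of the examples, not an agreed design.

```python
def visible_delta(true_elo, visible, won, min_step=5, max_step=50):
    """Change applied to the visible 'virtual skill number' after a game.

    Wins never lose points and losses never gain points; the step size
    is the true-vs-visible gap clamped to [min_step, max_step], so the
    visible number drifts toward the hidden true rating over time.
    """
    gap = true_elo - visible
    if won:
        return max(min_step, min(max_step, gap))
    return -max(min_step, min(max_step, -gap))

# Replaying the examples from the post:
print(visible_delta(1800, 1600, won=True))   # +50 (max gain: far below true Elo)
print(visible_delta(1600, 1800, won=True))   # +5  (min gain: far above true Elo)
print(visible_delta(1600, 1800, won=False))  # -50
print(visible_delta(1800, 1600, won=False))  # -5
```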
+3 / -0
|
A rating system that is purely cosmetic and separate from the true rating system used for actual matchmaking and balancing may make sense for seasonal leagues. Like, every month (or every few months), obliterate the visible ladder entirely and see who climbs higher. Give some badge to the winner of each league. I don't think it makes sense for a long-term ranking system of any sort.
+0 / -0
|
Why do you think so? It's meant to converge with true rating so it's not really that separate. I don't see why purging all ratings seasonally is necessary for this to be a good idea.
+0 / -0
|
@zenfur's idea sounds good. Seasonal ladder resets sound good as well but separately from zenfur's idea. I think these should just use plain old Elo if they exist.
+0 / -0
|
Rating for humans is here: most people don't understand Elo. Even fewer people understand WHR. Neither is user-friendly, but feel free to develop your own rating system that is understandable by a slightly bigger minority.
+3 / -0
|
I don't think your "feel free" call is sincere. I think you did a great job improving the system and I want to keep it. The only regrets I had were that it changes when you're not playing (seemingly unintuitive: you could lose points after a victory) and that plotting it is deprived of meaning due to the moving-average part; it does not plot true progression, or the scale is messed up. The intuition people have about Elo is simple and usually aligned with deeper understanding: the higher, the better you are; you are rewarded points for winning; and it does not change when you don't play, similar to a score in a game. That's what I think could be better addressed. Don't get me started about rank colors being intuitive for hoomans... There's plenty of examples on the Fail/Funny thread. Also, I'm not sure it's a good idea either, but I wanted to discuss it.

@edit Personally I think we should pay less attention to score or WHR and focus more on having fun and enjoying the game :) That's another subject though. I'd be happier if I could disable the whole "show my WHR" feature or the top-50 ladder list.
+0 / -0
|
|
I'd say @zenfur's idea was worth discussing, but I would not introduce arbitrary distortions between WHR and the visible rating now that we already have the nice WHR system. Also, it was clear that the Steam release would lead to an increase in the veterans' ratings while reducing the new players' ratings, to keep the average constant. Of course, the same would have happened with Elo, but only for the players who were actually playing after the Steam release. With WHR, this effect also nicely translates to the players who were not playing since the release. I had expected this effect, so WHR also works intuitively as expected in this regard. Furthermore, I think that the current WHR standard deviation cutoff used to translate between real WHR and visible WHR should be increased from 20 to 25.
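For readers wondering what a "standard deviation cutoff" could look like in practice: Bayesian rating systems track both a mean and an uncertainty per player, and one common pattern is to withhold the displayed rating until the uncertainty drops below a threshold. The sketch below is purely illustrative and is not Zero-K's actual real-to-visible WHR translation; the function name, the withhold-vs-show behavior, and treating the cutoff as a stddev threshold are all assumptions.

```python
def displayed_rating(mean, stddev, cutoff=20.0):
    """Hypothetical visible-rating rule: players whose rating
    uncertainty (stddev) is still above `cutoff` are treated as
    provisional and get no visible rating yet."""
    if stddev > cutoff:
        return None  # provisional: not enough data for a stable estimate
    return mean

# Raising the cutoff from 20 to 25 would make this player's rating visible:
print(displayed_rating(1800, 23, cutoff=20.0))  # None (provisional)
print(displayed_rating(1800, 23, cutoff=25.0))  # 1800
```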
+0 / -0
|