Ladder thought

PRO_Dregs

4 years ago
(edited 4 years ago)

Last night after a few games with RyMarq I made a passing comment that @Godde and Manu12's positions on the ladder were protected in some way, thanks to WHR inflation. Reflecting on my words and the games I had with RyMarq, I figured out the depths of what I meant.

In my 4 games with RyMarq, I won 3 and he won 1. This netted me a -50 Current rating. From a WHR perspective, it's not worth me playing against this guy. I'll never continue to progress up the ladder. This is in part due to my own WHR being inflated. But then I thought more about the situation. Because their WHR is so high, @Godde / Manu12 likely don't even often get matched up with players in the WHR bracket that can substantially diminish theirs. Do they have an equivalent of RyMarq? Is that me, randy or izi? Has WHR inflation made it too much of a risk for them to play against us, thus leaving us forced to take matches from potential WHR assassins and inevitably never take their protected spot?

I watched my CR go up past 3300 this week. It was higher than @Godde's, especially after me and randy gave him a night of manhandling. But in the back of my mind, I knew it didn't mean anything because my ladder rating was barely moving an inch and @Godde's ladder and CR were barely moving down facing us. Though, that's a separate observation.

I wonder if DeinFreund has anything to say on the nature of the issue. The greater the WHR inflation has become, the more it has undermined the matchmaking algorithm, preventing the #1/#2 spots (roughly 200 LR higher than #3) from fighting incredibly skilled up-and-comers from the top 20 who provide great risk to me, but none to them through omission.

Edit: Another way of wording it is that people in and around my position (randy got beat by rymarq too recently) are prone to matches against very skilled players who haven't yet "benefitted" from inflation, and our ratings get toasted when this happens, but the top positions are not exposed to this "inverse opportunity" nearly as much.

Inb4 "just don't lose to these players". Unrealistic, people get better before their WHR catches up and with all the riot-raiders being questionably OP these days, it's really easy to get taken by surprise.

+0 / -0

[Fx]Drone

4 years ago

it's almost like the old ladder- using elo, split into 1v1 and teams- was superior in every way

+0 / -0

Brackman

4 years ago

This plot would be interesting to see. If there are deviations, then my guess would be that the values are a bit closer to 50% for high absolute values of skill difference, but only the data can tell that. If so, the current win chance calculation can be replaced by a linear combination of about 2 logistic and/or Cauchy–Lorentz cumulative distributions with different broadening, which is equivalent to a neural network with 2 neurons in 1 layer. The number of free parameters can be reduced to 2*number of neurons minus 2 by making the coefficient sum = 1 and normalizing the broadening.
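A minimal Python sketch of the kind of win chance function described above, mixing one logistic and one Cauchy–Lorentz cumulative distribution. The names `mixture_win_chance`, `weight`, `width_ratio` and `BASE_SCALE` are made up for illustration, the Elo-style base width of 400 / ln(10) is an assumption, and the default parameter values are placeholders rather than fitted results. With the coefficients summing to 1 and the logistic width fixed as the normalization, two free parameters remain, matching 2*2 - 2.

```python
import math

# Assumption: the base width is the Elo-style logistic 1 / (1 + 10 ** (-d / 400)),
# which in base e corresponds to a scale of 400 / ln(10) rating points.
BASE_SCALE = 400.0 / math.log(10.0)


def logistic_cdf(x):
    """Standard logistic CDF, written with tanh so large |x| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def cauchy_cdf(x):
    """Standard Cauchy-Lorentz CDF; its tails decay much more slowly."""
    return 0.5 + math.atan(x) / math.pi


def mixture_win_chance(diff, weight=0.8, width_ratio=1.0):
    """Win chance for a rating difference `diff` (own rating minus opponent's).

    weight      -- share of the logistic component; the Cauchy share is 1 - weight
    width_ratio -- Cauchy width relative to the fixed logistic width
    Both defaults are illustrative placeholders, not values fitted to any data.
    """
    z = diff / BASE_SCALE
    return weight * logistic_cdf(z) + (1.0 - weight) * cauchy_cdf(z / width_ratio)
```

With these placeholder values, `mixture_win_chance(800)` comes out lower than the pure logistic value for the same difference, i.e. closer to 50% for large skill differences, which is the deviation guessed at above. Whether the real data shows that is exactly what the plot would have to tell.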

There are different levels with respect to how dynamic such a fit function or neural network could be: The parameters can be calculated once. They can be updated regularly. They can be included in the WHR calculation like players' ratings. With a good initial guess, no additional Newton steps would be needed, but each single step would become more expensive. I think it is not necessary to go further, for example by using different parameters for different players, adding many neurons or layers, or making the neuron structure dynamic.

+0 / -0

izirayd

4 years ago

when Godde was rusty, he went from 3100 elo to 3400 without any problems and he beat all the players

+0 / -0

DeinFreund

4 years ago

esainane could you export the data of player rating difference vs victory (bool) you have now? I could make an API for you if you want. It costs absolutely nothing to get WHR data points, you can have 20k in a single request if you can fit the battle IDs.
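A rough Python sketch of what could be done with such an export, assuming it arrives as pairs of (rating difference, victory bool). The function name `empirical_win_rates` and the 100-point bin width are assumptions for illustration only.

```python
import math
from collections import defaultdict


def empirical_win_rates(samples, bin_width=100):
    """Group (rating_difference, won) pairs into bins of `bin_width` points.

    Returns {bin_start: (win_rate, game_count)}, sorted by bin_start, so the
    observed win rate per bin can be compared with the predicted win chance.
    """
    games = defaultdict(int)
    wins = defaultdict(int)
    for diff, won in samples:
        # Each bin covers [bin_start, bin_start + bin_width).
        bin_start = math.floor(diff / bin_width) * bin_width
        games[bin_start] += 1
        wins[bin_start] += 1 if won else 0
    return {b: (wins[b] / games[b], games[b]) for b in sorted(games)}
```

Keeping the game count next to each win rate matters here, since bins with large rating differences will contain few games and their win rates will be noisy.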

+0 / -0

GoogleFrog

4 years ago

DeinFreund and Brackman, how about a simple solution rather than an increasingly complex system? For example, we could just declare that players in the top 20 are always able to match against each other.

+0 / -0

DeinFreund

4 years ago

esainane

curl -X POST -H "Content-Type: application/json" --data '{"battleIds": [953624, 953536]}' http://test.zero-k.info/api/whr/battles

Coming to live soon.
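For anyone who prefers a script over curl, here is a sketch of the same request in Python using only the standard library. The URL and the `battleIds` field are taken from the command above; the helper name `build_whr_request` is made up, and the response format is not described in this thread, so nothing is assumed about it.

```python
import json
import urllib.request

# Test endpoint from the curl command above; it may change once this goes live.
API_URL = "http://test.zero-k.info/api/whr/battles"


def build_whr_request(battle_ids):
    """Build (but do not send) a POST request carrying the battle IDs as JSON."""
    payload = json.dumps({"battleIds": list(battle_ids)}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be `urllib.request.urlopen(build_whr_request([953624, 953536]))`; how to read the reply depends on what the endpoint actually returns.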

+1 / -0

DeinFreund

4 years ago

If @Godde agrees, you two can have a rating reset and have at it. Then we can check in a month just how bad WHR's prediction was.

+1 / -0

CrazyEddie

4 years ago

quote:
From a WHR perspective, it's not worth me playing against this guy. I'll never continue to progress up the ladder.

quote:
Because their WHR is so high, Godde / Manu12 likely don't even often get matched up with players in the WHR bracket that can substantially diminish theirs. Do they have an equivalent of RyMarq? Is that me, randy or izi? Has WHR inflation made it too much of a risk for them to play against us, thus leaving us forced to take matches from potential WHR assassins and inevitably never take their protected spot?

Rating is a measurement, not a competition. It's not a question of whether it's fair for one player to lose so many or so few rating points given the results of a match, it's only a question of whether it's accurate.

The WHR ranking list probably shouldn't be called a "ladder", since that normally connotes a competitive arrangement akin to an ongoing tournament, where one's place on the ladder can be modestly out of proportion to one's skill, and where placement is expected to fluctuate on a fairly regular basis simply due to variance. If we established an actual competitive ladder and de-emphasized the WHR ratings (I object to concealing them entirely, but I see no reason to object to making them non-obvious and not prominently displayed) then maybe we'll get fewer people obsessing over something they can't easily control and let them direct their competitive urges towards something that's much easier to grasp, i.e. their position on a tournament ladder.

WHR should still be used for matchmaking, of course, perhaps modified or informed by DeinFreund's and Brackman's ruminations on the limits of its accuracy (which, I suspect, are generally negligible from a practical standpoint).

+2 / -0

katastrophe

4 years ago
(edited 4 years ago)

quote:
Rating is a measurement, not a competition.

Isn't it a measurement of how far in the competition you got?

+0 / -0
