
Balancer is awful.

22 posts, 1235 views
Page of 2 (22 records)
4 years ago
Why, in every game, do I see the balancer put 1 purple with 14 bronze/red while the other team is entirely silver/gold? What logic is that?

Skill needs to be balanced evenly across the team, not rely on a single person to carry the whole team.
+3 / -0
4 years ago
I think the balancer tries to strike the best possible numeric balance between the two teams, without taking into account that a team of average players tends to be better than a team of a few strong players and a whole bunch of noobs.

Maybe a pattern like this would result in better games:
1. Best player on team 1.
2. Second best on team 2.
3. Third best on team 2.
4. Fourth best on team 1.
5. Repeat for remaining players.

You'd get roughly the same number of players at each skill level on both teams.
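
A minimal sketch of that 1-2-2-1 pattern, assuming players are represented only by a numeric rating. The function name `snake_draft` and the ratings in the usage note are made up for illustration; this is not the balancer's actual code.

```python
def snake_draft(ratings):
    """Split ratings into two teams using the 1-2-2-1 pattern described above."""
    ordered = sorted(ratings, reverse=True)  # best player first
    teams = ([], [])
    for i, rating in enumerate(ordered):
        # In every block of four picks, picks 0 and 3 go to team 1
        # and picks 1 and 2 go to team 2.
        team = 0 if i % 4 in (0, 3) else 1
        teams[team].append(rating)
    return teams
```

For example, `snake_draft([2400, 2100, 1900, 1800, 1600, 1500, 1300, 1200])` puts 2400, 1800, 1600 and 1200 on team 1 and 2100, 1900, 1500 and 1300 on team 2.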
+0 / -0
The balancer simply tries to minimize the difference in average Elo between teams. At some point it also gave a 1% weight to balancing the skill spread among teams.

If there is a clear bias towards teams with strong players or towards teams with many average players, it should be possible to use it to predict matches. Unfortunately, I haven't seen any combination that does that better than simple Elo average.

Manored, sample input for your algorithm:
1. 2000 (1)
2. 1500 (2)
3. 1500 (2)
4. 1500 (1)
5. 1500 (1)
6. 1000 (2)


2000 + 1500 + 1500 vs 1500 + 1500 + 1000?
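
The sample above can be checked numerically. This is a minimal sketch, assuming a brute-force search over all equal-sized splits that minimizes the difference in average rating; the thread does not say how the real balancer searches, and `best_split` is a hypothetical name.

```python
from itertools import combinations

def best_split(ratings):
    """Return (diff, team1, team2) with the smallest difference in average rating."""
    n = len(ratings)
    best = None
    # Try every way of choosing half of the players for team 1.
    for idx in combinations(range(n), n // 2):
        team1 = [ratings[i] for i in idx]
        team2 = [ratings[i] for i in range(n) if i not in idx]
        diff = abs(sum(team1) / len(team1) - sum(team2) / len(team2))
        if best is None or diff < best[0]:
            best = (diff, team1, team2)
    return best

diff, team1, team2 = best_split([2000, 1500, 1500, 1500, 1500, 1000])
print(diff, team1, team2)
```

On this input the pattern's split (2000 + 1500 + 1500 vs 1500 + 1500 + 1000) leaves the averages about 333 apart, while the search finds 2000 + 1500 + 1000 vs 1500 + 1500 + 1500, which has a difference of 0.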
+2 / -0
For big teams we would probably need a metric other than fairness. Just because a game is perfectly 50:50 doesn't mean it will be a good game. It might be interesting if we could also measure some game quality and try to predict that.
+3 / -0
Another strange phenomenon that can make matchmaking difficult is that many low rank players are actively hurting their team through their poor metal allocation. Low rank players can easily boost their effective elo by resigning immediately, because the metal will be better spent across the team as a whole.

Now obviously people play to enjoy the game, so new players won't just insta-resign to boost their winrate. However, in the case where a lower rank player happens to disconnect for whatever reason at the start of the game, the best player gets the extra com, and the metal that would otherwise have been used poorly is now spread out among better players.

Teams with fewer of those very new players will have a significantly better chance at winning, but this is dependent on random disconnects, which cannot be predicted by the balancer.

NOTE: This will have a larger significance in smaller games, but is still applicable to larger games as well
+2 / -0
hi
+0 / -0
The problem is a simple average of ratings is not a good predictor of which team will win. Here's a suggestion to find a better predictor:
  • Provide a public database or file of the outcomes of a large number of recent team games. Each game record lists the WHR ratings for each team member on each team, and the outcome (which team won).
  • Challenge players and developers to use the database to come up with a simple algorithm that predicts outcome based on ratings. This is a function f(team1_ratings, team2_ratings) = probability team 1 wins. People might contribute several candidate algorithms for this.
  • Find the f that fits the data best, using the cross entropy loss function, and is reasonably simple.
  • Balance teams by assigning players in a way that makes f(team1_ratings, team2_ratings) as close to 0.5 as possible.
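
The evaluation step in the bullets above could be sketched like this. It assumes each game record is a `(team1_ratings, team2_ratings, team1_won)` tuple and uses an Elo-style logistic on the difference of average ratings as one candidate f. The 400-point scale is the usual Elo convention and may not match the scale of WHR on the site; all names here are hypothetical.

```python
import math

def predict_avg(team1, team2, scale=400.0):
    """Candidate f: probability that team 1 wins, from the difference of averages."""
    diff = sum(team1) / len(team1) - sum(team2) / len(team2)
    return 1.0 / (1.0 + 10 ** (-diff / scale))

def cross_entropy(f, games):
    """Mean cross-entropy loss of predictor f over (team1, team2, team1_won) records."""
    eps = 1e-12  # keeps log() finite if f returns exactly 0 or 1
    total = 0.0
    for team1, team2, team1_won in games:
        p = min(max(f(team1, team2), eps), 1 - eps)
        total -= math.log(p) if team1_won else math.log(1 - p)
    return total / len(games)
```

A lower loss means a better fit, so competing candidates can be ranked by calling `cross_entropy` on each with the same list of games. A predictor that always answers 0.5 scores ln 2 (about 0.693), which is the baseline any useful f has to beat.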
+0 / -0
4 years ago
Yes, that is what it would look like.

I agree that a metric other than statistical chance of victory would be preferable. Statistical chance of victory doesn't tell you how fun the game will be, and while fun cannot be mathematically measured, I suspect games with better skill distribution are more fun or at least less frustrating.
+1 / -0
4 years ago
Perhaps the strength of a side could be estimated by adjusting the average rating with two functions, whrStepdown and numStepdown. whrStepdown converts whr to a number that says how much strength they contribute to a team of at least 2 people. numStepdown says what the sum of contributions gets divided by, so it's not necessarily a plain average; presumably a team with more players would be stronger than a team with fewer, if the ratings were all equal.

For a team with 3 players this would look like side_strength = (whrStepdown(player1WHR) + whrStepdown(player2WHR) + whrStepdown(player3WHR)) / numStepdown(3).

whrStepdown and numStepdown would be curves with parameters that could be adjusted to give the best fit to the historical win/loss data.
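
A rough sketch of the idea, assuming simple power curves for both functions. The curve shapes and the `whr_exp` and `num_exp` parameters are placeholders chosen for illustration, not fitted values; the real curves would come from fitting the historical data.

```python
def whr_stepdown(whr, exponent=1.0):
    """Hypothetical curve: contribution of one player's WHR to the team."""
    return max(whr, 0.0) ** exponent

def num_stepdown(num_players, exponent=1.0):
    """Hypothetical curve: divisor for a team of num_players."""
    return num_players ** exponent

def side_strength(team, whr_exp=1.0, num_exp=1.0):
    """Sum of per-player contributions divided by the team-size divisor."""
    total = sum(whr_stepdown(whr, whr_exp) for whr in team)
    return total / num_stepdown(len(team), num_exp)
```

With both exponents at 1.0 this reduces to the plain average. With `num_exp` below 1.0, a team of three 1500s comes out stronger than a team of two 1500s, which is the effect described above.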
+0 / -0


4 years ago
Berder, I think your suggestion has already occurred, albeit probably not through a nice API.
+1 / -0

4 years ago
Rating evaluation thread
https://zero-k.info/Forum/Thread/22898

Repo with a battle parser, some downloaded battles and evaluation code for Elo and WHR.
https://github.com/DeinFreund/ZKForumParser
+2 / -0
Another interesting addition to the balancer would be the calculation of synergy between specific players, i.e. how much more likely/unlikely they are to win if balanced to the same team. This could be then added as a rating modifier on top of their regular WHR.
+2 / -0

4 years ago
I tried that and it's near impossible. There are too many player combinations and too few repeated combinations, even if you just look at two player pairs.
+3 / -0
Although it is not possible, I dream that the ZK site could just let people "bet" that some idea would be better than the current system, and use the money for future development.

On a serious note, please consider making the effort to try your ideas numerically. I am sure, based on all the conversations I have seen, that a lot of things have been tried (DeinFreund has tried a lot of ideas), but it is not as easy as it might seem to improve the system when you consider all cases ...

EDIT: and TBH, it is not exceptional to be in a battle where I say "this is bad balance, for sure my team will lose" and then the team wins (also without trolling/disconnects, etc.)...
+2 / -0

4 years ago
The balancer also sometimes gets it right when you least expect it. I've been on teams that I considered subpar, and we ended up winning handily. There's a thing that sometimes happens when you get several high elo players on a side, where they seem to all pursue troll strategies in the belief that the others will carry. They do stuff like rush eco, striders, ramps etc while the more average players expand fast and pump out units ... GG.
+0 / -0
4 years ago
So where can a list of multiplayer battles with whr + outcomes be found? Would I need to scrape the individual https://zero-k.info/Battles pages and cross reference with scraped player whr from their user pages? Or is there a more convenient way?
+0 / -0

4 years ago
quote:
The balancer simply tries to minimize the difference in average Elo between teams.


Wait, so superior numbers don't factor into it at all atm (save for the presumable constraint that the difference in numbers must be at most one)?

For example, given ratings: a, b, b, b, c with equally spaced differences (c - b = b - a > 0),
is a, c vs b, b, b (diff of averages = 0)
always considered fairer than a, b vs b, b, c (diff of averages > 0)?

I know 'a' gets an extra com, but APM are a thing...
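
The two splits can be worked through with concrete numbers. The values b = 1500 and a spacing of 300 are arbitrary picks for illustration, and `avg_diff` is a hypothetical helper.

```python
from fractions import Fraction

def avg_diff(team1, team2):
    """Absolute difference between the two teams' average ratings, kept exact."""
    return abs(Fraction(sum(team1), len(team1)) - Fraction(sum(team2), len(team2)))

b, d = 1500, 300       # arbitrary middle rating and spacing
a, c = b - d, b + d    # a = 1200, c = 1800

print(avg_diff([a, c], [b, b, b]))  # 0
print(avg_diff([a, b], [b, b, c]))  # 250
```

In general the second split differs by 5/6 of the spacing, so a balancer that looks only at averages would pick a, c vs b, b, b.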

+1 / -0


4 years ago
quote:
Low rank players can easily boost their effective elo by resigning immediately, because the metal will be better spent across the team as a whole.

Optimal play is to find the highest ranked player on your team, build caretakers next to his factory and make them assist.
Can also micro cloaked scouts (widows and gremlins) in the late game and build E grid in the mid-game.

Someone playing like that could easily get to 2000 ELO in teams if they only play in games that have purples. :D
+3 / -0
4 years ago
There are a whole bunch of high-micro-high-reward units that could benefit from dedicated newbs. Pretty much all the cloaked units with the right ball size. Gunships to a lesser extent, if you play it safe. Planes with sufficient study of this specific art.
+1 / -0
Berder, the repo I linked contains both a script to scrape the battle pages and already downloaded battles for analysis. It has an implementation of WHR so you can calculate the WHR ratings and predictions for every match. It's what I use to evaluate changes to WHR.
+1 / -0