
Balancer broken?

So... are you saying each player should have an elo value for their pairing with every other player (in the extreme case)? So for 1000 players there would be 1 million values?

This system might have some benefits for a small player base like ours where the same people are playing together constantly. But if the only balance information you're using is "how much did player x win with player y" the system falls apart as soon as x and y don't see each other more than a handful of times.
In other words: "training" the system (as in finding everybody's elo) will take a massive number of games (well, unless we keep at it with 12v12). Not to mention that you're giving up on some measure of "general competence" (what elo currently represents) in favor of measuring only player synergy.

The idea is cool and might actually have potential in the current scenario, but is that what we want to settle on?
+0 / -0


8 years ago
My idea was to incorporate synergy into the formula (though I don't know the first thing about how to do it). Obviously strong synergy (such as that between sprung and brofessional) will have a significant impact on the outcome AND also on the *enjoyability factor* (<-- MEGA ULTRA IMPORTANT) of a game, while the inverse will also be true.

I don't think that general competence alone is enough to produce well-balanced and enjoyable games. Synergy has some part in the formula (perhaps as a favorable weight for two players?), but maybe using it as the entire formula is a bad idea for the reasons you described. I do think that general competence is a major part of it, but we should also find a way to incorporate synergy into the equation. Clans do work somewhat, but there are some people in two different clans who work really well together.

Idea: maybe synergy could be some sort of weight that starts off at 1 and climbs/falls to some maximum influence and works as a modifier to the average general competency of two players?

Ideally the balancer should take synergy into consideration (and even reward it by placing players who work together well into the same team!) to not only produce better balances, but also more rewarding games for both sides. If one of the math wizards could find a way to do this, that'd be great.
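A minimal sketch of what such a pair-synergy weight could look like (the bounds, step size and update rule are invented here purely for illustration; nothing like this exists in the balancer):

```python
# Hypothetical pair-synergy weight: starts at 1.0 and drifts towards a capped
# maximum/minimum as the pair wins or loses games together.
MAX_INFLUENCE = 1.10   # synergy can boost a pair's effective strength by at most 10%
MIN_INFLUENCE = 0.90   # ...or reduce it by at most 10%
STEP = 0.01            # how quickly the weight reacts to shared results

synergy = {}  # (player_a, player_b) -> weight, defaulting to 1.0

def pair_key(a, b):
    return tuple(sorted((a, b)))

def update_synergy(a, b, won_together):
    w = synergy.get(pair_key(a, b), 1.0) + (STEP if won_together else -STEP)
    synergy[pair_key(a, b)] = min(MAX_INFLUENCE, max(MIN_INFLUENCE, w))

def effective_pair_strength(a, b, elo_a, elo_b):
    # the weight acts as a modifier on the pair's average general competency
    return synergy.get(pair_key(a, b), 1.0) * (elo_a + elo_b) / 2
```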
+0 / -0
My 2^8-th post: If I wanted to consider synergy, I would not store values for every pair of players, but give every player one elo for his cooperation with low-elo players and one for his cooperation with high-elo players, and then always use a weighted average of both, where the weightings depend on the average elo of the rest of the team.[Spoiler]But actually I think that this is too complicated for little use. What I would really do is give every player a small-teams elo (maybe including 1v1) and a big-teams elo and then always use a weighted average of both depending on team size.[Spoiler]
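A rough sketch of that second idea (the cutoffs and the blending function are assumptions; the post above doesn't specify them):

```python
# Blend a player's small-teams elo and big-teams elo by team size.
# The cutoffs and the linear blend are assumptions for illustration only.
SMALL_TEAM = 2   # at this team size or below, use only the small-teams elo
BIG_TEAM = 8     # at this team size or above, use only the big-teams elo

def effective_elo(small_teams_elo, big_teams_elo, team_size):
    if team_size <= SMALL_TEAM:
        w = 0.0
    elif team_size >= BIG_TEAM:
        w = 1.0
    else:
        w = (team_size - SMALL_TEAM) / (BIG_TEAM - SMALL_TEAM)
    return (1 - w) * small_teams_elo + w * big_teams_elo
```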
GBrankTheEloIsALie is right that my rather new formulas in this thread don't consider elo deviation and don't make a difference as long as the elo difference is 0, but he also quoted my old formula, which does consider it. Probably minimizing deviation is just too computationally expensive. I guess the expense is due rather to going through all player compositions than to calculating the objective function. It can be drastically reduced by using a Markov chain (a powerful method for tackling NP problems efficiently, though it maybe sacrifices too much accuracy for this relatively simple problem; it only makes sense for very high player numbers, otherwise use brute force): [Spoiler]
There is a much newer system that does consider deviation without an additional formalism, the "teamstrength system". PLrankAdminSprung had the initial idea and I formalized it. It is currently my favourite system. But it gives teams with high elo deviation better win chances. Balancing with it neither increases nor decreases deviations! There are 2 ways to apply this system to team FFA. The easier one, which differs more from the current system, makes it more difficult to consider deviation. The principles of teamstrength, minimizing deviation, using Markov chains for high player numbers and making win chances more distinct depending on player number can all be combined to create an ultimate system.[Spoiler]
+2 / -0
8 years ago
I have some further ideas:
1. Maybe deviation is only so high in big teams currently because the balancer goes through those constellations first and then stops once the elo difference is close enough to 0. In that case, using Markov chains with the current system would already reduce computation time and deviation drastically.

2. When using Markov chains with deviation, the first steps could be done without deviation to make full use of GBrankTheEloIsALie 's simple substitution principle and save computation time. The next steps could be done with 1-deviation and then with 2-deviation.

3. If practical tests reveal that Markov chains only become profitable at unrealistically high player numbers, but the player number is still high enough that brute force is too expensive, then only brute-force constellations that are similar to the start constellation constructed above. For low player numbers, brute-force them all. Among those that are best without deviation, calculate E with deviation and choose the best. This would already reduce deviation and computation time a lot.[Spoiler]
+0 / -0
8 years ago
Now I have found out that my two-year-old team FFA generalization of the term to be minimized for balancing, which GBrankTheEloIsALie quoted, does not correctly correspond to my team FFA prediction. The formula does try to give each of the N teams a win chance of 1/N, but it calculates the difference from that in a strange way, because it uses elo differences instead of probability differences, which matters for team FFA. If p_k is team number k's win chance according to my team FFA prediction, Var_k team k's elo variance and teamsize_k the number of players in team k, then the term to be minimized for balance is
 E = sum from k=1 to N (teamsize_k*(1/N - p_k)² + c*Var_k) 
[Spoiler]c is how heavily variance minimization is weighted against win-chance equalization and should be around 1/10000 (or lower). Note that even with c=0 this formula is still better than the current balancing, because it considers team FFA.
[Spoiler]
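For concreteness, the term to be minimized can be written out like this (the team FFA prediction that supplies p_k is not reproduced here, so the win chances are taken as an input):

```python
# E = sum over k of ( teamsize_k * (1/N - p_k)^2 + c * Var_k )
# win_chances must come from the team FFA prediction, which is not shown here.

def elo_variance(team_elos):
    mean = sum(team_elos) / len(team_elos)
    return sum((e - mean) ** 2 for e in team_elos) / len(team_elos)

def balance_objective(teams, win_chances, c=1.0 / 10000):
    """teams: list of lists of player elos; win_chances: predicted p_k per team."""
    n_teams = len(teams)
    return sum(
        len(team) * (1.0 / n_teams - p) ** 2 + c * elo_variance(team)
        for team, p in zip(teams, win_chances)
    )
```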
Things I still don't know:
- Are all team constellations tested currently or only until elo difference is small enough? Or more generally: Where is the code for !balance?
- Should the sum of team variances or the sum of variance differences of pairs of teams or no variance at all be minimized?
+0 / -0

8 years ago
The balancer code is here. Notes:
  • it checks all combinations
  • disregard all the stdDev comments, it uses plain avgElo difference; for FFA, it only considers the two extreme teams
  • max team size is ceil(people/teams)
  • a "balance item" is a group of people which holds {sumElo, peopleCount}; normally it's {personalElo, 1} per person but a clan is treated as a singular group under clan balance
+1 / -0
Well, what do you do when you want to solve an NP problem, your objective function is slow and your search space too large? You throw heuristics at it.

Genetic algorithms:
[Spoiler]

Simulated annealing:
[Spoiler]
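For reference, a genetic algorithm for the two-team case could be shaped roughly like this (a from-scratch sketch, not the contents of the spoiler above; population size, mutation rate and fitness are arbitrary choices):

```python
# Generic genetic-algorithm sketch for 2-team balancing (illustration only).
import random

def avg_diff(elos, assignment):
    a = [e for e, t in zip(elos, assignment) if t == 0]
    b = [e for e, t in zip(elos, assignment) if t == 1]
    return abs(sum(a) / len(a) - sum(b) / len(b))

def repair(assignment, target_ones):
    # Crossover can unbalance team sizes; flip random genes until sizes match.
    assignment = assignment[:]
    while sum(assignment) > target_ones:
        assignment[random.choice([i for i, t in enumerate(assignment) if t])] = 0
    while sum(assignment) < target_ones:
        assignment[random.choice([i for i, t in enumerate(assignment) if not t])] = 1
    return assignment

def mutate(assignment):
    # Swap one member of each team; preserves team sizes.
    i = random.choice([i for i, t in enumerate(assignment) if t == 0])
    j = random.choice([j for j, t in enumerate(assignment) if t == 1])
    assignment[i], assignment[j] = assignment[j], assignment[i]

def genetic_balance(elos, pop_size=50, generations=200):
    ones = len(elos) // 2
    pop = [repair([0] * len(elos), ones) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: avg_diff(elos, a))     # fitness = avg elo difference
        survivors = pop[: pop_size // 2]
        while len(survivors) < pop_size:
            p1, p2 = random.sample(survivors[: pop_size // 2], 2)
            child = repair([random.choice(g) for g in zip(p1, p2)], ones)  # uniform crossover
            if random.random() < 0.3:
                mutate(child)
            survivors.append(child)
        pop = survivors
    return min(pop, key=lambda a: avg_diff(elos, a))
```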
+3 / -0

8 years ago
Is the objective function slow?

Is the problem being exponential in difficulty even relevant for the values of N we see in practice?
+1 / -0

8 years ago
Current brute force is good.

In larger games it starts to become slow enough for the hard cap to kick in but that's not a problem. Ratings are very vague at that density anyway and an unranked room doesn't need perfect balance.
+1 / -0
I did forget to mention that the brute force is going to be less of a hassle for smaller games, where the number of combinations is still manageable.

quote:
Is the objective function slow?

Slow enough that the current (much simpler) one runs into its limits when brute-forcing higher player numbers, as PLrankAdminSprung said.

quote:
In larger games [the objective function] starts to become slow enough for the hard cap to kick in but that's not a problem.

Are you talking about the current one or the proposed one? There's a difference of about one to two orders of magnitude in flops between them... Again, keep in mind that currently you have a low fixed cost per combination tried, whereas the new formula takes longer the more players there are...
+0 / -0
I am of the opinion that improving how the algorithm chooses which permutations to check is a far more productive line of optimisation than improving how fast each permutation is checked, which I suppose is what you have said.

Something as sophisticated as GAs or simulated annealing seems like overkill but I guess it would get the job done.

EDIT - I believe the following algorithm will yield a good starting point:

1) Order the players by rating.
2) Pair the players off (#1 with #2, #3 with #4, #5 with #6, etc.)
3) Order the pairs by the difference in their rating.
4) Take the pair with the highest difference in rating and allocate them to separate teams.
5) Take the pair with the next highest difference in rating and allocate them to separate teams such that the difference in the current teams' average skill is minimised.
6) Return to (5) until all pairs are allocated.

This solution will typically not be too bad (since we use the highest-difference pairs to offset one another first, they cannot ambush us later) and has the desirable property that no team is overloaded with players of high or low skill (since each consecutive pair is allocated to a different team). Searching the neighbourhood of that point by swapping players according to some heuristic (which I think need not be complicated, preferring to make small swaps in the correct direction might be enough) ought to end up with a reasonable solution in any reasonable situation, assuming the above algorithm hasn't already found one.
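Something like this, roughly (a sketch of the steps above for the two-team case with an even player count; not the actual proof-of-concept code mentioned below):

```python
# Sketch of the pairing heuristic above (2 teams, even player count assumed).

def pairing_start(elos):
    def avg_gap(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    ranked = sorted(elos, reverse=True)                          # 1) order by rating
    pairs = [ranked[i:i + 2] for i in range(0, len(ranked), 2)]  # 2) pair off
    pairs.sort(key=lambda p: abs(p[0] - p[1]), reverse=True)     # 3) sort by rating gap
    team_a, team_b = [], []
    for hi, lo in pairs:                                         # 4)-6) allocate each pair
        # split the pair across the teams whichever way keeps the running
        # average-elo difference smallest
        if avg_gap(team_a + [hi], team_b + [lo]) <= avg_gap(team_a + [lo], team_b + [hi]):
            team_a.append(hi); team_b.append(lo)
        else:
            team_a.append(lo); team_b.append(hi)
    return team_a, team_b
```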

Happily, (based on a bit of test code) I think this algorithm will tend to perform best for large numbers of players - the larger the game, the less swapping will be necessary to end up with an adequate solution.

Notes:
[Spoiler]

I ran my proof-of-concept code on the example game in the OP and ended up with the following teams:

Team 1  Team 2
1988    1868
1852    1774
1681    1673
1665    1623
1587    1588
1494    1580
1392    1492
1315    1332
1236    1293
1202    1184

The difference in the average ELOs of the two teams is 0.25. Swapping the 1492 with the 1494 (an insignificant change in terms of the distribution of elo within the teams) reduces this to 0.05. The latter solution has the same average-ELO difference as the one in the OP, but presumably the brute force algorithm found the "bad" one first (20-choose-10 is still a fair bit less than two million I believe).

(Another note: the problem in the OP was partially caused by Sprung and Brofessional being the two highest-rated players and in the same clan.)
+2 / -0


8 years ago
AUrankAdminAquanim your algorithm (or any algorithm that assumes pairs) won't work well in certain cases. For example:
1900 1600 1600 1600 1500 1500 1500 1500
+0 / -0
Indeed it wouldn't. Cases like that are what the subsequent swapping (if necessary) is for. It particularly tends to happen when one pair has a large difference and all the others have very small ones.

(Also, most problems small enough to have a decent likelihood of being difficult in that way are also small enough to brute force. Problems large enough that a heuristic is necessary are very, very unlikely to be that bad. Furthermore, that looks like a pretty bad game to play no matter how you balance it.)
+0 / -0

8 years ago
Why is cbalance still the default in public rooms? gtfo.
+1 / -0


8 years ago
This seems like a good place to point out that there are three different issues that come up in these balance discussions. Often a discussion starts about one but then turns to another, and the original point gets lost.

  • How to search for a balanced match
  • How to identify a balanced match
  • How to select a balanced match

Those sound similar, but are different. The first is about searching the combinatorially huge space of all possible matches (the NP-hard part). The second is about determining the win likelihood of any given match. The third is about choosing the best match out of all candidate matches given not only the win likelihood but also potentially other factors, such as perceived fairness.

The Elo calculation and modification methods are yet another discussion, although they're related to the second point above.




AUrankAdminAquanim, PLrankAdminSprung, and GBrankTheEloIsALie have just now been talking about the first problem. I don't think it's a particularly important problem to solve, but certainly don't object to anyone who wants to code a different approach. My view is that any search algorithm which probably maybe comes close to finding something that might be in the neighborhood of optimum is going to be good enough for our purposes.

DErankBrackman and others usually focus on the second problem. Right now the balancer is simplistic, and it assumes the win likelihood is determined by the average Elo of the two teams (I think, right?). Brackman has proposed a few different ways to calculate the win likelihood. He says that they would produce a better-performing !predict result than the current method, and he says he has the numbers to back him up. I'll take his word for it, but unless it makes much better predictions than the current method (meaning that his methods identify a way in which the current balancer systematically creates teams where one side has an identifiable and substantial advantage that is greater than the current balancer's predicted advantage), I wouldn't spend a lot of time implementing it.
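(For reference, the textbook Elo expectation applied to an average-elo difference looks like the snippet below; whether !predict uses exactly this mapping is an assumption on my part.)

```python
# Standard Elo expected-score formula applied to the difference in team average elo.
# Whether !predict uses exactly this mapping is an assumption, not confirmed here.
def predicted_win_chance(avg_elo_a, avg_elo_b):
    return 1.0 / (1.0 + 10 ** ((avg_elo_b - avg_elo_a) / 400.0))
```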

I think the real issue that is worthy of discussion - focused, on-topic discussion - is what UArankdahn raised in his original post, echoed by @Rafal[ZK], CHrankAdminDeinFreund, and many others elsewhere on the forum. That issue is this: even assuming that !predict is correct and that the teams as matched by the balancer both have close-to-equal chances to win, the balancer sometimes produces matches which seem unfair, where "unfair" here means not merely a lopsided chance to win, but perhaps more importantly a lopsided chance to enjoy the game due to one side having a much higher number of much lower-skilled players.

Players have complained about this time and time again in the forum. And not just the trolls who like to complain about playing with newbies the way that other people like to drink beer. Reasonable and respectful players have complained. And they're not just complaining about playing with newbies, they're complaining about the unfairness of the matches. We should be taking these complaints seriously. However, in the past, it's always been met with discussions about how the prediction algorithm shows that the teams are equally likely to win (which is not the issue!) and challenges to provide a better prediction algorithm to run against historical data (which is a correct response to claims that the prediction is off, but that's not the issue!).

Only on a few occasions has anyone suggested that perhaps some matches would be better off with a somewhat less balanced predicted win probability in exchange for an apparently-better distribution of low-skilled players so as to not constantly create unenjoyable games where the high-skill players are all on one side along with all the low-skilled players, effectively playing 1v2 against the medium-skilled players.

I think this point deserves a lot more discussion than it's gotten so far.
+1 / -0
Skasi
8 years ago
quote:
Why is cbalance still the default in public rooms? gtfo.

Why wouldn't it? There's currently no other way for players who want to play together in the public room to do so reliably, unless you count !forcestart which can easily lead to ruined games.
+1 / -0
quote:
Only on a few occasions has anyone suggested that perhaps some matches would be better off with a somewhat less balanced predicted win probability in exchange for an apparently-better distribution of low-skilled players so as to not constantly create unenjoyable games where the high-skill players are all on one side along with all the low-skilled players, effectively playing 1v2 against the medium-skilled players.

I think this point deserves a lot more discussion than it's gotten so far.


That's actually what I've been talking about in the entire first page of the thread... The need to introduce a tradeoff (= new formula) between elo difference between teams and intra-team-elo-variance.
Then we diverged on how DErankBrackman's formula (or any such formula, really) makes the brute forcing a lot slower, and then we got to common optimization techniques and AUrankAdminAquanim's algorithm.

About that:
Yes, a decent starting guess and some sort of short hillclimbing (i.e. try swapping random players, accept it if the solution got better, stop if it didn't succeed x times in a row) should work pretty well. But the random swapping might increase elo variance again, potentially to the point where it's back to the old problematic cases.

PS: I decided to stop caring about the apostrophes breaking player links. Blame the one who broke it...
+1 / -0
quote:
But the random swapping might increase elo variance again, potentially to the point where it's back to the old problematic cases.

I tried a very simple descent algorithm (take the smallest step in the right direction possible, stop if there is no such step) on random data (continuous distribution, might not represent real elo data I suppose) and found that for medium-to-large problems (where the original solution isn't good enough, which is itself rare for large problems) it converges to a good-enough solution quite rapidly, without disturbing the elo variance much.

In general this depends on how much random swapping one permits oneself.

quote:
That issue is this: even assuming that !predict is correct and that the teams as matched by the balancer both have close-to-equal chances to win, the balancer sometimes produces matches which seem unfair, where "unfair" here means not merely a lopsided chance to win, but perhaps more importantly a lopsided chance to enjoy the game due to one side having a much higher number of much lower-skilled players.

I don't know that there's even a discussion to be had here. I am not sure that one-team-has-high-elo-variance games are imbalanced (and if they were I don't know which way) but I am sure that they are unfun and to be avoided where possible. What else is there to say?

The difficult part is how to accomplish that without compromising anything else.
+0 / -0
Indeed there are those 3 issues USrankCrazyEddie explained.

Imo the question of whether elo variance should also be minimized is not answered yet. It's part of the 3rd question. The elo-variance-minimization weighting in the balance function to be minimized would be so low that it usually doesn't make a difference for small games, while good balance is still possible in big games. The higher calculation time could be compensated for by a better way to iterate through combinations. But I think other issues are more important now than variance consideration.

For small and medium games brute force should still be used. The 2 million limit will not even be reached for 10v10. Also note that the current recursion tests whether all teams are smaller than or equal to maxTeamSize. Thus it really calculates fewer than N^n combinations, where n is the total number of players and N the number of teams. Afaict the number of combinations checked by the current recursion is
(N choose X) * (product from k=0 to N-X-1 ((n-k*t) choose t)) * (product from k=0 to X-1 ((n-(N-X)*t-k*T) choose T)),
where
X := n mod N
t := floor(n/N)
T := ceil(n/N). 
(!cbalance checks fewer if there are people from the same clan.) For N=2 this is n choose floor(n/2) for even teams and 2 times that for uneven ones. The theoretical minimum of combinations to be checked is a factor N! lower. I know an easy way to reduce it by at least a factor of N: always put the first player in the first team.
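A direct transcription of that count (using Python's math.comb; not taken from the balancer code) confirms the 10v10 claim:

```python
# Number of combinations checked by the recursion, per the formula above.
import math

def combinations_checked(n, N):
    X = n % N                 # number of larger teams
    t = n // N                # floor(n/N), the smaller team size
    T = math.ceil(n / N)      # ceil(n/N), the larger team size
    count = math.comb(N, X)
    for k in range(N - X):
        count *= math.comb(n - k * t, t)
    for k in range(X):
        count *= math.comb(n - (N - X) * t - k * T, T)
    return count

print(combinations_checked(20, 2))  # 10v10 -> 184756, well under the 2 million limit
```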

AUrankAdminAquanim your calculation of a start combination seems to be better than mine. Your steps are not enough on their own, though. There must be a small probability of making a bad step ("tunnel effect") which increases the function E that should be minimized. Otherwise you will probably get stuck in local minima and never find the global minimum. Always propose a random step. Accept it if it's better. Otherwise only do so with a probability of
  exp(constant*(old E - new E)). 
Stop if E is smaller than a certain limit. Or, if you have found good combinations, also check whether they are good in terms of variance. Your system can also be applied to playerstrength and teamstrength.
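Put as code, this accept/reject rule is a Metropolis-style criterion. A sketch (the swap move, the constant and the stopping limit are all tuning choices, and E could for example be the balance_objective sketched earlier):

```python
# Random-step search with acceptance probability exp(constant*(old E - new E)).
import math
import random

def propose_random_swap(teams):
    # example move: swap one random player between two random teams
    teams = [list(t) for t in teams]
    a, b = random.sample(range(len(teams)), 2)
    i, j = random.randrange(len(teams[a])), random.randrange(len(teams[b]))
    teams[a][i], teams[b][j] = teams[b][j], teams[a][i]
    return teams

def search(teams, E, constant=100.0, limit=1e-4, max_iters=10_000):
    current, current_E = teams, E(teams)
    for _ in range(max_iters):
        if current_E < limit:                     # stop once E is small enough
            break
        candidate = propose_random_swap(current)
        candidate_E = E(candidate)
        delta = current_E - candidate_E           # old E - new E
        # accept improvements always, worsenings only with probability exp(constant*delta)
        if delta > 0 or random.random() < math.exp(constant * delta):
            current, current_E = candidate, candidate_E
    return current
```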

Imo the most important thing to do now is to consider FFA, probably with a teamstrength system.
+0 / -0
quote:
There must be a small probability to make a bad step ("tunnel effect") which increases the function E that should be minimized. Otherwise you will probably get stuck in local minima and never find global minimum.

Yeah, the descent algorithm was more for the purposes of proof-of-concept than for being a finished product.

A question to ask is whether a global minimum (in terms of the objective of elo balance) is even the most desirable outcome. In a 10v10 game a difference of 1 or 2 in average elo will make very little difference, but if the 2-difference game is closer to the original point (and therefore probably the elo distributions on each team are more similar) it might be a preferable game to play.
+1 / -0