
Alternative balance

quote:
If zk grew 10-fold, it would stop being an issue simply because of segregation.

You've got the point exactly.

quote:
you can prove that games which springie deems balanced are not actually balanced at all

That would only be demonstrable if you had the historical elo values for those games; as it stands, you would likely just be exploiting your better data.
+0 / -0
EErankAdminAnarchid, I have posted an algorithm that meets your criteria several times in this thread already

quote:

1. you have a list of elos
2. sort the list in ascending order
3. remove the elos of clan members/2-com players from the list and add them to an array
4. fill the corresponding slots in the array with the elos from the list that are closest to the elos already in the array, forming the pairs for the clan/2-com players
5. then remove the remaining elos from the list and place them in the array to form the rest of the pairs
6. calculate the % win chance if each pair fought a 1v1 match (using team elo) and store it in the array
7. then just use the current method of randomly sorting the pairs until you have the least % win-chance difference
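
A minimal Python sketch of one reading of these steps (the function names, the summed-elo gap standing in for % win chance, and the random-flip loop are all illustrative; the quote leaves the exact team-assignment method open):

import random

def make_pairs(elos, constrained):
    # Steps 1-5: sort the elo list, pair each constrained elo
    # (clan member / 2-com) with the closest remaining elo, then
    # pair the leftovers in sorted order. Assumes the constrained
    # elos are drawn from elos and the player count is even.
    pool = sorted(elos)
    pairs = []
    for e in constrained:
        pool.remove(e)
        partner = min(pool, key=lambda x: abs(x - e))
        pool.remove(partner)
        pairs.append((e, partner))
    while pool:
        pairs.append((pool.pop(0), pool.pop(0)))
    return pairs

def split_pairs(pairs, tries=1000):
    # Steps 6-7: each pair sends one member to each team; randomly
    # flip pairs and keep the assignment with the smallest
    # win-chance difference, approximated here by the summed-elo gap.
    best, best_gap = None, float("inf")
    for _ in range(tries):
        flips = [random.random() < 0.5 for _ in pairs]
        t1 = [b if f else a for (a, b), f in zip(pairs, flips)]
        t2 = [a if f else b for (a, b), f in zip(pairs, flips)]
        gap = abs(sum(t1) - sum(t2))
        if gap < best_gap:
            best, best_gap = (t1, t2), gap
    return best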

+0 / -0


10 years ago
quote:
I have posted an algorithm that meets your criteria several times in this thread already

Now calculate how much win-chance accuracy it costs.
+0 / -0
none...


current algorithm (as far as I know):
sort the teams until you get a match with balanced elos
(ignoring any better matches)

this algorithm:
pair to lower variance
sort the teams until you get a match with balanced elos

the pairing should not significantly* increase the elo distance between the two teams, and it is faster than the current algorithm

*it may select something like 1640 vs 1660 instead of 1650 vs 1650 if that has better variance, but that's pretty much the limit (the current algorithm on average leaves a 27 elo gap between the two teams, so this is well within the limits)
+0 / -0

10 years ago
Wouldn't it be possible to grab the historic elo values from the replays? I know that's a shitton more work, but it would give us more accurate data...
+0 / -0
10 years ago
if it somehow got stored in the replay files, maybe; other than that I don't think so, unless you backwards-calculate the elo changes that occurred
+0 / -0


10 years ago
quote:
none...

Proof or gtfo.
+0 / -0
just ran the proof on a sample of 10 of the 11v11 games

the winning teams:
of the winning teams, 7 had positive win-chance values from my algorithm
(mean 18% higher win chance, stdev 25)
34% 5% 35% 12% 46% 56% 4% 4%

2 had negative values and still won:
-13% and -20% chance

it takes ages to process things on this tiny laptop, so I didn't do the whole available data set...

method:
transpose the teams into column format
sort the elos of the players on each team
take (team1 slot n) - (team2 slot n) for each slot
divide by 7 (accurate enough for elo differences of less than 200)
sum these to get the team win chance

if positive, the team should win
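
A minimal Python sketch of this slot-pairing predictor (assuming equal team sizes; the divide-by-7 matches the slope of the elo curve near 50%, roughly 1 percentage point of win chance per 7 elo):

def predict_win_chance(team1, team2):
    # Pair the sorted elos slot by slot and sum the per-slot elo
    # gaps divided by 7. Positive result -> team1 favoured, in
    # percentage points above 50%.
    assert len(team1) == len(team2)
    return sum((a - b) / 7
               for a, b in zip(sorted(team1), sorted(team2)))

# Example: a 50-elo total edge gives team1 roughly +7, i.e. ~57/43.
print(predict_win_chance([1700, 1600, 1500], [1650, 1580, 1520]))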
+0 / -0


10 years ago
Can you put your results up somewhere?

Even stuff like 9-10 games is quite enough for a first impression.
+0 / -0
https://docs.google.com/spreadsheet/ccc?key=0Am0J-l_0teUrdGdUbVVfSXluTUg1ajctTnBiVWVtcmc&usp=sharing

link to the google doc

it's on sheet 3
rows 26-70: sorting and calculation
starting from row 70 is what you really want to look at


I'm going to try to run it on the whole set now
+0 / -0
OK, you can't do that many in Google Docs; I only managed 73 (max is 255, but it froze) before I ran out of columns
average: +18 win chance for the winners

:D now how do you feel about gambling on some team game results?

OK, I have now done the whole 888 games in the data set that I considered valid results
I culled the rest due to:
more than 200 average elo distance between teams (roughly 300 games; the max was over 800)
smaller than 4v4 (roughly 500)
chickens (5)
highly anomalous values (|win chance| > 150) (9 total)
(4 at +150, 3 at -150; did you know one game was a 4v10, and the 10 lost)

after putting the results through Excel:
winning teams had on average a 9.48 higher win chance than the losers (stdev 49, median 10)

of the 888 games, 537 were accurately predicted
401 had a win chance of +15 or more (I would put money on these)
252 were indeterminate (win chance between +15 and -15)
234 had a win chance of -15 or less (I would have lost money on these)

my winnings are +167 (401 correct minus 234 wrong)
using a ±15 limit (a 30-point gap in win chance, i.e. roughly a 65/35 matchup)
45% chance to accurately predict the outcome of a game
28% indeterminate (games balanced enough that prediction would be unreliable)
26% chance of a false positive
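
A small Python sketch of this scoring scheme (the record format is illustrative: a signed win-chance estimate for team 1 plus the actual winner):

def betting_score(predictions, threshold=15, penalty=1):
    # +1 per correct call, -penalty per wrong call; games inside
    # the +/- threshold band are skipped as indeterminate.
    correct = wrong = skipped = 0
    for chance, team1_won in predictions:
        if abs(chance) < threshold:
            skipped += 1
        elif (chance > 0) == team1_won:
            correct += 1
        else:
            wrong += 1
    return correct - penalty * wrong, correct, wrong, skipped

# The thread's numbers: 401 correct, 234 wrong -> 401 - 234 = +167.
# Raising penalty to 1.5 or 2 reproduces the harsher weighting
# discussed a few posts below.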
+0 / -0
10 years ago
the .xlsx file
https://drive.google.com/file/d/0B20J-l_0teUrUHd1cGR0RlpFNFk/edit?usp=sharing

sheet 1, calculating the win chance
sheet 2, analysis
+0 / -0


10 years ago
and how does that compare to the winnings of the existing algorithm?

I will run my own method past this when I get a chance; pretty sure it will clean up!
+0 / -0


10 years ago
quote:
of the 888 games, 537 were accurately predicted

That's 10 percentage points better than a coin flip. And you had improved elo ratings available to you, compared to what Springie had. >.>
+0 / -0
vanilla:
without limits: 509/901, 56% of all game outcomes, +117 points
with limits:
+1000 scores 160 correct, 176 wrong
+500 scores 190 correct, 199 wrong
+250 scores 291 correct, 237 wrong (32% accuracy, 26% wrong)
(it's not any better (actually slightly worse) than random guessing)


my algorithm:
without limits it predicts 537/888, 60% of all game outcomes, +186 points
with limits it scores:
401 correct, 234 wrong (45% accuracy, 25% wrong)

my algorithm is better in 69 cases (and would win me more money gambling)

EErankAdminAnarchid
10% better than random, in a system that is meant to make outcomes random, is significant
also, you aren't using the limits; with limits I can accurately predict the outcome of a game 45% of the time with a high level of certainty that I will be right (if this were gambling, I would be rolling in money)

(30% of the games were too balanced to tell anything concrete from them)


also, if you weight being wrong negatively
(-1.5 or -2 per instance instead of the -1 that is used)
vanilla cannot get a positive score at all
mine breaks even at a +40 limit
quote:
I did too much maths in this thread already.
@Hower

Well, until @KingRaptor's statistics there was no math in this thread, except GBrankKyubey mentioning the probability relation of the elo system, a term of elo average and deviation difference to be minimized, and maybe GBrankTheEloIsALie and RUrankYogzototh proving some dubious statements wrong.

At first I wanted to spare you this, but maybe it's time for a little more maths.

Definitions

|N is the set of natural numbers, |R the set of real numbers. P is the given elo distribution, a multiset (https://en.wikipedia.org/wiki/Multiset) of elos with multiplicity function I_P: |R -> |N. N in |N is the desired number of teams: in 1v1 and team games N=2, in FFA N=#P, in team FFA (the general case) N in |N intersection [2, #P]. For an algorithmic solution elo vectors are better, but for the mathematical theory I'm going to use multisets, because order doesn't matter.

To be found: A tuple T=(T_1,...,T_N) of elo multisets with (multiset sum)_(k=1)^(N)(T_k) = P (every player is in exactly one team) and max_(k=1)^(N)(#T_k)-min_(l=1)^(N)(#T_l) <= 1 (maximum difference between team sizes is 1). T_k is the elo multiset of team number k.

avg(T_k):=1/#T_k * sum_(l in T_k)(l)
Var(T_k):=1/#T_k * sum_(l in T_k)((l-avg(T_k))²)

Here every player is weighted equally, regardless of 2nd coms. Double-weighting 2nd-com players would require a double income share for them, as FIranksprang said in the 2-coms issue. A solution would be weighting the 2-com player with 1+w (the following terms become easier when I call it 1+w instead of w), where 0 <= w <= 1. Actually the ideal value for w depends on the map size, but maybe an average value can be chosen. ru(#P/N) is the maximum team size and thus the number of coms per team, where ru is the ceiling function (rounding up). Consequently the number of extra coms of team number k is ru(#P/N) - #T_k =: X_k. We can then define generalized terms for average and variance:

avg_(w, X_k)(T_k) := 1/(#T_k + w*X_k) * (sum_(l in T_k)(l) + w*X_k * max(T_k))
Var_(w, X_k)(T_k) := 1/(#T_k + w*X_k) * (sum_(l in T_k)((l - avg_(w, X_k)(T_k))²) + w*X_k * (max(T_k) - avg_(w, X_k)(T_k))²)
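
A minimal Python sketch of these two generalized terms (the function names are mine; per the formulas, the extra-com weight attaches to the team's top elo):

def gen_avg(team, w, extra_coms):
    # avg_(w, X_k): max(team) is counted with extra weight w per
    # extra commander.
    weight = len(team) + w * extra_coms
    return (sum(team) + w * extra_coms * max(team)) / weight

def gen_var(team, w, extra_coms):
    # Var_(w, X_k): the same weighting applied to the squared
    # deviations from the generalized average.
    a = gen_avg(team, w, extra_coms)
    weight = len(team) + w * extra_coms
    return (sum((l - a) ** 2 for l in team)
            + w * extra_coms * (max(team) - a) ** 2) / weight

# With extra_coms = 0 both reduce to the plain avg and Var.
print(gen_avg([1700, 1500], w=0.5, extra_coms=1))  # 1620.0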

Current System

Afaik the current system minimizes
sum_(k=1)^(N)(sum_(l=1)^(N)(|avg_(w, X_k)(T_k) - avg_(w, X_l)(T_l)|^d)).
This formula looks a bit more complicated than what I mentioned in my 1st post, because it takes team FFA into account. Similar to w, I don't know what d is here; it only makes a difference for real team FFA anyway. I guess it's 1. It should be 2.
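
A one-function Python sketch of this objective (the absolute value is my reading, needed for odd d; team_avgs would come from the generalized average above):

def current_objective(team_avgs, d=2):
    # Sum over all ordered team pairs of |avg elo difference|^d.
    return sum(abs(a - b) ** d for a in team_avgs for b in team_avgs)

# Example: averages 1650 and 1660 with d = 2 -> 10**2 counted in
# both orders = 200.
print(current_objective([1650, 1660]))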

Combined System with Deviation

quote:
But making total elo (or average per team) equal isn't the way to balance teams.
@Hower

That's why the idea with deviation came up. Looking at the statistics, it seems as if deviation is irrelevant for winning, and winning teams only have higher deviation because higher average elo causes higher deviation. As RUrankYogzototh said, the most important thing about balancing is how well the outcomes of battles can be predicted using statistics. The thing is that the distribution of elo among players depends on the system it is balanced with. Suppose deviation in skill (e.g. in the probability system mentioned later) gave a higher losing/winning probability, but the balance system didn't account for it; then the elo distribution among all players could shift, perhaps so that high- and low-skill players lose/win elo while mid-skill players win/lose elo, until the effect of elo deviation (as distributed by this balance system) on the outcome is compensated. Whether this is true can be measured from the provided statistics, but you have to know exactly what you're doing when trying this. Anyway, equal deviation can be aimed for not only for battle-outcome chances, but also for a symmetric playstyle.

RUrankYogzototh I also thought about a system that checks for average elo difference at 1st priority and for standard deviation at 2nd priority. Even though this can be calculated by algorithms of about the same complexity, generally better balances can be achieved when minimizing a single value that combines all requirements. In addition to the difference between different kinds of deviations, there is a difference between minimizing the maximal team deviation, minimizing the average team deviation, and minimizing the difference between team deviations, as RUrankYogzototh already mentioned. Necessarily there is a certain deviation; it should only be distributed equally among the teams (-> difference between deviations). It also matters which average you choose (average of squared values, ...). The system I mentioned before would minimize (generalized for team FFA)

sum_(k=1)^(N)(sum_(l=1)^(N)((avg_(w, X_k)(T_k)-avg_(w, X_l)(T_l))² + c(sqrt(Var_(w, X_k)(T_k)) - sqrt(Var_(w, X_l)(T_l)))²)).

Here d=2. c determines how highly you rate team elo-deviation equality compared to team average-elo equality, and it should not be chosen too high.
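
The same objective as a Python sketch (c = 0.5 is only a placeholder; team_devs are the square roots of the generalized variances defined above):

def combined_objective(team_avgs, team_devs, c=0.5):
    # Squared avg-elo differences plus c-weighted squared
    # differences of standard deviations, over all ordered pairs.
    return sum((a1 - a2) ** 2 + c * (s1 - s2) ** 2
               for a1, s1 in zip(team_avgs, team_devs)
               for a2, s2 in zip(team_avgs, team_devs))

# Equal averages but unequal spread still scores nonzero:
print(combined_objective([1650, 1650], [120, 80]))  # 1600.0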

Clan Balance

quote:
And what about clan balance?
BErankFlipstip

I am against clan balance. It only worsens balance. If you want it, each of the systems that maximize or minimize a certain value allows for it by leaving out the corresponding permutations in the algorithm.

Probability System

GBrankKyubey, the probability relation is a very good point! Indeed, guessing ~60% right is significant. I'm not entirely sure I'm interpreting your algorithm correctly, because you didn't write it in mathematical form, but there seem to be some "strange things" in your calculation: you only paired every two nearby elos of the teams arbitrarily, whereas in reality everyone fights everyone, or at least the expectation value must account for that; other pairings give other results in most cases. Furthermore, you added the probabilities and then subtracted 50% * the number of pairs. This causes negative probabilities, or probabilities > 1, for very unbalanced teams; you have to divide by the number of pairs instead. GBrankKyubey's algorithm is only so efficient because it doesn't account for all pairs and leaves out certain permutations. The following does account for that and is thus more complex than the current system, but I hope it's not too complicated:

The probability that a player with (1v1) elo A wins a 1v1 vs a player with (1v1) elo B is (the general equation; GBrankKyubey added a link to a site that calculates it for certain values)
P(A->B) = 1/(1 + 10^((B-A)/400)).
With b := exp(ln(10)/400) = 10^(1/400) we can write the probability that team number k (with elo multiset T_k) wins over team number l (with elo multiset l) as
P(T_k->T_l) = 1/(#T_k * #T_l) * sum_(m in T_k)(sum_(q in T_l)(1/(1 + b^(q-m)))).
Accounting for 2nd coms (generally X_k extra coms), the generalized probability is
P_(w, X_k, X_l)(T_k->T_l) = 1/((#T_k + w*X_k)*(#T_l + w*X_l)) * sum_(m in T_k)(sum_(q in T_l)((1 + w*X_k*delta_(m, max(T_k)))(1 + w*X_l*delta_(q, max(T_l)))(1/(1 + b^(q-m))))), where delta_(i, j) is the Kronecker delta (https://en.wikipedia.org/wiki/Kronecker_delta)

This is not necessarily the real probability, but it's OK for balancing, as generally in the elo system. It would be interesting to know how well this equation predicts battle outcomes. But note that test calculations with the current elo distribution have limited significance, because balancing with this system could change the elo distribution among players.
The term that is to be minimized is then (with single consideration of every pair of teams, generalized for team FFA)
sum_(k=1)^(N)(sum_(l=k+1)^(N)((P_(w, X_k, X_l)(T_k->T_l)-1/2)²))
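
A Python sketch of the unweighted version (the 2nd-com weights are omitted for brevity; the function names are mine):

def p_win_1v1(a, b):
    # Elo win probability of a player with elo a over elo b.
    return 1 / (1 + 10 ** ((b - a) / 400))

def p_team_win(team_k, team_l):
    # Everyone "fights" everyone on the other team; the team win
    # probability is the average over all cross pairs.
    total = sum(p_win_1v1(m, q) for m in team_k for q in team_l)
    return total / (len(team_k) * len(team_l))

def probability_objective(teams):
    # Sum over each unordered team pair of (P - 1/2)^2; minimizing
    # this pushes every matchup toward a 50/50 prediction.
    n = len(teams)
    return sum((p_team_win(teams[k], teams[l]) - 0.5) ** 2
               for k in range(n) for l in range(k + 1, n))

# Example: a 100-elo 1v1 gap -> ~0.64 win probability.
print(p_win_1v1(1700, 1600))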

Algorithms

Finding an efficient algorithm for the calculation is another subject. First we have to know what balance we prefer. However, most parts of the existing algorithm can be used for all the systems I mentioned here.
+3 / -0
Thanks DErankBrackman for getting some things spelt out in detail that were, at times, formulated only very vaguely.
I hate to be that guy, but you missed a T_ in "(with elo multiset l)"... And it was probably not necessary to use w and X_k as parameters each time for the avg and Var functions; it's clear what you mean, and dropping them would make things more readable. But I won't complain about people being rigorous in this thread. :D

quote:
I guess it's 1. It should be 2.

I vaguely recall that the current minimization function has the 1 because it makes it easier (or rather, possible) to swap players between teams without having to recalculate the entire function every single time. Which, given the number of summations occurring in your draft, makes a world of difference...
+1 / -0
10 years ago
I also noticed that I missed the T_. I just didn't correct it, because l is the index of the multiset, and as this was only text instead of a formula, it was clearly understandable. But I'm proud of you for noticing it ;).
+0 / -0