!predict is wrong! - New Prediction System for Teams

34 posts, 2649 views

Page of 2 (34 records)

Brackman 9 years ago (edited 9 years ago) What is wrong? Of course I know that elo has to be recalculated chronologically for each system, and I have planned to do so since Anarchid posted his data. I only asked some questions about it and looked for easier ways to do it. I still don't know which formulas ZK currently uses for the weighted average and the weighted elo change, but let's see. +0 / -0
Licho 9 years ago (edited 9 years ago) It's in SpringBattle.cs. Each player has a weight representing the certainty of our knowledge of their elo. The weight has a max value and grows fast when the player plays against higher-weight people. Low weight reduces the effective elo used in balancing. You should use the entire dataset I released (Anarchid's is incomplete), because that's what ZK does/did. +0 / -0
Brackman 9 years ago I have written a C++ program to test the elo progress and predictions of the current system and the sqrt sm system, both without weighting. Weighting and other systems can still be added, but I haven't found this SpringBattle.cs yet. The actual formulas for weighting would also be more helpful than hundreds of lines of code. Now the problem is that it needed about 30 to 45min (20min) for 300 (100) games with my self-written JSON parser on my wooden PC, and there are about 300000 games. I assume that the costly part is not the calculation itself, but parsing JSON in C++ (and filtering out FFAs). Anyway, the current system has an average trans log score of 0.044840 (0.034239) and sqrt sm 0.045531 (0.035991) for 300 (100) games. Maybe adding weighting improves all systems by about the same amount. Maybe all systems' prediction qualities will also be approximately monotonic and convergent in the number of games. Would a test on 2000 games with weighted elo progress be enough? +0 / -0
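The thread does not define "trans log score". As a rough illustration only, the sketch below assumes it means a log score shifted so that always predicting 50% scores 0 and a perfect prediction scores 1, i.e. 1 + log2(p) for the probability p that the system gave to the team that actually won. The function names are hypothetical and this may not be the exact formula used for the numbers above.

```python
import math

def shifted_log_score(p_winner):
    """Score of one prediction; p_winner is the probability given to the actual winner.

    Assumed form: 1 + log2(p). A 50% guess scores 0, a certain correct call scores 1,
    and confident wrong calls score strongly negative.
    """
    return 1 + math.log2(p_winner)

def average_score(p_winners):
    """Mean score over a list of games (one winner probability per game)."""
    return sum(shifted_log_score(p) for p in p_winners) / len(p_winners)

if __name__ == "__main__":
    # Made-up probabilities, purely to show the call.
    print(average_score([0.6, 0.55, 0.4, 0.7]))
```

Under this reading, averages only slightly above 0 would mean the predictions are only slightly better than a coin flip.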
TheEloIsALie 9 years ago SpringBattle.cs +0 / -0
Brackman 9 years ago Thank you TheEloIsALie. Fun fact: Springie2 was the best player in the first 100 games. Seriously: There must be something wrong with Licho's data. Can you remove specs and FFAs? And please either provide a file that is much easier to parse or don't change the file's formatting, so that the same program can be used. +0 / -0
TheEloIsALie 9 years ago Blind guess: Even Springie2 was affected by "noob elo penalty", and as games were "spectated" by it that penalty reduced like it does for other players. +0 / -0
Brackman 9 years ago (edited 9 years ago) Interesting guess. It probably would have been true if the noob penalty had already been considered, but that only comes with weighting. Account.cs has to be considered, too. The formulas for weighting are more complex than I thought. I'm not even sure yet whether they leave the total elo average unchanged. Does anybody know the values of EloWeightMax and EloWeightMalusFactor? They are global constants. I know enough relations that one of the two values should be enough. [Spoiler] SpringBattle.cs's line 143 can be simplified to "var eLose = 1 - eWin;" +0 / -0
Shaman 9 years ago Springiee2 OP. +0 / -0
Licho 9 years ago Brackman JSON is about the simplest format you can get for sending structured data.. there should be tons of tools to parse it, in most languages it's a "one-liner".. Maybe use some higher-level language? C#, Python or even JavaScript? In C# it takes about 1s to load all the data into memory. You can filter out FFA data by ignoring entries with more than 2 teams. But the standard calculation DOES take FFA into account! So I see no point in removing it.. I think it's acceptable to compare 2000 games if you do a rerun of both weighted elo and your method. Another option is for you to simply write/modify BattleBalanceData.cs, which is a class intended for testing alternative balancing on live data. I could then run it directly from the DB and tell you the results. +0 / -0
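A minimal sketch of the "ignore entries with more than 2 teams" filter, in Python. The layout of the released JSON is not shown in this thread, so the schema here (a top-level list of battles, each with a "teams" list holding one entry per team) and the function names are assumptions for illustration only.

```python
import json

def is_ffa(battle):
    """True if the battle has more than 2 teams (hypothetical "teams" field)."""
    return len(battle["teams"]) > 2

def load_team_games(json_text):
    """Parse the dump and keep only battles with at most 2 teams."""
    battles = json.loads(json_text)
    return [b for b in battles if not is_ffa(b)]

if __name__ == "__main__":
    # Tiny made-up sample in the assumed layout: one 2-team game, one 3-team FFA.
    sample = """[
        {"id": 1, "teams": [["a", "b"], ["c", "d"]]},
        {"id": 2, "teams": [["a"], ["b"], ["c"]]}
    ]"""
    print([b["id"] for b in load_team_games(sample)])
```

Since the standard calculation does take FFA into account, a filter like this would only make sense for scoring team-game predictions, not for reproducing the elo history itself.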
Svatopluk 9 years ago As an almost purely team player, I would like to point out a few things about (im)balance:
- as mentioned by Skasi, odd team balance completely changed! If you apply the same rules, your results will be wrong for the last 2 months, see http://zero-k.info/Forum/Thread/19720
- currently FFA elo = team elo, which can break your super-smart balancing algorithm. However, FFA players often avoid team games.
- you must use only the elos players had at that time, as mentioned by Licho, because some players actually improved over time (yes, even I was a 1100 elo lobster that suicided)
- for even teams, current balancing is VERY GOOD!
- current balancing doesn't work for small team + large map + high elo player, because e.g. in a 4v4 on a big map I get 3 com morphers that do not expand at all = gg.
So my only suggestions for improvement are:
- separate FFA elo
- for odd team balance, put back commander income as private
Also a silly idea: if CAI (or another AI) had a team elo, it could easily help with odd team balance. I wouldn't mind playing with robots on my team, meatbags must die! +0 / -0
Skasi 9 years ago (edited 9 years ago) bump I want dis new magic nao! Brackman give when? Ty < 3. +0 / -0
Brackman 9 years ago Firstly, I found out that my more advanced probability systems are equivalent to my simple probability average system (not to the current one, though). [Spoiler] This system is still of a different type than the systems mentioned in this thread. Here you can see the simple version. I found the equivalence while thinking about FFA calculation, because the formulas can be brought into the same form (with higher order tensors, though).
Secondly, I have found out that applying my distinctivity modifier D(n) (which depends on the average number n of players in a team) to a probability p calculated by the elo system is, surprisingly, equivalent to multiplying the elo difference by D(n). Proof: [Spoiler] My distinctivity modification formula was p^D/(p^D+(1-p)^D)=1/(1+(1/p-1)^D)=f((f^-1(p))^D), where f(x)=1/(1+x) and f^-1 is its inverse. You can already see that the composition of f and exp is the Fermi function, which is the probability function of the elo system: p=f(exp(eloDif/b)), b:=400/ln(10). So if we apply f^-1 to p, we get exp(eloDif/b). If we apply ^D, we get exp(D*eloDif/b). If we apply f again, we get the elo probability formula with D times eloDif. Therefore we can represent any system so far (except the above-mentioned probability system) with only an elo difference multiplier D(n). Current: D(n)=1, smes: D(n)=n(0.5+0.5ⁿ), sqrt sm: D(n)=sqrt(n) (the one I prefer due to my test results).
Thirdly, I have found a factor "Math.Sqrt(sumCount / 2.0)" in SpringBattle.cs. This is exactly the factor sqrt(n), because n=sumCount/(number of teams). (The code only considers 2 teams.) This irritated me. I expected this factor to be 1, so that the overall elo change due to a game is constant and only distributed among the players (proportional to EloInvWeight). Instead, it means that games with more players are considered more meaningful for elo change. Of course the average elo change of a player will still be lower in a bigger game, proportional to 1/sqrt(n), but I would expect it to be proportional to 1/n. The only argument for this that comes to my mind is that games with more players tend to be longer and thus would be more meaningful, but taking things other than win or loss into account doesn't sound reasonable to me. This factor is not directly an error, but I would either use a constant factor or a dependency on (winnerInvWeight+loserInvWeight)/sumCount to make the elo change higher for lower weight. So I would move the factor sqrt(n) from the elo change formula in lines 146, 147 to the elo probability formula in lines 142, 143 as an elo difference multiplier, because games with more players are not more meaningful, but more distinct. +0 / -0
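The equivalence claimed in the proof above can be checked numerically. The sketch below is only an illustration of the formulas as written in the post, not code from SpringBattle.cs; the function names are made up here. Note that with p=f(exp(eloDif/b)) as written, a positive eloDif lowers p, so eloDif is read here as opponent elo minus own elo (an assumption about the sign convention).

```python
import math

B = 400 / math.log(10)  # b := 400/ln(10) from the post

def f(x):
    """f(x) = 1/(1+x); its inverse is f^-1(p) = 1/p - 1."""
    return 1 / (1 + x)

def elo_probability(elo_dif, d=1.0):
    """p = f(exp(D*eloDif/b)), with the elo difference multiplied by D."""
    return f(math.exp(d * elo_dif / B))

def distinctivity(p, d):
    """Distinctivity modification p^D / (p^D + (1-p)^D) applied to a probability."""
    return p**d / (p**d + (1 - p)**d)

def both_ways(elo_dif, n):
    """Apply D(n) = sqrt(n) (the "sqrt sm" choice) to p and to eloDif, return both results."""
    d = math.sqrt(n)
    on_probability = distinctivity(elo_probability(elo_dif), d)
    on_elo_dif = elo_probability(elo_dif, d)
    return on_probability, on_elo_dif

if __name__ == "__main__":
    for elo_dif in (-300, -50, 0, 120, 400):
        for n in (1, 2, 4, 8):
            a, b = both_ways(elo_dif, n)
            print(elo_dif, n, a, b)  # the two columns should agree up to rounding
```

The same helper covers the other systems listed in the post by swapping the multiplier, for example D(n)=1 for the current system.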
GoogleFrog 9 years ago If you are talking about making changes to lines you should really get git and make a pull request. +1 / -0
Brackman 9 years ago Maybe I should do that.. I'm also planning to make slightly bigger changes to SpringBattle.cs to fix some bugs (XP for 1v1 and FFA calculation). But I will probably not change the sqrt(n) factors without further consent, tests or discussion. +0 / -0
