
Do 1v1 games improve multiplayer balance

132 posts, 2361 views
2 years ago
quote:
GBrankJadem here have some upvotes to validate your existence.

Cheers mush, only need a few more and it should be fully validated ;)
+3 / -1
quote:
In an attempt to correct this and more accurately predict required fertilizer usage, a brilliant scientist decided to use this old dataset to determine that separate tracking of carrots, potatoes and wheat does not produce more accurate results.

quote:
- the two (actually more) disparate algorithms used to balance and predict games back when they were played actually somehow influence the new algorithm despite the numbers not being used

It seems that you are indeed very convinced of precisely this, despite calling it a strawman, and that your mathematical proof has the form of an unrelated fable.

This is sad but not unexpected.
+1 / -1
2 years ago
EErankAdminAnarchid Perhaps I need to communicate this in simpler terms.

Nobody except you is talking about two separate algorithms. There's only one set of data and one balance algorithm.
+1 / -0
Trying to figure out why it seems so difficult for you to get how this works - and clicking on DErankBrackman's link from above - I found this:

quote:
I'm not talking about the specifics of how it is implemented, but rather how using old data doesn't provide a valid point for the effectiveness of new implementations, since the data is not related to them.


Would you perhaps claim that differently skilled players in a series of entirely randomly balanced games cannot be evaluated by any ranking system, since ranking systems all attempt to be very non-random?
+2 / -0
2 years ago
Your analogy really doesn't apply here, TinySpider.

My understanding of the process that goes on here is:

You have a large dataset of games with win-loss data. You compare two systems:

  • One in which the 1v1 games are all set as no-elo
  • Another which is the contemporary system

This generates two distinct sets of ratings. Using the ratings from each set, you compare the outcome of each match with the predicted win % for each team.
game1-game2-game3 etc
The results are:
0-1-0-1-1-1-1-1-0 etc
Set A Predicts:
0.5-0.2-0.5-0.4-0.2-0.7-0.6 etc
Set B Predicts:
0.1-0.7-0.5-0.3-0.2-0.6-0.6 etc


If I understand your point correctly you think that there is a problem with the set of games that is being used, as it is determined by the rankings of one of the data sets. Can you explain why that is exactly? I don't think your fertilizer analogy applies.

Without going into the mathematics, my intuition is that a set of games with random balance would provide maximum information, but I don't see any reason why the information that we do get is inherently flawed.
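
For concreteness, here is a minimal sketch of how such a comparison could be scored, using the log score that comes up later in the thread (the numbers are toy values echoing the example above, not real data):

    # Minimal sketch, not the actual ZK evaluation code.
    import math

    results = [0, 1, 0, 1, 1, 1, 1]                # 1 = first team won (toy values)
    set_a   = [0.5, 0.2, 0.5, 0.4, 0.2, 0.7, 0.6]  # Set A's predicted win chances
    set_b   = [0.1, 0.7, 0.5, 0.3, 0.2, 0.6, 0.6]  # Set B's predicted win chances

    def log_score(outcomes, predictions):
        # Mean log-likelihood of what actually happened; higher is better.
        return sum(math.log(p if won else 1.0 - p)
                   for won, p in zip(outcomes, predictions)) / len(outcomes)

    print("Set A:", log_score(results, set_a))  # whichever scores higher predicted better
    print("Set B:", log_score(results, set_b))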
+3 / -0
quote:
One in which the 1v1 games are all set as no-elo

Where can I see these team games that were played according to this new rule?

quote:
Would you perhaps claim that differently skilled players in a series of entirely randomly balanced games cannot be evaluated by any ranking system, since ranking systems all attempt to be very non-random?

I made no such claim, as that makes no sense to claim. Where do you have a dataset featuring these randomly balanced games?
+0 / -0
Let's use the farming analogy.

If someone wins a team game, give them some amount of potato, depending on their amount of potatoes and carrots. If they lose, cut some portion of the potatoes away. We balance teams by both carrots and potatoes.

If someone wins a 1v1, give them some amount of carrots, depending on their amount of carrots and potatoes. If they lose a 1v1, cut some portion of their carrots away.

Now we change it to:

If someone wins a team game, give them some amount of potato, depending only on their amount of potatoes. If they lose, cut some portion of the potatoes away. We still balance teams by both carrots and potatoes.

If someone wins a 1v1, give them some amount of carrots, depending only on their amount of carrots. If they lose a 1v1, cut some portion of their carrots away.

Do you see any way to compare the two systems, given that we can't change how the games were balanced?
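
To make the analogy concrete, here is a hedged sketch of the before/after change, using Elo-style updates purely as a stand-in (ZK's actual system is WHR, and these function names are made up):

    # Illustrative only: Elo-style updates, not ZK's actual WHR code.
    K = 32  # arbitrary update step size

    def expected(r_self, r_opp):
        # Standard logistic (Elo) expected score.
        return 1.0 / (1.0 + 10 ** ((r_opp - r_self) / 400.0))

    # Old system: the team ("potato") update depends on a blend of both
    # ratings, so 1v1 ("carrot") results leak into team ratings.
    def team_update_old(potato, carrot, opp_potato, opp_carrot, won):
        blend = (potato + carrot) / 2
        opp_blend = (opp_potato + opp_carrot) / 2
        return potato + K * (won - expected(blend, opp_blend))

    # New system: the team rating is updated from team ratings alone.
    # Balancing, however, still uses both numbers in both systems.
    def team_update_new(potato, opp_potato, won):
        return potato + K * (won - expected(potato, opp_potato))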
+0 / -0
quote:
I made no such claim, as that makes no sense to claim. Where do you have a dataset featuring these randomly balanced games?

Then your claims are mutually exclusive.

- You say that it's nonsensical to claim that a randomly-balanced dataset cannot be used to evaluate players.

- You say that it's invalid to evaluate whether a subset of a dataset balanced using (Elo, WHR, and all games) increases or decreases the predictivity of TrueSkill, because these games were balanced using Elo and WHR over all games and not using TrueSkill over team games only.

You can't have both.
+0 / -1
quote:
Do you see any way to compare the two systems, given that we can't change how the games were balanced?

Those are both bad systems as they mix potatoes with carrots.

quote:
You can't have both.

You can, as you're using a dataset that is biased to a certain conclusion.
+0 / -0
quote:
You can, as you're using a dataset that is biased to a certain conclusion.

Wouldn't the random dataset, by this logic, be biased to reach a certain conclusion - that no skill matters?
+0 / -1
2 years ago
quote:
Wouldn't the random dataset, by this logic, be biased to claim that no skill matters?

At least put a modicum of effort into your trolling. The question was never whether skill matters, and you're derailing the topic with your snide remarks. Skill, as pertaining to this discussion, is rating: rating that is determined from said data and used to create future data.

The only available set of data (there is allegedly a small sample of random games) is one that was created with multiple combined game modes, and is therefore biased towards certain conclusions.
+1 / -0
You simply have no idea what you are talking about. Or perhaps a bunch of very wrong, very stubborn ideas.

Very well, this was a waste of letters and a very good reminder to never engage you seriously ever.
+1 / -2

2 years ago
unknownrankTinySpider what is it you actually want? Of course, the analysis here cannot disprove the statement "the true multiplayer ranking of players can only be revealed with matchups that aren't in the dataset". I don't imagine anyone would dispute that. But it doesn't directly follow that we should therefore ignore parts of the dataset to get a more balanced matchup.

If you're saying "I can't prove the hypothesis without the data from these missing matchups", then you're right, but I think you'd need to provide a specific plausible mechanism before the balancing algorithm is changed to provide you the data needed.
+1 / -0
For me, the ideal outcome would be a complete reset of all data and a separation of individual game modes into their own non-intersecting ladders.

I don't see how such a simple, widely adopted idea is controversial in this game.
+0 / -0
Here's a mathematical line of reasoning that supports unknownrankTinySpider: A system consists of a calculation system and a choice of considered games. The system that was originally used at the time produced team games with chances close to 50%. If we analyze the data with a different system, chances will deviate more from 50%. If the calculation system is inherently too close or too far away from 50% according to the used scoring rule (which may be because the scoring rule inherently overly punishes being too close or too far away from 50%), we get a scoring distortion depending on whether the system is further away from the original system.

I'm not sure if this effect really matters as long as we use a good scoring rule and a system that is good according to this scoring rule. The scoring rule is good. Now we have to get the calculation system right. To do that, we should remove the 0.5 fudge (only second half of the post) and see the results, then change denom and see the results. If they are still negative, play around with D mod. By fitting the D mod to maximize the log score, it should be possible to eliminate the effect.
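
To illustrate only that last step (the "D mod" itself is ZK-specific; this is a generic sketch of fitting a deviation-from-50% scale to maximize the log score, on made-up data):

    # Hypothetical sketch: scale predictions' distance from 50% in log-odds
    # space and keep the scale that maximizes the log score.
    import math

    def sharpen(p, d):
        # d > 1 pushes p away from 0.5, d < 1 pulls it towards 0.5.
        logit = math.log(p / (1.0 - p))
        return 1.0 / (1.0 + math.exp(-d * logit))

    def log_score(outcomes, preds, d):
        return sum(math.log(sharpen(p, d) if won else 1.0 - sharpen(p, d))
                   for won, p in zip(outcomes, preds)) / len(outcomes)

    # Toy data; in practice this would be the full game history.
    outcomes = [1, 0, 1, 1, 0, 1, 0, 1]
    preds = [0.6, 0.4, 0.7, 0.55, 0.5, 0.8, 0.35, 0.65]

    best_d = max((d / 10 for d in range(1, 31)),
                 key=lambda d: log_score(outcomes, preds, d))
    print("deviation scale that maximizes the log score:", best_d)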
+1 / -0
quote:
The system that was originally used at the time produced team games with chances close to 50%. If we analyze the data with a different system, chances will deviate more from 50%. If the calculation system is inherently too close or too far away from 50% or the used scoring rule inherently overly punishes being too close or too far away from 50%, we get a scoring distortion depending on whether the system is further away from the original system.

This supports that the current dataset has a baked-in tendency for games to be somewhat balanced, and that a bias is in general possible as a thing. Although I must say that "the scoring rule inherently punishes being too far from 50%" is a bias in the scoring rule, while "games in this dataset tend to be 50% odds" seems to be a facet of ground truth.

However, TinySpider's claim is that the bias is towards more data increasing prediction accuracy, and that it is caused by the balancer using 1v1 data for balancing, and I find that while a 50% bias is quite likely, this is still preposterous. Or at least quite extraordinary, with all the proof offered so far being "cuz i think so, also you are trolling if you disagree".

How would one go about designing a balancer with the goal of deliberately introducing such a bias while it is also a purely artifactual phenomenon and not a true correlation?

Hmm. Didn't ZK have separate 1v1 vs everything else ladders at some point before WHR, split by team size? If so, then team games from that epoch were not balanced using 1v1 data.
+1 / -0
2 years ago
quote:
Although I must say that "the scoring rule inherently punishes being too far from 50%" is a bias in the scoring rule
It could be that a scoring rule punishes deviations so much that it violates certain mathematical principles for a good scoring rule. But as long as this is not the case, there is no absolute truth about how much it should punish being far away from 50%. The same is true for a calculation system. A calculation system should fulfil certain principles, but its deviation from 50% can only be judged good or bad according to a scoring rule. I have corrected my post accordingly.
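
For reference, the usual mathematical principle here is that the scoring rule be proper (my summary of the standard definition, not DErankBrackman's wording): with true win chance $q$ and predicted chance $p$, the expected log score

$$\mathbb{E}[S(p)] = q \log p + (1 - q)\log(1 - p)$$

is maximized exactly at $p = q$, so a proper rule cannot be gamed by systematically nudging predictions towards or away from 50%.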

quote:
This supports that the current dataset has a baked-in tendency for games to be somewhat balanced, and that a bias is in general possible as a thing.
quote:
"games in this dataset tend to be 50% odds" seems to be a facet of ground truth.
Yes. We also have to distinguish between actual chances and predicted chances. Certainly, the actual chances as well as chances predicted by any halfway reasonable system will be closer to 50% in the data than in random balance. What I meant is that the data is balanced such that it appears to be well balanced for systems close to the original system while other systems will see the data as a bit less balanced.

quote:
Hmm. Didn't ZK have separate 1v1 vs everything else ladders at some point before WHR, split by team size? If so, then team games from that epoch were not balanced using 1v1 data.
Yes. This would mean that in that epoch, including 1v1 games would move team predictions further away from 50%. The original system is also time-dependent.

As long as the calculation system uses the deviation from 50% (which is controlled by its D mod) that is ideal according to the used scoring rule, all of this should not be a problem.
+0 / -0
2 years ago
quote:
Any team game on record has been balanced using 1v1 games and cannot be separated from 1v1 games,
unknownrankTinySpider: I am wondering: do you think that two consecutive team games with the same team compositions on the same map will always have the same result?
  • Yes, would imply that balance is the thing with the most influence on game outcome, and that any bias inserted by balancing is essential
  • No, would imply that balance is not the only thing having an influence, so in the end any rating system/balancer will be an approximation and we are left with comparing "bad/incomplete systems"
I think it is good to check and try to improve the system incrementally, but I do not expect any system as simple as those described here (use WHR, use only those games in the rating, etc.) to describe well the complex variables that influence a game. Does a player drop next to a purple or on the other side of the map? Did someone have 3 beers before playing? Is someone trying a new strategy? Maybe purple players play more consistently, but there is a large mass of people playing this for fun only...
+0 / -0
Let me have a go. I think unknownrankTinySpider has a pretty reasonable question that hasn't actually been answered. There are a lot of posts (especially from EErankAdminAnarchid) that seem to assume that unknownrankTinySpider knows the answer and is stealthily arguing against it. The farmer analogy isn't helping either.

The question is simply:
How can some system claim to improve balance when no actual games have been balanced with it?
As far as I can tell nobody has actually answered this question. There are plenty of posts that assume we all know an answer though, and then attempt to justify it. Even if everyone knows the answer, stating it clearly can't hurt.

The answer everyone is defending here hinges on two claims:
  • 1. If you have a system that outputs an accurate win rate when you ask it about two teams of players, then you have all you need to make a good system for multiplayer balance.
  • 2. To evaluate how good a system is at generating accurate win rates, all you need is to run your system on the past history of ZK games, then grade how well it did with a particular scoring system.

These two claims are how a thread called "Do 1v1 games improve multiplayer balance" can actually be about comparing two numbers generated by running an algorithm on a list of past games. All improvement is assumed to take the form of the score number increasing. There are reasonable arguments and counterarguments for each claim though.

The justification for claim 1 is that, given you have such a system, balancing any new game is just a matter of asking the system about every potential team, then picking the teams that get you closest to 50% win rate. This is pretty reasonable, although note the subtle shift from "good ZK game" to "ZK game with 50% win rate for each team". The system is blind to every other aspect of what makes a good ZK game, so it would be very surprising if it actually generated the best ZK games. The threads about even skill distribution within teams seem to show a practical problem.
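
In sketch form, with predict_win as a hypothetical stand-in for whatever interface the rating system exposes:

    # Sketch of claim 1's balancing step: try every split of the lobby into
    # two equal halves and keep the one predicted closest to 50%.
    from itertools import combinations

    def balance(players, predict_win):
        # predict_win(team_a, team_b) -> probability that team_a wins
        # (a hypothetical interface, not an actual ZK function).
        half = len(players) // 2
        best_split, best_gap = None, 1.0
        for team_a in combinations(players, half):
            team_b = [p for p in players if p not in team_a]
            gap = abs(predict_win(list(team_a), team_b) - 0.5)
            if gap < best_gap:
                best_split, best_gap = (list(team_a), team_b), gap
        return best_split

Note that this procedure is blind to everything except the predicted win rate, which is exactly the limitation pointed out above.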

The justification for claim 2 is based on somewhat complicated maths and most likely needs a bunch of caveats to even be true. It is probably approximately right as long as nobody tries to do anything too weird, such as supplying a dataset where Team 1 always wins. But who knows. What if the existing algorithm likes to generate a particular type of game with an actual win rate of 50%, a proposed algorithm would generate different games with actual win rates of 50%, but the proposed algorithm is confused by the games produced by the existing one? I haven't seen a solid reason that this couldn't happen.
+3 / -0
quote:
that seem to assume that unknownrankTinySpider knows the answer and is stealthily arguing against it

quote:
How can some system claim to improve balance when no actual games have been balanced with it?

I'm not that good with words, so thank you for expressing this. I do not actually know the answer; I am just questioning how valid the conclusions in the OP are, given the single dataset available.
+0 / -0