1v1 Matchup data for 9th to 17th of May

59 posts
I have processed around 2500 games from the 9th to the 17th to generate some matchup stats. For a game to count, the participants had to be within 500 WHR of each other, which is pretty lenient. The 'None' entry for factory means that the game concluded before the player plopped their factory.

The raw stats and processing script can be found here: https://github.com/ZeroK-RTS/SpringRTS-Tools/commit/885f6f645e97411957836e8463e8ac6bd7c78822

All Games

Battles: 2421
Factory   Winrate   (excluding mirror)   Pick Count   Mirror Matches
Cloaky    50.22%    (50.30%)             1117         142
Shield    47.59%    (47.24%)             456          29
Rover     50.07%    (50.09%)             683          48
Hover     46.80%    (46.40%)             250          14
Spider    50.97%    (51.20%)             514          48
Jump      50.60%    (50.63%)             419          12
Tank      57.29%    (59.07%)             803          79
Amph      58.80%    (60.11%)             216          14
Plane     24.62%    (23.81%)             65           1
Gunship   42.50%    (41.74%)             240          11
Ship      0.00%     (0.00%)              1            0
None      6.41%     (4.05%)              78           2
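
For anyone wanting to check the 'excluding mirror' column: it can be reproduced from the other three columns if you assume that Pick Count counts both players of a mirror match, and that each mirror match adds exactly one win and one loss for that factory. A minimal sketch under those assumptions follows. The win counts are not listed in the post, so they are back-computed from the rounded winrate, and the function name is made up for illustration.

```python
def winrate_excluding_mirror(winrate_pct, picks, mirrors):
    """Winrate in percent after dropping mirror matches.

    Assumes `picks` counts both players of a mirror match, and that each
    mirror match adds exactly one win and one loss for the factory.
    """
    wins = round(winrate_pct / 100 * picks)   # back-computed, not from the raw data
    non_mirror_wins = wins - mirrors          # one win per mirror is removed
    non_mirror_picks = picks - 2 * mirrors    # two picks per mirror are removed
    return 100 * non_mirror_wins / non_mirror_picks


# Rows from the "All Games" table: (winrate %, pick count, mirror matches)
cloaky = winrate_excluding_mirror(50.22, 1117, 142)
tank = winrate_excluding_mirror(57.29, 803, 79)
print(f"Cloaky {cloaky:.2f}%  Tank {tank:.2f}%")
```

Both rows come out matching the bracketed figures in the table, which suggests the assumptions are at least consistent with how the script counts.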


1500 Minimum WHR

Battles: 1424
Factory   Winrate   (excluding mirror)   Pick Count   Mirror Matches
Cloaky    51.57%    (52.05%)             574          67
Shield    46.77%    (46.19%)             263          20
Rover     47.82%    (47.29%)             435          42
Hover     45.64%    (44.88%)             149          11
Spider    49.51%    (49.38%)             305          32
Jump      49.82%    (49.81%)             275          9
Tank      52.72%    (53.52%)             552          63
Amph      59.77%    (61.33%)             174          12
Plane     40.00%    (38.46%)             15           1
Gunship   40.40%    (39.56%)             99           4
None      0.00%     (0.00%)              7            0


2000 Minimum WHR

Battles: 389
Factory   Winrate   (excluding mirror)   Pick Count   Mirror Matches
Cloaky    51.85%    (52.50%)             162          21
Shield    55.56%    (56.25%)             54           3
Rover     48.48%    (48.00%)             132          16
Hover     51.28%    (51.72%)             39           5
Spider    48.39%    (48.00%)             62           6
Jump      50.62%    (50.65%)             81           2
Tank      47.70%    (46.88%)             174          23
Amph      52.08%    (52.63%)             48           5
Plane     33.33%    (33.33%)             3            0
Gunship   50.00%    (50.00%)             22           1
None      0.00%     (0.00%)              1            0


Initial Thoughts

The cries of 'Tank OP' are backed up by a 57% general winrate. Surprisingly, given the source of some complaints, it has a 47% winrate at high levels, the lowest of any land factory there. Perhaps Blitz makes the factory relatively easy at low levels but the cost of its raiders may make it punishable at high levels. Maybe the top level players have idiosyncrasies which skew the stats. Perhaps I messed up and inverted everything so someone else should take a look.

Cloaky is still the most popular factory and seems unaffected by the Glaive cost change, within uncertainty. Amphbots are still stealth-OP. This may be due to the map pool. I am quite pleased that the winrates are all quite close to 50% at the top end and are within 10% across the board for land factories. I view Gunships and Planes as optionally viable in the sense that they have important roles in teamgames and as switch factories that I do not want to screw by making them more 1v1-ploppable than is natural.

Messing with data

Here is what happens with the WHR gap bounded at 100. In general you can get many different results from modifying the bound. A better analysis would weight the result by the probability of victory based on the players' ratings.
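
One way that weighting could be sketched: compare each factory's actual wins with the wins its players were expected to get from rating alone. The sketch below assumes the standard Elo logistic curve on a 400-point scale as a stand-in for the WHR prediction (the real predict formula may differ), and the game records are made-up examples, not rows from the data set.

```python
from collections import defaultdict


def expected_score(rating, opponent_rating):
    """Standard Elo expectation; an assumed stand-in for the WHR prediction."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def wins_over_expectation(games):
    """Sum (actual - expected) wins per factory.

    `games` holds (winner_factory, winner_rating, loser_factory, loser_rating).
    A positive total means the factory won more than ratings alone predict.
    """
    surplus = defaultdict(float)
    for win_fac, win_rating, lose_fac, lose_rating in games:
        p_winner = expected_score(win_rating, lose_rating)
        surplus[win_fac] += 1.0 - p_winner      # won; was expected to win p_winner
        surplus[lose_fac] -= 1.0 - p_winner     # lost; was expected to win 1 - p_winner
    return dict(surplus)


# Hypothetical games for illustration only.
games = [
    ("Tank", 2000, "Cloaky", 1600),   # heavy favourite wins: little credit
    ("Cloaky", 1600, "Tank", 1600),   # even game: half a win of credit
]
result = wins_over_expectation(games)
```

A side effect of this form is that mirror matches contribute exactly zero to a factory's total, so they need no special handling.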
+14 / -0
12 months ago
Games with a 500 elo difference are completely meaningless. I find that games get very one-sided at around a 200 elo difference.

It's interesting to see shields have the highest win rate in 2k+. Is thuglaw really that good?
+1 / -0

12 months ago
I suspect it has more to do with rogue than thuglaw.
+1 / -0
12 months ago
Oh hey I'm the sole person who lost to a plane plop at >2000 WHR
+0 / -0

12 months ago
Great, I love it. Much appreciated.

I am concerned that 500 WHR seems a little too lenient. If a difference of 150 results in an 80/20 prediction between godde and randy in this game:


Then I think we should expect very large effects of factory preference at different elos. Stuff like, if Godde happens to be heavily experimenting with something, that something is going to have inflated win-rates.

There is also a pretty big confound in the form of application of factory to maps. If cloaky is seen as the generalist fac that is a comfort pick for many, it's going to be used in areas it might not be completely optimal for. For example, some portion of these games will be cloak competing with spiders on mountainous maps, or competing with LV, tanks, and hovers on the flats. I think this may partially be the case for tanks being used on pretty hilly maps such as trojan. I'm not really sure what the map pool looks like so I may be overstating this case.

I'm wondering if there's any way, rather than bounding to a flat number, to adjust weighting based on how much MMR was transferred? If WHR is already making win predictions, could we tap into that to make use of the larger sample while controlling for skill discrepancy?
+0 / -0

Can we see this again with a smaller elo difference? I'm a little shocked at these numbers... I'm also aware that at a certain skill gap the higher player will often plop something nonsensical and end up curbstomping the other guy anyway.

Meanwhile the other guy is playing seriously, plopping the fac that has been working for him against other lower players, and being decimated by the higher player.

What's the formula for the !predict in zerok? And how do I run a modified script? Is this just off the console ingame?
+0 / -0

12 months ago
Click the spoiler to see lower WHR differences. You can get any result you like by changing the WHR gap as you effectively get to control which games are included. Does anyone have a good reason for thinking that the effects of not bounding the WHR gap will not just cancel out?

A mathematically justified game weighting function would be nice.
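
As one candidate for such a function (not something anyone here has settled on), each game could be weighted by 4p(1 - p), where p is the predicted win chance of the player who picked the factory. That is the variance of the game's outcome scaled to peak at 1 for an even match, so a 50/50 game counts fully and a 90/10 game counts 0.36. The sketch assumes a prediction p is already available for every game, and the records are invented.

```python
def game_weight(p_win):
    """Weight in [0, 1]: 1.0 for a 50/50 game, falling to 0 for a certain result."""
    return 4.0 * p_win * (1.0 - p_win)


def weighted_winrate(records):
    """Weighted winrate for one factory.

    `records` holds (won, p_win) per pick, where `p_win` is the predicted
    win chance of the player who picked the factory (assumed to be known).
    """
    total = sum(game_weight(p) for _, p in records)
    if total == 0:
        return None  # no informative games
    return sum(game_weight(p) for won, p in records if won) / total


# Hypothetical picks: two even games (one won) and one lopsided win.
records = [(True, 0.5), (False, 0.5), (True, 0.9)]
rate = weighted_winrate(records)
```

This only shrinks the influence of lopsided games; it does not remove the favourite's advantage from the games that remain, so it is a softer version of the hard WHR bound rather than a full correction.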
+0 / -0
I was thinking of weighting it based on the predict formula zerok uses... but I don't know what it is. Do you know?

Like I said in my previous post, my reasoning for taking smaller gaps more seriously is that the players themselves tend to take the games more seriously and pick a factory to win. When there's a gap the higher player plays funky things and still wins. With so little data it's quite possible for the winrate of weaker factories to be skewed simply by the kind of curbstomp that a skill difference produces in this game.
+0 / -0
I think Amph is slowly creeping into high-level meta, I've seen amph mirrors on Adansonia which used to be a hover mirror map.

I think there are two reasons for that:

1) Grizzly is a horrifying ultrageneralist, especially when you can afford several. It roflpwns Dante, the typical 1v1 escalation device. Once you have several, there doesn't seem to be anything quite capable of fighting them efficiently, especially if they are escorted by riots and AA. It is also quite efficient on attack move - nearly zero babysitting required. This all conspires to make it a very efficient "mindless spam unit" for macro maps. Each time I tried to counter them with anything else I noticed that it would have been more economical to just spam grizzlies myself.

(or nuke)

2) Archer is horrible beyond all imagining. It's not that overpowered, I think, but it's very powerful and at points very arbitrary. The removal of its tank mechanic meant that it now constantly operates at 100% of its nominal range - and that is a range that beats Reaver and Ripper, and it outranges and outshoots Venom, its closest comparison as a super-fast riot with massive stopping power but low DPS. It also matches the range of Scallop and Redback while having quite raider-tier speed.

The arbitrariness comes from how its impulse mechanics unglue units from the surface. Often when two Archers fight, one will be rendered immobile while the other retains freedom of movement. Other times, multiple Archers will murder Ravagers in a fraction of a second, while every once in a while they will also completely fail to nudge them at all. Sometimes a single Archer will be able to immobilise a commander or a Grizzly.

Randomness aside, Archer has the promise of very reliably stopping raids if used competently, and it's great for escorting Grizzlies.
+3 / -0

12 months ago
The only times I see shields is on some of these dumb tiny bot maps in the pool where people will rush a handful of thugs and an outlaw into enemy base and win very quickly, it's extremely difficult to stop these small thug balls in the early game.
+1 / -0

12 months ago
Also yes I think grizzly is kind of insane. Strongest, most cost effective late game unit by far

pls dont nerf it tho
+1 / -0

12 months ago
lol I think all those plane drops are mine X )
+1 / -0
Every time I've seen grizz spam I've noticed a fantastical lack of penetrators. I'm not sure you should call those games high level.
You counter big lazors with even bigger lazors, same goes for arty :))
+3 / -0
12 months ago
On the topic of amph.

I've tried to make them work; they play sorta differently from every other factory. They have the archer, which is very strong (brutal vs bot facs). Amph can eco relatively safely in the early game with archer protection. At the same time they have a tough time raiding against proper defense placement. (Archer range can make llt placement different than against other facs, since you can sometimes hit the mex without being hit by the llt if it's placed badly.)

That said, they are weak to tank and rover facs in particular and early skirmishers in general. Blitzes match range with archers and kill them easily. Blitzes seem to outdo grizzlies for cost fairly easily too if there are 2 or fewer griz.

As for grizzlies, I think they are very powerful on high income maps; they are indeed the best A-move high cost unit. With micro, though, snipers, penetrators, cyclops and even ultimatums do very well against them.
+0 / -0
12 months ago
I'm happy to know that the way to counter boring generalists is to use progressively more degenerate units.
+3 / -0
What do you consider "degenerate"?

The issue is that grizzly is one of the archetypes that's just right to deal with most medium and heavy units without requiring complex micro management.

Longer ranged units are generally very fragile, and many of the expensive units are gimmicky. When you build the shorter range striders like dante or scorpion you're paying for exotic properties like the powerful manual fire weapons, burn dmg-over-time big enough to condemn small raiders and skirmishers that barely touched the fire, permastun or crazy high short range dps.

It often seems more effective to cherry-pick the reliable stuff from the different factories than to try and make the exotic crap work. Do snipers really need such short sight range that implies spotters and to consistently miss radar dots due to wobble?

There's also an issue with anti-air projectiles without tracking consistently missing radar dots.
+0 / -0
12 months ago
I'm amused by the degenerate remark.

It depends how you see it... it's a long-range arms race. As range goes up, speed goes down, and as we know these units tend to be accompanied by porc creep to defend your snipers and penetrators. A fast and dynamic game starts to degenerate into a slow, awkward stalemate, which changes the game dynamic substantially. Not everyone likes what standoffs naturally turn into as the arms race to do damage without taking damage escalates. Assaults are generally not enough to break through, and they are too risky as well, since they can hand metal to the enemy.

+0 / -0
12 months ago
It is my belief that at the more extreme differences of matchmaking, especially for players who know their opponent, the stats can get pretty off. I know that I am much more willing to experiment with weird facs against bad players, and that means that for higher differences you can get some weird stats.

That said, I don't actually know what a given difference in WHR really means. A 90-10% seems too extreme to me, I think 66-33% would be ideal (though we might not have enough data), at most perhaps 25-75.
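
For a rough feel of what those percentages would mean as rating gaps: if WHR differences behaved like standard Elo on a 400-point scale (an assumption, not something confirmed for Zero-K), a target win chance can be inverted into a gap as in the sketch below. The function name is made up for illustration.

```python
import math


def gap_for_win_chance(p_favourite):
    """Rating gap at which the favourite wins with probability `p_favourite`.

    Inverts the standard Elo curve p = 1 / (1 + 10 ** (-gap / 400)).
    Whether WHR follows this curve is an assumption.
    """
    return 400.0 * math.log10(p_favourite / (1.0 - p_favourite))


for p in (2 / 3, 0.75, 0.90):
    print(f"{p:.0%} favourite -> gap of about {gap_for_win_chance(p):.0f}")
```

Under that assumption 66-33 is a gap of roughly 120, 75-25 roughly 190, and 90-10 roughly 380. The 80/20 prediction at a gap of 150 quoted earlier in the thread does not fit this curve (it would give about 70/30), so treat these numbers as a guide only.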

As for what I think this means?

Cloaky is good as a generalist, but not amazing. It is exactly what everyone thinks it is.
Shield is probably too good, but is perhaps a bit map situational. It tends to have more of a skill curve than I would expect.
Rover is maybe fine.
Spider seems fine, but gets weaker and weaker as players grow more skilled. It's probably falling behind in the eco game when people are really trying.
Tank is probably a hair overtuned still, especially for weaker players.
Amph is situationally OP. Like everyone probably thought.
JJ is perhaps good for crushing people worse at the game than you, or getting surprise victories. Results seem to be unusually swingy, and I cannot easily tell why; it might just be a quirk of small-number statistics.

Hover is forgotten by many, just as I almost forgot it. It is probably generally a bit weak, with situationally good maps that experienced players can recognize and make the most of. Could probably stand to be a hair easier to use.
+0 / -0
Many, if not all, of your musings are testable. You can look at the stats and see for yourself whether the Jumpbot wins are mostly by players of significantly higher skill. You can extract per-map winrate data for Shieldbots. You can see whether Godde always wins as Spiders. Is anyone keen to make some fancy charts?

I have heard that our elo is calibrated to match the probabilities here: https://www.walkofmind.com/programming/chess/elo.htm . Before anyone says "a gap of 500 is too large" I'd like to see the spread of elo differences in the actual games played.
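
The spread itself is cheap to tabulate once the rating difference of each game has been pulled out of the raw stats. A minimal sketch, assuming a plain list of per-game rating differences (the values below are placeholders, not real games):

```python
from collections import Counter


def gap_histogram(gaps, bin_width=50):
    """Count games per rating-gap bin; keys are the lower edge of each bin."""
    counts = Counter((abs(gap) // bin_width) * bin_width for gap in gaps)
    return dict(sorted(counts.items()))


# Hypothetical rating differences, one per game.
bin_width = 50
gaps = [12, 48, 75, 130, 140, 260, 495]
histogram = gap_histogram(gaps, bin_width)
for lower, count in histogram.items():
    print(f"{lower:>3}-{lower + bin_width - 1:<3} {'#' * count}")
```

If most games turn out to sit in the first few bins, the choice between a 200 and a 500 bound matters much less than the thread assumes.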
+0 / -0
12 months ago
How come shields have like 1/3 of cloaky's usage in high level if they are that good?
+0 / -0
