Yeah, they'd just be downloaded. I think you're right that they get deleted at some point, but the retention looks like it's at least two months. Stats data can be extracted using `strings`, but someone wrote a Python library to do it a bit more cleanly than that. This command downloads all replay files from 1354664 through 1402664 and makes a CSV of the stats (with a `head -1` so that it only runs for the first one):
quote: (printf "replay_url,damager,damagee,amt,emp" && seq 1354664 1402664 | head -1 | xargs -I{} wget 'https://zero-k.info/Battles/Detail/{}' -O - | grep "Manual download" | awk -F"'" '{print "https://zero-k.info"$2}' | xargs -I{} bash -c "wget '{}' -O - | gunzip | strings | grep stats,dmg | sed 's|^SPRINGIE:stats,dmg,|{},|'") > zkstats.csv |
quote: $ head zkstats.csv
replay_url,damager,damagee,amt,emphttps://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,bomberdisarm,spideremp,0,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,bomberdisarm,spiderskirm,0,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,staticmex,1200,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,hoverskirm,1451.28198,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,turretheavylaser,32,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,hoverheavyraid,360,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,hovercon,1612.12598,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,turretheavylaser,hoverassault,2130.68848,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,turretheavylaser,hoverheavyraid,493.348907,0
https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_105.1.1-841-g099e9d0 BAR105.sdfz,turretlaser,spideremp,1400.42334,0
|
This library parses replay files more cleanly and pulls out other useful stats: https://github.com/dansan/spring-replay-site/blob/master/srs/parse_demo_file.py
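For anyone who would rather skip the `strings | grep` step, here is a minimal Python sketch of the same extraction. It assumes the replay is a plain gzip stream (as the `gunzip` in the one-liner suggests) and that the lines look like `SPRINGIE:stats,dmg,<damager>,<damagee>,<amt>,<emp>` (inferred from the `sed` expression and the CSV header). The function name and the sample bytes are made up for illustration; this is not the linked library's API.

```python
import gzip
import re

# Assumed line shape, inferred from the sed expression in the one-liner:
#   SPRINGIE:stats,dmg,<damager>,<damagee>,<amt>,<emp>
DMG_LINE = re.compile(
    rb"SPRINGIE:stats,dmg,([^,\s]+),([^,\s]+),(-?[0-9.]+),(-?[0-9.]+)"
)


def extract_damage_rows(replay_bytes):
    """Return (damager, damagee, amt, emp) tuples found in a gzipped replay."""
    raw = gzip.decompress(replay_bytes)
    rows = []
    for match in DMG_LINE.finditer(raw):
        damager, damagee, amt, emp = match.groups()
        rows.append((damager.decode(), damagee.decode(), float(amt), float(emp)))
    return rows


# Synthetic stand-in for a .sdfz file (not real replay data).
fake_replay = gzip.compress(
    b"\x00\x01junk\x00SPRINGIE:stats,dmg,dynassault1,staticmex,1200,0\x00more"
)
rows = extract_damage_rows(fake_replay)
```

The regex scans the raw bytes directly, so no intermediate text file is needed.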
+1 / -0
|
chaplol: just downloading them and running strings is something my little server would have no issue with. Thanks for pointing this out; I had no idea that stats of unit type vs unit type were so easily accessible! They would be a great addition to the stats! I will start saving the stats for now and see later how I can show them (and how large they are overall). The Python library looks good, but I probably already have all the other information (I looked briefly and that seems to be: teams, awards, winners - am I missing something?), so there is not much benefit in using it. What would be your first idea for visualizing the damage? Most of my ideas so far were geared towards characterizing players, but these stats could be used for balancing as well. My first idea would be a matrix of factories where each cell holds the number of games in which the factory on the row dealt more damage to the factory on the column. It would also be nice to have more filters like "consider only games with these factories in", "consider only games that have only these factories in", etc. PS: to be pedantic, you are missing a "\n" in the printf ;-)
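A rough sketch of that factory matrix, to make the idea concrete. The unit-to-factory mapping here is a made-up fragment (the real one would come from the game's unit definitions), and "dealt more damage" is read as "row dealt more to column than column dealt to row in that game", which is only one possible interpretation.

```python
from collections import defaultdict

# Hypothetical fragment of a unit -> factory mapping, for illustration only.
UNIT_FACTORY = {
    "spideremp": "spider",
    "spiderskirm": "spider",
    "hoverskirm": "hover",
    "hoverassault": "hover",
}


def factory_matrix(games):
    """games: {game_id: [(damager, damagee, amt), ...]}.

    Returns {(row, col): n} where n is the number of games in which
    factory `row` dealt more damage to `col` than `col` dealt to `row`.
    """
    wins = defaultdict(int)
    for rows in games.values():
        dealt = defaultdict(float)
        for damager, damagee, amt in rows:
            a = UNIT_FACTORY.get(damager)
            b = UNIT_FACTORY.get(damagee)
            if a and b and a != b:  # skip unmapped units and same-factory damage
                dealt[(a, b)] += amt
        for (a, b), amt in dealt.items():
            if amt > dealt.get((b, a), 0.0):
                wins[(a, b)] += 1
    return dict(wins)


games = {
    1: [("spideremp", "hoverskirm", 100.0), ("hoverskirm", "spideremp", 40.0)],
    2: [("hoverassault", "spiderskirm", 500.0)],
}
matrix = factory_matrix(games)
```

The filters mentioned above would just restrict which entries of `games` are passed in.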
+1 / -0
|
quote: PS: to be pedantic, you are missing a "\n" in the printf ;-) | I love it -- good catch! Just checked the size for the 2k games I ran it on -- it's, uh, 2GB uncompressed and 200MB compressed. A large part of that comes from the fact that each line includes the sdfz filename, but you can swap that out for the game id, or just regex the map name out and use the timestamp as a primary key of sorts. "stats,dmg" can also be removed. Getting rid of that extra info brought the 2GB down to 760MB. So it's hilariously gonna be a few dozen GB. Sadly, the damage stats aren't broken down by player, so it's only helpful up to a point! But definitely check out what else is in the replay files' stats besides dmg (the Python library is easy to read to figure all that out). Your visualization ideas sound great as a first start.
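The filename-to-timestamp swap could look something like this. It is a sketch that assumes every replay name starts with `YYYYMMDD_HHMMSS_` as in the sample output, and that the timestamp is unique enough to act as a key; `compact_row` is a made-up helper name.

```python
import re

# Assumes replay names begin with YYYYMMDD_HHMMSS_, as in the sample CSV rows.
STAMP = re.compile(r"/replays/(\d{8}_\d{6})_")


def compact_row(line):
    """Replace the leading replay URL with its timestamp, keeping the 4 stat columns."""
    # Split from the right, since map names inside the URL may contain commas.
    url, damager, damagee, amt, emp = line.rsplit(",", 4)
    match = STAMP.search(url)
    key = match.group(1) if match else url  # fall back to the full URL
    return ",".join((key, damager, damagee, amt, emp))


line = (
    "https://zero-k.info/replays/20220429_070633_Cobalt Dream v1_"
    "105.1.1-841-g099e9d0 BAR105.sdfz,dynassault1,staticmex,1200,0"
)
short = compact_row(line)
```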
+1 / -0
|
Started getting them and reduced the size a bit. Some comments:
- kept only damager,damagee,amt,emp
- removed the digits after the decimal point
- the stats are output identically for each player, so doing a "sort|uniq" reduced the size a lot
With this, on ~6k games I get 5k uncompressed data per game. Still large, but better. This would be 20GB for all games (theoretically; I know we don't have them anymore). I would probably just allow loading this data for some period of time rather than for all time. Things that I can do further:
- (as you mentioned) encode names
- drop the emp column value when it is 0
Anyhow, now that I have this data, will focus on other things (like the potential bug Dave[tB] pointed out).
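The reductions listed above (integer amounts, de-duplication, encoded names, emp only when non-zero) can be sketched as follows. The record layout and the per-game name table are assumptions for illustration; a name table shared across all games would save more.

```python
def compact_game(rows):
    """rows: iterable of (damager, damagee, amt, emp), amt/emp as floats.

    Returns (names, records): `names` maps id -> unit name, and `records`
    are sorted unique tuples (damager_id, damagee_id, amt[, emp]).
    """
    names = []
    ids = {}

    def encode(name):
        # Assign small integer ids in order of first appearance.
        if name not in ids:
            ids[name] = len(names)
            names.append(name)
        return ids[name]

    seen = set()  # plays the role of "sort | uniq"
    for damager, damagee, amt, emp in rows:
        record = (encode(damager), encode(damagee), int(amt))  # drop decimals
        if int(emp) != 0:
            record += (int(emp),)  # keep emp only when it is non-zero
        seen.add(record)
    return names, sorted(seen)


rows = [
    ("dynassault1", "staticmex", 1200.0, 0.0),
    ("dynassault1", "staticmex", 1200.0, 0.0),  # duplicate from another player
    ("turretlaser", "spideremp", 1400.42334, 0.0),
]
names, records = compact_game(rows)
```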
+2 / -0
|
New release at release2022-07-17-01
Changes:
* added a new game type (other PvP: FFA, 1v1 outside matchmaking) - thanks Dave[tB] for noticing that some games were missing
* background-loaded data now updates less often, to keep the interface responsive during loading
* show the date of the last scrape, for clarity
+1 / -0
|
Now that I've played around with this a bit more, this is fantastic! I was thinking about doing a similar project to track win rates for individual players, factory matchups (by player), etc. Would you consider adding the following information:
(In the Same way that you have maps added)
- Starting Factory
- Starting Commander
Also:
- Endgame stats, such as excess etc.
- Split Casual 1v1 and FFA into separate categories
I also noticed that you don't seem to track games on the Pro 1v1 host as a separate category. Ideally, these would go under the same umbrella as 1v1 MM, as they use MM elo.
+0 / -0
|
One overall comment: at this point I am getting the information in two ways:
- scraping the website (so battles, players, winners, awards, etc.)
- only for a couple of months: parsing the replays to get the overall damage dealt by some unit types to other unit types. I will add some stats on this, but it's lower priority as the data is rather coarse (it is not per team), so the insights will not be as good.
quote: (In the Same way that you have maps added) |
I assume you mean adding another table with these to "Player Correlation". quote: Endgame stats, such as excess etc. |
I do not have access to any of that information now. I am not aware of a simple way (other than simulating the whole replay) to obtain it from the replay file either. If no such way exists today, one possibility would be to make a gadget that dumps a string with the required information into the replay, like "End Graph player id 123, metal extracted 10,20,40,100; damage dealt 0, 10, 40, 50; etc.". This would make extraction easy. So, if you (or someone else) make the gadget and it makes it into the official release, I will add the extra extraction and graphs/information (although some might require different representations). quote: Split Casual 1v1 and FFA into separate categories |
Finally something that I can do :-). I guess that FFA is any game: without bots and with more than 3 teams? quote: I also noticed that you don't seem to track games on the Pro 1v1 host as a separate category. Ideally, these would go under the same umbrella as 1v1 MM, as they use MM elo. |
Sure, I can do that as well. Should I just check that the game title is "[A] Pro 1v1 Host "? I do not play much 1v1, so I might miss some of these things...
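To make the proposed rules concrete, here is a small sketch of how the categories could be assigned. The FFA threshold follows my guess above ("more than 3 teams") and is kept as a constant so it is easy to change; the category labels, function name and argument shapes are all made up for illustration.

```python
PRO_1V1_TITLE = "[A] Pro 1v1 Host"  # room title as quoted in this thread
FFA_MIN_TEAMS = 4  # "more than 3 teams" as guessed; lower it if 3-team games should count


def classify(title, team_sizes, has_bots, is_matchmaking=False):
    """Return a category label for one battle.

    team_sizes: number of players on each team, e.g. [1, 1] for a 1v1.
    """
    if has_bots:
        return "bots"
    if len(team_sizes) >= FFA_MIN_TEAMS:
        return "FFA"
    if sorted(team_sizes) == [1, 1]:
        # Pro 1v1 host games go under the same umbrella as 1v1 matchmaking.
        if is_matchmaking or title.strip() == PRO_1V1_TITLE:
            return "1v1 MM"
        return "casual 1v1"
    return "teams"


examples = [
    classify("[A] Pro 1v1 Host ", [1, 1], has_bots=False),
    classify("Some room", [1, 1], has_bots=False),
    classify("FFA fun", [1, 1, 1, 1], has_bots=False),
    classify("Bot bash", [4, 4], has_bots=True),
]
```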
+1 / -0
|
quote: Sure, I can do that as well. Should I just check that the game title is "[A] Pro 1v1 Host "? I do not play much 1v1, so I might miss some of these things... |
Yeah, sounds good. On the other stuff, esainane's zkstats has access to logs that have the other information -- including factory which already shows there, and commander type which does get put into the log but not read. But yes, these do require running the replays.
+0 / -0
|
New release at release2023-04-30-01
- reprocessed all data and fixed several issues that resulted in some battles not being counted
- added an option (see settings, top right) to load only part of the data (default 3 months) to speed up loading
- enabled serving gzip-compressed data (should help with download speed); data size reduced from 385 MB to 60 MB
- added a player resign statistics page
- improved explanations for some of the graphs
- made the filters keep the same values across pages to make comparing easier
- the mod selected by default is Zero-K (the others are available but must be selected)
- split Casual 1v1 and FFA into separate categories (Dave[tB] request)
As an example of the last screen added: for the first 3 months of the year, for Zero-K, for casual games without bots, among players with at least 365 games played, there are 3 players whose team never won when they resigned. We can also check who resigned least, or how long they played until resigning (last columns). Feedback, bug reports and ideas are always welcome!
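For the curious, the "team never won when they resigned" filter boils down to something like the sketch below. The battle record shape (`players`, `resigned`, `winners`) is hypothetical and is not the actual ZKLA data format.

```python
from collections import defaultdict


def never_won_after_resign(battles, min_games):
    """battles: iterable of dicts with hypothetical keys
    'players' (list of names), 'resigned' (set), 'winners' (set).

    Returns the sorted names of players with at least `min_games` games who
    resigned at least once and never ended up on the winning side when they did.
    """
    games = defaultdict(int)
    resigns = defaultdict(int)
    wins_after_resign = defaultdict(int)
    for battle in battles:
        for player in battle["players"]:
            games[player] += 1
            if player in battle["resigned"]:
                resigns[player] += 1
                if player in battle["winners"]:
                    wins_after_resign[player] += 1
    return sorted(
        p for p in resigns
        if games[p] >= min_games and wins_after_resign[p] == 0
    )


battles = [
    {"players": ["a", "b", "c"], "resigned": {"a"}, "winners": {"c"}},
    {"players": ["a", "b", "c"], "resigned": {"b"}, "winners": {"b", "c"}},
]
result = never_won_after_resign(battles, min_games=2)
```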
+3 / -0
|
I ONLY GIVE UP WHEN I CAN NOT FIGHT ! TOO MANY TIMES THE TEAM RESIGNS WHEN WE CAN STILL FIGHT... ITS LIKE THEY GET PUNCHED IN THE FACE... AND IT HURTS... BUT THEY ARE NOT KO'D. AND THEY ARE LIKE, F*CK THAT. THIS FIGHTING STUFF HURTS AND I CAN'T WIN ?!?!??! ALL YOU HAVE TO DO IS FIIIGHT ! AND KEEP FIGHTING UNTIL YOU CAN NOT FIGHT ! GG WP #PURPLEDIIINS ! PS... I MISS JUMMY FRIEND =[
+2 / -0
|
Jummy is playing... 1 + 1 = Jummy, or I've been missing some huge info for years
+0 / -0
|
I have uploaded the data about damage between different unit types (stored in the replays, extracted as chaplol suggested) at http://zerok-local-analysis.s3-website.eu-west-3.amazonaws.com/data_version2/zk_damage.db.gz The damage database updates only once a week (it is currently 800MB compressed). You can use the new sqlite3 database by attaching it as described at https://www.sqlitetutorial.net/sqlite-attach-database/ If you do anything with it, I would appreciate it if you shared the result here.
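If you are not sure where to start with the download, a minimal Python sketch: decompress the `.gz` in a streaming fashion, then list the tables so you can see the schema before writing any queries. The helper names and the local file paths in the usage note are just examples.

```python
import gzip
import shutil
import sqlite3


def gunzip_file(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path (.gz) into dst_path without loading it all in memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


def list_tables(db_path):
    """Return the table names in a SQLite file, as a first look at its schema."""
    conn = sqlite3.connect(db_path)
    try:
        query = "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()
```

Usage would be along the lines of `gunzip_file("zk_damage.db.gz", "zk_damage.db")` followed by `list_tables("zk_damage.db")`.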
+0 / -0
|
Neat, it's nice to see the SPRINGIE echos used. Are you developing a web UI for it? That was the original use of the endgame stats and it was some help for balancing. I don't know what to do with the gz.
+0 / -0
|
For clarification: the .gz is a gzipped sqlite database of the damage reported between units per battle. It can be used together with the gzipped sqlite database of all the games (at http://zerok-local-analysis.s3-website.eu-west-3.amazonaws.com/data_version2/zk.db.gz ). I can consider making some UI, but I would need some ideas on what to represent. I collected the data as chaplol suggested and it was rather easy (I started the data collection several months ago, but did not get around to processing it). The idea of ZKLA was to do the representation only on the client side (load all data in JS from a static JSON). While this is possible for battles/awards (for the full 10 years), for the damage it might get a bit too much (I don't know, I did not test), so I will probably need to do some summarization. If anybody wants to do some visualization locally, this should make it much easier, so I hope it helps someone.
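A sketch of using the two databases together from Python via `ATTACH DATABASE`. The table and column names in the query (`battles`, `damage`, `battle_id`, `map`, `amt`) are placeholders made up for illustration, so check the real schema first and adjust.

```python
import sqlite3


def attach_damage(games_db_path, damage_db_path):
    """Open the games database and attach the damage database under the alias 'dmg'."""
    conn = sqlite3.connect(games_db_path)
    conn.execute("ATTACH DATABASE ? AS dmg", (damage_db_path,))
    return conn


def damage_per_map(conn):
    """Total damage per map, joining across both files.

    NOTE: table and column names are hypothetical placeholders.
    """
    sql = """
        SELECT b.map, SUM(d.amt)
        FROM main.battles AS b
        JOIN dmg.damage AS d ON d.battle_id = b.id
        GROUP BY b.map
        ORDER BY b.map
    """
    return conn.execute(sql).fetchall()
```

Once attached, tables from the second file are addressed with the `dmg.` prefix and can be joined like any other table.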
+0 / -0
|
New release at release2023-06-11-02
As an example of the last screen added: for the last 13 years, for Zero-K, for casual games without bots, grouped by month, these are the numbers of active players and new players:
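For anyone wanting to reproduce counts like these locally, a minimal sketch. It assumes battles are available as `(date, players)` pairs with ISO dates, and it treats a player as "new" in the first month they appear in the data; both are assumptions, not necessarily how the page computes it.

```python
from collections import defaultdict


def monthly_activity(battles):
    """battles: iterable of (date, players), date as 'YYYY-MM-DD'.

    Returns {month: (active_count, new_count)}.
    """
    by_month = defaultdict(set)
    for date, players in battles:
        by_month[date[:7]].update(players)  # 'YYYY-MM' bucket

    seen = set()
    result = {}
    for month in sorted(by_month):
        active = by_month[month]
        result[month] = (len(active), len(active - seen))
        seen |= active
    return result


battles = [
    ("2023-01-05", ["a", "b"]),
    ("2023-01-20", ["a"]),
    ("2023-02-01", ["b", "c"]),
]
activity = monthly_activity(battles)
```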
+2 / -0
|