At various points in time I have felt that I was missing data to answer some questions about ZK players and games: things such as "what is the most played game type", "are there players that resign more than average", "what awards are collected in 1v1 vs all welcome", "how would multiple ladders look" and others. As implementing anything on the server is a very intense project (getting to know the infrastructure, develop, upload, do not break anything), I felt a (crazy) alternative solution might be better: do everything in your browser, based on text files with the data! I had no clue whether such a thing would work (I am not an expert in browser stuff but knew enough), so I thought I would just give it a try.

I present here the result (it does work!), comparing hours players spent in game and number of games by player age.

WARNING 1: this is work in progress and there might be errors in it (hence the beta in the title).

WARNING 2: the raw data (battles, maps, etc.) is currently 300 MB. The first time you load one of the links below, it will take a long time (depending on your internet connection speed) and it might look like nothing is happening. Be patient. If you use the same link later, the data will be loaded from cache and it will be much faster.

The link to the latest release is Zero-K Local Analysis. The question mark icons have additional info (I know, who reads help, but I thought I would mention it).

The plan is to enhance it with all the things I am curious about. It can probably also be made faster. At some point I would like to make it open source and share it, but for now I am more curious about the questions.

I would like to hear feedback/opinions/questions...

Releases:
release2023-06-11-02
- added visualization of active and new players
- improved date range so that it prompts to load more data if a larger range is selected
- graphs should now be linkable (copy the address from the address bar)
- uploaded a sqlite database with the damages reported in each replay (not yet visualized)

release2023-04-30-01
- reprocessed all data and fixed several issues that resulted in some battles not being counted
- added option (see settings, top right) to load only part of the data (default 3 months) to speed up loading
- enabled serving gzip-compressed data (should help with download speed); data size reduced from 385 MB to 60 MB
- added player resign statistics page
- improved explanations for some of the graphs
- made the filters keep the same values across pages to make it easier to compare
- the mod selected by default is Zero-K (the others are available but must be selected)

release2022-07-17-01
- added new game type (other PvP: ffa, 1v1 outside matchmaking), thanks Dave[tB] for noticing some games were missing
- background-loaded data is now refreshed less often to improve interface responsiveness during loading
- show the last date scraping was done, for clarity

release2022-07-07-01
- made data loading a background task
- automated new data uploading (daily) so I don't need to do anything
- added player nicknames in clear. This will be updated regularly (but not often)
- added player award page (shows number or normalized number of awards per player, with filters)

release2022-02-22-01
- added player win/loss/ratio for maps, teammates and enemies
- fixed issue where some times were not counted (shown values in game modes played were lower than they really were)

release2022-02-13-01
- significant performance improvement (between 20x and 70x faster)
- added more periods to game counts (1, 3 and 5 years)
- added button to select/deselect all games

release2022-02-08-01
- slightly improved performance
- fixed start date to be exact (weekends are seen when granularity is set to days, no more artifacts in particular months)
You're not kidding, it lags super hard when you try to build new graphs. It's very interesting that you can't tell which days are the weekend from the graph of total play hours. I would have assumed you could.
The problem is that on the website the start date is not exact for old battles: it just says, for example, "2 months ago". So I interpolate equally. For newer battles I will capture it better because I now do it weekly... A database export would solve this, of course. For performance there may be optimizations possible; I did not check in detail.
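To make "interpolate equally" concrete, here is a minimal sketch of one way it could be done. It is not the code used on the site: the function name `interpolate_equally`, the window boundaries and the choice of placing each battle at the middle of its slot are all assumptions for illustration.

```python
from datetime import datetime

def interpolate_equally(window_start, window_end, count):
    """Spread `count` battles evenly over a coarse window such as "2 months ago".

    Hypothetical helper: each battle gets the midpoint of its equal-sized slot.
    """
    if count <= 0:
        return []
    step = (window_end - window_start) / count
    return [window_start + step * (i + 0.5) for i in range(count)]

# Four battles that the website only labels with the same coarse date.
dates = interpolate_equally(datetime(2022, 1, 1), datetime(2022, 1, 5), 4)
```

With this kind of spreading every day in the window gets the same share of battles, which would explain why weekends cannot be seen for old data.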
Hot hack: you can scrape a date and time, to the second, for every replay from the file name of the replay. I think it's calculated at the end of the battle though, so this time/date is the end of the battle, not when it started. Example: 2022/01/29 17h49m02s. Shame it doesn't display like forum posts though.
Great find! Indeed, that will make the time completely accurate. I will reprocess my files and re-upload the data.

Something is strange, though. If it is the end time, then the battle at http://zero-k.info/Battles/Detail/1297341, which has 20220205_221351 (which I interpret as 22:13:51) and is 32 minutes long, would have started around 21:45. But then there is http://zero-k.info/Battles/Detail/1297313, which has 20220205_220448 and is 0 minutes long, which puts it in the "middle" of the other one. I need to check more (meaning join a game and look at the clock :-p)... I interpreted the number in the replay as hh:mm:ss because of things like http://zero-k.info/Battles/Detail/1296654, which has 20220205_043540: there is no use for that leading 0 unless it's hh.

On performance, I just realized that I was keeping battles with duration 0; currently they are 13% of the total. Does anybody know what a duration of 0 means? I would assume a crash would give duration 0, but 13% of all games seems a lot...
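The arithmetic above can be checked with a short sketch. It assumes the replay stamp has the form YYYYMMDD_hhmmss (my reading, not confirmed at this point in the thread); the helper names `parse_replay_stamp` and `interval_if_end` are made up for illustration.

```python
from datetime import datetime, timedelta

def parse_replay_stamp(stamp):
    """Parse a stamp like '20220205_221351', assuming YYYYMMDD_hhmmss."""
    return datetime.strptime(stamp, "%Y%m%d_%H%M%S")

def interval_if_end(stamp, duration_minutes):
    """Battle interval under the hypothesis that the stamp is the END time."""
    end = parse_replay_stamp(stamp)
    return end - timedelta(minutes=duration_minutes), end

# Battle 1297341: stamp 20220205_221351, 32 minutes long.
long_start, long_end = interval_if_end("20220205_221351", 32)
# Battle 1297313: stamp 20220205_220448, 0 minutes long.
short_stamp = parse_replay_stamp("20220205_220448")
# True when the short battle falls inside the long one.
falls_inside = long_start <= short_stamp <= long_end
```

The duration on the website is only given in whole minutes, so the computed start is approximate to within a minute.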
Not fact checking at the moment, but I believe 'game duration' is the duration between !Start and the final exit after post game. So if I'm playing with bots and pause the game overnight, or adjust game speed, duration is going to become even less representative of how long the game was. I don't think it's worth trying to apply any math to that time and date for the sake of 'accuracy'. I think you should ping esainane for more information if you want to sneak a look into the replay file.
Just tested and that string in the replay is the start time in UTC. Now that I have that data maybe it would be interesting to make some representation on most played hours or similar, so while not essential I prefer to have it as good as possible. Considering I am doing this work, you can understand why I would think looking at the end stats graph is part of the game ;-). By looking at the data so far, I would really not have expected so much time spent with bots. In the last year, twice as much time is spent in (bots+chicken) than team games.
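A "most played hours" view could start from something like the sketch below. It treats the replay stamp as a UTC start time, as tested above, and counts battles per hour of day. The function names are hypothetical, and the three stamps are simply the ones quoted earlier in this thread, not a real sample.

```python
from collections import Counter
from datetime import datetime, timezone

def start_time_utc(stamp):
    """Parse 'YYYYMMDD_hhmmss' and mark it explicitly as UTC."""
    parsed = datetime.strptime(stamp, "%Y%m%d_%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)

def games_per_hour(stamps):
    """Count battles by the UTC hour in which they started."""
    return Counter(start_time_utc(stamp).hour for stamp in stamps)

stamps = ["20220205_221351", "20220205_220448", "20220205_043540"]
hour_counts = games_per_hour(stamps)
```

Weighting each battle by its duration instead of counting it once would give hours played rather than number of games.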
Anarchid told me a week or two ago that he could get me a dump of the db tables that store each game's info. It wouldn't have info like starting factory, damage done, etc, but it would just say who played in what game and who won/lost. Haven't heard from him in a while, so I just poked him. Fingers crossed.
I'll have a look at this later this coming week, thanks! (nitpick, but firefox is yelling at me because of it -- looks like you don't have https setup for your site!)
It's good to be paranoid (about Firefox complaining that the website doesn't have HTTPS), but I will not put effort into checking how to set that up for now: the S3 static website was the simplest way to put something out there, and it does not support HTTPS...
chaplol: files with correct start times are now at http://zerok-local-analysis.s3-website.eu-west-3.amazonaws.com/release2022-02-08-01/ (just replace that part of the link from the previous post). Note that I have not fetched all games yet (scraping is still in progress and has reached 2019).

Steel_Blue: I checked a bit, and the library I use to manipulate data, while very elegantly implemented, is horrible for performance. I think there would be a huge improvement if that were fixed, but it's not a short-term task. Now you can also see weekends if you select day granularity. Thanks again for the suggestion with the replay!
Could the account age buckets include some above 6 months? eg. <1 year? <3 years?
Sure. I will do something like 1, 3, 5 years and all. On the performance side, tests on some parts of the code showed a 10x performance improvement with rewrites, so the next update might take longer but should be much faster.
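As an illustration of the extra account age buckets, here is a minimal sketch. The bucket boundaries in days, the labels and the function name `age_bucket` are assumptions, and it uses non-overlapping buckets, which may differ from how the site groups accounts.

```python
from bisect import bisect_right

# Assumed upper limits in days for "<1 year", "1-3 years" and "3-5 years".
BUCKET_LIMITS_DAYS = [365, 3 * 365, 5 * 365]
BUCKET_LABELS = ["<1 year", "1-3 years", "3-5 years", "5+ years"]

def age_bucket(account_age_days):
    """Return the label of the bucket that an account age (in days) falls into."""
    return BUCKET_LABELS[bisect_right(BUCKET_LIMITS_DAYS, account_age_days)]

labels = [age_bucket(days) for days in (100, 365, 1500, 2000)]
```

An "all" series is then just the sum over every bucket.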
New release at release2022-02-13-01
- significant performance improvement (between 20x and 70x faster)
- added more periods to game counts (1, 3 and 5 years)
- added button to select/deselect all games
Interesting graph: for all games in 2021, only accounts older than 5 years play more team games overall than games with bots. As a (mostly) team game player I did not realize how much people play with bots.
New release at release2022-02-22-01
- added player win/loss/ratio for maps, teammates and enemies
- fixed issue where some times were not counted (shown values in game modes played were lower than they really were)
Note: loading time is still 2-3 minutes, without a progress bar.

Note: due to the previous discussions about players wanting to delete their accounts, I chose not to store user names on the analysis website. I could "retrieve" them live from zero-k, but that would need a specific setting in the zero-k.info configuration (a CORS setting), which involves some additional effort...

New analysis added: ratio of wins/losses for maps, players in the same team, and players in the enemy team.

As examples (for roughly the last 2 years): I checked Godde for casual and you can see he wins 75% on Field of Isis. The ally player with whom he wins the most (66%) is Znack (id 206848), while the player he beats most often (80%) as an enemy is @Bonke (id 436702).

For 1v1, @Godde is, as you would expect, very hard to beat: an overall ratio of 84%, with his most played maps all above 80%. The player giving him the most "headaches" is PRO_rANDY (id 85949), against whom he has a ratio of "only" 66%.

Note: obviously, if you play 1v1 there are no "teammates", so the middle table will be empty.

Have fun checking on which maps you win the most and who are the worst to have as enemies or as teammates! For me, one of the maps I win the most on is Tangerine, which is also one of the maps I like the most (coincidence...?)
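For anyone curious how the three tables relate, here is a minimal sketch of the win ratio per map, per teammate and per enemy. The battle record layout (`map`, `teams`, `winner`) and the player names A to D are invented for illustration; the real data files may look quite different.

```python
from collections import defaultdict

def win_ratios(battles, player):
    """Win ratio of `player` per map, per teammate and per enemy."""
    tallies = {kind: defaultdict(lambda: [0, 0])  # name -> [wins, games]
               for kind in ("map", "teammate", "enemy")}
    for battle in battles:
        my_team = next((team for team, members in battle["teams"].items()
                        if player in members), None)
        if my_team is None:
            continue  # the player did not take part in this battle
        won = battle["winner"] == my_team
        keys = [("map", battle["map"])]
        for team, members in battle["teams"].items():
            kind = "teammate" if team == my_team else "enemy"
            keys += [(kind, other) for other in members if other != player]
        for kind, name in keys:
            tallies[kind][name][0] += int(won)
            tallies[kind][name][1] += 1
    return {kind: {name: wins / games for name, (wins, games) in table.items()}
            for kind, table in tallies.items()}

# Hypothetical battles: team id -> list of players, plus the winning team id.
battles = [
    {"map": "Field of Isis", "teams": {1: ["A", "B"], 2: ["C", "D"]}, "winner": 1},
    {"map": "Field of Isis", "teams": {1: ["A", "C"], 2: ["B", "D"]}, "winner": 2},
    {"map": "Tangerine", "teams": {1: ["A"], 2: ["C"]}, "winner": 1},
]
ratios = win_ratios(battles, "A")
```

In a 1v1 battle the teammate loop adds nothing, which is why the middle table stays empty for 1v1.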
New release at release2022-07-07-01
- made data loading a background task. This allows playing with graphs while the complete data is loading (complete data is 300 MB). Data should be cached locally by the browser, so a second load of the page will be much faster
- automated new data uploading (daily) so I don't need to do anything
- added player nicknames in clear. This will be updated regularly (but not often) to cope with people changing/deleting accounts
- added player award page (shows number or normalized number of awards per player, with filters)
As examples (for the last year, more or less): for Zero-K, for casual with no bots, with at least 300 games played, these are the awards obtained by players. As I was curious about Airforce General, I sorted by that one. We can see Sab and Jummy really like dealing damage with air.

What if we wonder in what share of the games they played they got the Airforce General award? We click normalize, and we see that now TechAUmNu and @Inde_Irae have high ratios (Sab still comes 3rd and Jummy 4th, not much surprise here).

What about the recurring shout "ZK is dying!! (because of X or Y balance change that I hate!)"? Making a graph over the whole period, starting from March of each year (to try to clearly separate the first year of covid), and considering playtime as the metric (so not number of players), it seems ZK is doing quite well. Although bot games decreased in 2021 vs 2020, 2020 was "the covid year", so if we compare 2021 with 2019 the trend is still going up.
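The difference between the raw and the normalized award view can be sketched as follows. This is only my illustration of "awards divided by games played, with a minimum number of games"; the function name `award_table`, the player names and all numbers are made up.

```python
def award_table(games_played, award_counts, min_games=300, normalize=False):
    """Rank players by award count, or by awards per game when normalize=True."""
    rows = []
    for player, games in games_played.items():
        if games < min_games:
            continue  # filter: only players with enough games
        count = award_counts.get(player, 0)
        rows.append((player, count / games if normalize else count))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical data: games played and Airforce General awards per player.
games_played = {"p1": 1000, "p2": 400, "p3": 100}
awards = {"p1": 100, "p2": 80, "p3": 50}

raw = award_table(games_played, awards)
normalized = award_table(games_played, awards, normalize=True)
```

In this made-up data p1 leads the raw count simply by playing more, while p2 leads once the count is divided by games played, which is the same kind of reordering seen when clicking normalize.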
Very cool! A quick glance at my own stats tells me that there are many games missing: only having 10 games on titan duel is not right. It would be amazing if the player statistics could be filtered by game type: 1v1 MM vs team, etc.
Malric, interested in adding some function for damage done from one unit-type to another unit-type per-game? I recently figured out how to pull those stats from all replays and I can throw them into a ginormous csv if you're interested. Gonna be tens of millions of lines, but it's probably decently compressible.
Dave[tB]: the default period is the last month (which limits the number of games). For the whole period (since 2010), in "Player Correlation" you have 42 games on TitanDual 2.2, which is still fewer than the 83 games with a duration larger than 0 seconds that I see in the replays. I will investigate, thanks for pointing this out! You can filter by "1v1 MM vs team": just click the "Game type" button. Then, if you want to compare (for example, stats for 1v1 versus MM), you can always add another graph by clicking the "Add new" button (you will need a big monitor though!).

chaplol: might be an idea. Questions:
- do you process replays constantly by downloading them from zerok? AFAIK replays stay only 1 week then are deleted (or something along those lines)
- how intensive is the extraction process? Do you run the complete replay or is it just some trick?