I actively disbelieve that the current handling of uneven teams by the balancer is correct.
I don't even believe terribly strongly that the handling of even teams is correct. As far as I understand, Whole History Rating itself is backed by sound mathematics... for 1v1 games. Our adaptation for team games assumes that the skill of a team is best represented by the average of player ratings, and I don't have total confidence in that assumption.
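To make that assumption concrete, here is a minimal sketch of what "team skill = average of player ratings" implies for a predicted win chance. The Elo-style logistic curve and the 400-point scale are my assumptions for illustration, not something taken from the actual balancer code, and the function names are made up.

```python
def team_rating(ratings):
    """Team skill under the averaging assumption: the plain mean of player ratings."""
    return sum(ratings) / len(ratings)


def win_probability(team_a, team_b, scale=400.0):
    """Predicted chance that team_a beats team_b.

    Assumption: an Elo-style logistic curve on the difference of team averages.
    The 400-point scale is illustrative, not taken from the real balancer.
    """
    diff = team_rating(team_a) - team_rating(team_b)
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


# A 3v2 where both teams average 1500: pure averaging calls this a coin flip,
# regardless of the extra player on one side.
three = [1500, 1500, 1500]
two = [1500, 1500]
print(win_probability(three, two))  # prints 0.5
```

If averaging were the whole story, team size would drop out of the prediction entirely, which is exactly the part I would want checked against real results.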
That having been said, I believe that any change to the balancer system is going to need to be backed up by concrete statistics. The effect which uneven teams have on any particular game is strongly influenced by the team sizes and the map, so personal preferences on those points are liable to bias anecdotal evidence quite a bit.
In the past I think there have been datasets floating around which somebody could do those statistics on. I think the only place that the rating of each player at the time the game was played is available is within the replay files themselves, so some amount of replay crawling would be required to reproduce such a dataset.
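The kind of statistic I have in mind could be sketched roughly like this: group games by team-size matchup, then compare the average predicted win chance for one side against how often that side actually won. The record layout (team sizes, predicted probability, outcome) is hypothetical; a real dataset would have to be built from the replays.

```python
from collections import defaultdict


def calibration_by_matchup(games):
    """Compare predicted and actual win rates for each team-size matchup.

    `games` is an iterable of (size_a, size_b, predicted_a, a_won) tuples.
    This layout is hypothetical; real data would come from crawling replays.
    Returns {(size_a, size_b): (game_count, mean_predicted, actual_rate)}.
    """
    buckets = defaultdict(list)
    for size_a, size_b, predicted_a, a_won in games:
        buckets[(size_a, size_b)].append((predicted_a, 1.0 if a_won else 0.0))

    result = {}
    for matchup, rows in buckets.items():
        n = len(rows)
        mean_predicted = sum(p for p, _ in rows) / n
        actual_rate = sum(w for _, w in rows) / n
        result[matchup] = (n, mean_predicted, actual_rate)
    return result


# Made-up records purely to show the shape of the output, not real results.
sample = [
    (3, 2, 0.5, True),
    (3, 2, 0.5, True),
    (3, 2, 0.5, False),
    (4, 4, 0.6, True),
]
stats = calibration_by_matchup(sample)
```

A large, consistent gap between the predicted and actual columns for uneven matchups would be the sort of evidence that could justify a change. Splitting the buckets further by map would help with the bias mentioned above, at the cost of fewer games per bucket.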
edit: this thread is quite old and I feel like there are more recent ones, but a place to start looking:
https://zero-k.info/Forum/Thread/22898
this thread doesn't have data, but just to say we've been here before:
https://zero-k.info/Forum/Thread/36816