[q]also I'm not sure if I'm supposed to get the mean either - but I feel like it should be the mean[/q]Yes, it's best to use the mean. It's also not wrong to use the sum instead, or to use ln instead of 1+log_2. Any affine transformation with non-zero slope works. But only with 1+log_2 do you get the nice property that always guessing 50% yields score 0. And only with the mean does always guessing 100% and being right yield score 1.
                
                
Here's another way to see that guessing probabilities far away from 50% must be punished harder (using information theory, like @GoogleFrog): Unexpected events contain a much higher amount of information. For example, if you win against @Godde, that's more interesting than if you win against an equal player. If you choose a complicated password, it becomes exponentially more unexpected.{{{probability = exp(-information)
information = -log(probability)}}}The log scoring rule is a direct consequence of the [url=https://en.wikipedia.org/wiki/Entropy_(information_theory)]entropy of information[/url]. If something very unexpected happens, this means that a lot of information in your prediction system might be wrong. If something 100% unexpected happens, this means that all information in your prediction system is wrong. This is the logical principle [url=https://en.wikipedia.org/wiki/Principle_of_explosion]ex falso quodlibet[/url]: If you believe even one false statement, you can use it to prove logically that every statement is true and false at the same time. Hence this yields score minus infinity. So choose your religion carefully.
                
                
[q]I'll do a new post about my experiments in the future. Don't worry, I'm not going to argue it's better than WHR - I'm quite sure it isn't - just looking if I can see any patterns in the data :) [/q]Feel free to try. It would be a great achievement to find a better system than WHR. Like @Aquanim, I'd rather expect small improvements from details about the WHR implementation. Finding a better system that is very different from the established systems like Glicko or WHR is very unlikely imo - but I'm not saying 0% because this might give me a very bad score ;).
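
To make the scoring rule from the first paragraph concrete, here is a minimal Python sketch of the mean of 1+log_2(p), where p is the probability you gave to what actually happened. The function name `log_score` and the input format (a list of pairs of predicted win probability and actual outcome) are my own assumptions for illustration, not taken from any existing implementation.

```python
import math


def log_score(predictions):
    """Mean of 1 + log2(p), with p = probability assigned to the actual outcome.

    `predictions` is a hypothetical format: a list of (p_win, won) pairs,
    where p_win is the predicted win probability and won is a bool.
    """
    total = 0.0
    for p_win, won in predictions:
        # Probability that the prediction gave to what really happened.
        p = p_win if won else 1.0 - p_win
        if p == 0.0:
            # A "100% unexpected" event: score minus infinity.
            return float("-inf")
        total += 1.0 + math.log2(p)
    return total / len(predictions)


# Always guessing 50% yields score 0, whatever the outcomes are.
coin_flip = log_score([(0.5, True), (0.5, False)])

# Always guessing 100% and being right yields score 1.
always_right = log_score([(1.0, True), (0.0, False)])

# Being 100% sure and wrong yields minus infinity.
impossible = log_score([(1.0, False)])

# A confident miss (90%) is punished much harder than a cautious one (60%).
confident_miss = log_score([(0.9, False)])
cautious_miss = log_score([(0.6, False)])
```

Using the sum instead of the mean, or ln instead of log_2, only rescales these numbers, so the ranking of prediction systems stays the same, but the fixed points 0 and 1 are lost.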