[q]also I'm not sure if I'm supposed to get the mean either - but I feel like it should be the mean[/q]Yes, it's best to use the mean. It's also not wrong to use the sum instead, or to use ln instead of 1+log_2: any affine transformation with non-zero slope works. But only with 1+log_2 do you get the nice property that always guessing 50% yields score 0. And only with the mean does always guessing 100% and being right yield score 1.
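To make the two anchor points concrete, here is a minimal sketch of the mean 1+log_2 score. The function name `log_score` and the input format (a list holding, for each game, the probability you assigned to the outcome that actually happened) are assumptions for illustration, not taken from any existing implementation.

```python
import math


def log_score(probs_of_actual_outcomes):
    """Mean of 1 + log2(p), where p is the probability that was
    assigned to the outcome that actually occurred in each game.

    Hypothetical helper for illustration only.
    """
    if not probs_of_actual_outcomes:
        raise ValueError("need at least one prediction")
    total = 0.0
    for p in probs_of_actual_outcomes:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        if p == 0.0:
            # A "0%" prediction that came true: score is minus infinity.
            return -math.inf
        total += 1.0 + math.log2(p)
    return total / len(probs_of_actual_outcomes)


always_fifty = log_score([0.5, 0.5, 0.5])  # always guessing 50%
always_right = log_score([1.0, 1.0, 1.0])  # always 100% and correct
```

Using the sum instead of the mean only rescales the result by the number of games, so it ranks predictors identically on the same set of games.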
Here's another way to see that guessing probabilities far away from 50% must be punished harder (using information theory, like @GoogleFrog): Unexpected events contain a much higher amount of information. For example, if you win against @Godde, that's more interesting than if you win against an equal player. If you choose a complicated password, it becomes exponentially more unexpected.
{{{probability = exp(-information)
information = -log(probability)}}}
The log scoring rule is a direct consequence of the [url=https://en.wikipedia.org/wiki/Entropy_(information_theory)]entropy of information[/url]. If something very unexpected happens, this means that a lot of information in your prediction system might be wrong. If something 100% unexpected happens, this means that all information in your prediction system is wrong. This is the logical principle [url=https://en.wikipedia.org/wiki/Principle_of_explosion]ex falso quodlibet[/url]: If you believe even one false statement, you can use it to prove logically that every statement is true and false at the same time. Hence this yields score minus infinity. So choose your religion carefully.
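The relation between probability and information can be sketched in a few lines. This version uses base 2, so the unit is bits, which is an assumption on my part; the exp/log pair above is the same thing measured in nats. The function names and the 26-letter password alphabet are hypothetical.

```python
import math


def information_bits(probability):
    """Information content -log2(p) of an event with probability p."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability == 0.0:
        # An event you declared impossible carries infinite information.
        return math.inf
    return -math.log2(probability)


def probability_from_bits(bits):
    """Inverse relation: p = 2 ** (-information)."""
    return 2.0 ** (-bits)


# Password example, assuming 26 equally likely letters per position:
# each extra character adds a fixed amount of information, so the
# probability of guessing the password shrinks exponentially.
bits_per_char = math.log2(26)
guess_probability_8_chars = probability_from_bits(8 * bits_per_char)
```

A coin flip carries 1 bit, while an event you rated at 0% carries infinitely many, which is where the score of minus infinity comes from.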
[q]I'll do a new post about my experiments in the future. Don't worry, I'm not going to argue it's better than WHR - I'm quite sure it isn't - just looking if I can see any patterns in the data :) [/q]Feel free to try. It would be a great achievement to find a better system than WHR. Like @Aquanim, I'd sooner expect small improvements from details of the WHR implementation. Finding a better system that is very different from established systems like Glicko or WHR is very unlikely imo - but I'm not saying 0% because that might give me a very bad score ;).