From: Wei Dai (weidai@weidai.com)
Date: Tue Feb 11 2003 - 23:44:17 MST
On Tue, Feb 11, 2003 at 11:13:32PM -0500, Eliezer S. Yudkowsky wrote:
> If he chooses to apply Bayes's rule, then his expected *global* utility is
> p(1)*u(.99m rewarded) + p(1)*u(.01m punished), while his expected *local*
> utility is p(.99)*(1 rewarded) + p(.01)*(1 punished).
I assume an altruist-Platonist is supposed to try to maximize his global
utility. What role is the local utility supposed to play? BTW, the only
difference between your "global" utility and my utility is that I
subtracted out the parts of the multiverse that can't be causally affected
by the decision maker, so they're really equivalent. But for the argument
below I'll use your definition.
> If he sees X=0
> locally it doesn't change his estimate of the global truth that .99m
> observers see the true value of X and .01m observers see a false value of
> X. Roughly speaking, if a Bayesian altruist-Platonist sees X=0, his
> expected global utility is:
>
> p(.99)*u(.99m observers see 0 => .99m observers choose 0 => .99m observers
> are rewarded && .01m observers see 1 => .01m observers choose 1 => .01m
> observers are punished)
> +
> p(.01)*u(.99m observers see 1 => .99m observers choose 1 => .99m observers
> are rewarded && .01m observers see 0 => .01m observers choose 0 => .01m
> observers are punished)
Remember in my last post I wrote:

    Note that if he did apply Bayes's rule, then his expected utility
    would instead become .99*U(.99m people rewarded) + .01*U(.01m people
    punished), which would weight the reward too heavily. It doesn't
    matter in this case but would matter in other situations.
You're basically making exactly this mistake, and I'll describe one of the
"other situations" where it's easy to see that it is a mistake. Suppose
in the thought experiment, the reward for guessing 1 if X=1 is 1000 times
as high as the reward for guessing 0 if X=0 (and the punishments stay the
same). Wouldn't you agree that in this case you should guess 1 even if
the printout says 0? The way you compute global utility, however, the
expected utility still comes out higher if you guess 0 when you see 0.
Here are the calculations. Let R stand for "observers rewarded", E for
"observers rewarded extra", i.e. rewarded 1000 times more, and P for
"observers punished". Let u(E) = 1000*u(R) = -1000*u(P) = 1000, and take
u to be linear in the measure of observers (in millions), so that e.g.
u(.99m R & .01m P) = .99*u(R) + .01*u(P). According to your utility
function, the expected utilities of choosing 0 or 1 are:
EU(choose 0)
= .99*u(.99m R & .01m P) + .01*u(.99m E & .01m P)
= .99*.98 + .01*(990 - .01)
= 10.8701
EU(choose 1)
= .99*u(1m P) + .01*u(1m E)
= -.99 + 10
= 9.01
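(As a check, here is a minimal Python sketch of the same arithmetic; the
variable names are mine, and it assumes, as above, that u is linear in
the measure of observers, with measure counted in millions.)

    # Utilities per unit of measure (measure in millions of observers):
    # u(R) = 1, u(E) = 1000, u(P) = -1, with u linear in measure.
    u = {'R': 1.0, 'E': 1000.0, 'P': -1.0}

    def U(*parts):
        # Each part is a (measure, outcome) pair, e.g. (0.99, 'R')
        # meaning ".99m observers rewarded".
        return sum(measure * u[outcome] for measure, outcome in parts)

    # The global, Bayes-updated expected utilities after seeing 0:
    EU_choose_0 = (0.99 * U((0.99, 'R'), (0.01, 'P'))
                   + 0.01 * U((0.99, 'E'), (0.01, 'P')))
    EU_choose_1 = 0.99 * U((1.0, 'P')) + 0.01 * U((1.0, 'E'))
    print(EU_choose_0)  # -> 10.8701
    print(EU_choose_1)  # -> 9.01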
If you do the same computations without applying Bayes's rule first,
you'll see that EU(choose 1) > EU(choose 0); the details are spelled out
in the sketch below. What's going on here is that by applying Bayes's
rule, you're discounting the extra reward twice, once through the
probability and again through the measure of observers, so the extra
reward is discounted by a factor of 10000 instead of 100. That's why I
chose to make the extra reward worth 1000 times the normal reward.
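(Here is that sketch. The first half uses my definition of utility:
count only the .99m + .01m observers whose choice the see-0 decision
actually determines, with no Bayesian update. The second half is my
reading of the double-discounted version from the passage quoted above.)

    # Same conventions as the earlier sketch: u(R) = 1, u(E) = 1000,
    # u(P) = -1, with utility linear in measure (millions of observers).
    uR, uE, uP = 1.0, 1000.0, -1.0

    # Without Bayes's rule, counting only the observers the see-0
    # decision affects: .99m who see 0 when X=0, .01m who see 0 when X=1.
    EU_choose_0 = 0.99 * uR + 0.01 * uP   # =  0.98
    EU_choose_1 = 0.99 * uP + 0.01 * uE   # =  9.01
    print(EU_choose_1 > EU_choose_0)      # True: you should choose 1

    # Applying Bayes's rule on top of this is the mistake: the extra
    # reward is weighted by probability .01 *and* measure .01m, i.e.
    # discounted by a factor of 10000 instead of 100.
    EU_choose_0_bayes = 0.99 * (0.99 * uR) + 0.01 * (0.01 * uP)  # =  0.98
    EU_choose_1_bayes = 0.99 * (0.99 * uP) + 0.01 * (0.01 * uE)  # = -0.8801
    print(EU_choose_1_bayes > EU_choose_0_bayes)  # False: wrongly picks 0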