Re: Parallel Universes

From: Eliezer S. Yudkowsky
Date: Wed Feb 12 2003 - 00:10:41 MST

    Wei Dai wrote:
    > On Tue, Feb 11, 2003 at 11:13:32PM -0500, Eliezer S. Yudkowsky wrote:
    >>If he chooses to apply Bayes's rule, then his expected *global* utility is
    >>p(1)*u(.99m rewarded) + p(1)*u(.01m punished), while his expected *local*
    >>utility is p(.99)*(1 rewarded) + p(.01)*(1 punished).
    > I assume an altruist-Platonist is supposed to try to maximize his global
    > utility. What role is the local utility supposed to play? BTW, the only
    > difference between your "global" utility and my utility is that I
    > subtracted out the parts of the multiverse that can't be causally affected
    > by the decision maker, so they're really equivalent. But for the argument
    > below I'll use your definition.
    >>If he sees X=0
    >>locally it doesn't change his estimate of the global truth that .99m
    >>observers see the true value of X and .01m observers see a false value of
    >>X. Roughly speaking, if a Bayesian altruist-Platonist sees X=0, his
    >>expected global utility is:
    >>p(.99)*u(.99m observers see 0 => .99m observers choose 0 => .99m observers
    >>are rewarded && .01m observers see 1 => .01m observers choose 1 => .01m
    >>observers are punished)
    >>p(.01)*u(.99m observers see 1 => .99m observers choose 1 => .99m observers
    >>are rewarded && .01m observers see 0 => .01m observers choose 0 => .01m
    >>observers are punished)
    > Remember in my last post I wrote:
    > Note that if he did apply Bayes's rule, then his expected utility would
    > instead become .99*U(.99m people rewarded) + .01*U(.01m people punished)
    > which would weight the reward too heavily. It doesn't matter in this case
    > but would matter in other situations.

    No, Wei Dai, this does *not* happen. I see the mistake you're making but
    I honestly don't know why you're making it, which makes it hard to
    correct. If you see 1 it affects your Bayesian probability that the real
    digit is 1 or 0; it does not affect your estimate of the global outcome of
    people following a Bayesian strategy. You seem to be assuming that
    Bayesian reasoners think that all observers see the same digit... or
    something. Actually, I don't know what you're assuming. I can't make the
    math come out your way, period.
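
A minimal sketch of the distinction drawn above, assuming a uniform prior over X (the .99 figures in the thread suggest one, but this message does not state it) and a printout that shows the true digit with probability .99. The function names are illustrative only and do not come from the thread.

```python
from fractions import Fraction

ACCURACY = Fraction(99, 100)   # chance a printout shows the true digit
PRIOR_X1 = Fraction(1, 2)      # assumed uniform prior; not stated in this message


def posterior_x_equals(seen: int) -> Fraction:
    """P(X == seen | printout shows `seen`) by Bayes's rule."""
    prior_seen = PRIOR_X1 if seen == 1 else 1 - PRIOR_X1
    prior_other = 1 - prior_seen
    evidence = ACCURACY * prior_seen + (1 - ACCURACY) * prior_other
    return ACCURACY * prior_seen / evidence


def global_outcome(true_x: int) -> dict:
    """Measure of observers rewarded/punished when everyone guesses what they see."""
    # Deliberately independent of true_x: .99 of observers always see the true digit.
    return {"rewarded": ACCURACY, "punished": 1 - ACCURACY}


print(posterior_x_equals(1))                   # 99/100: the local belief moves
print(global_outcome(0) == global_outcome(1))  # True: the global outcome does not
```

The observation changes the first quantity only; the second is a property of the strategy and is the same whichever digit is real.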

    > You're basically making exactly this mistake,

    Please actually read my numbers.

    > and I'll describe one of the
    > "other situations" where it's easy to see that it is a mistake. Suppose
    > in the thought experiment, the reward for guessing 1 if X=1 is 1000 times
    > as high as the reward for guessing 0 if X=0 (and the punishments stay the
    > same). Wouldn't you agree that in this case, you should guess 1 even if
    > the printout says 0?


    > The way you compute global utility, however, your
    > utility is still higher if you guess 0 when you see 0.


    > Here're the calculations. Let R stand for "observers rewarded", E stand
    > for "observers rewarded extra", i.e. rewarded 1000 times more, and P for
    > "observers punished". And let u(E) = 1000*u(R) = -1000*u(P) = 1000.
    > According to your utility function, the expected utilities of choosing 0
    > or 1 are:
    > EU(choose 0)
    > .99*u(.99m R & .01m P) + .01*u(.99m E & .01m P)
    > =
    > .99*.98 + .01*(990-.01)
    > =
    > 10.8701

    Um... I just don't see where you get these numbers. Are you talking about
    the expected total utility of choosing 0 every time, or the expected
    utility of choosing 0 given that you saw 0?

    Given that you see 0, the expected utility of all reasoners choosing 0 is:

    .99*u(1.0m R) + .01*u(1.0m P)
    = .99 - .01
    = .98

    Given that you see 0, the expected utility of all reasoners choosing 1 is:

    .99*u(1.0m P) + .01*u(1.0m E)
    = -.99 + 10
    = 9.01
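
The two calculations above can be checked with a short sketch. It uses the values from the quoted setup, u(R) = 1, u(P) = -1, u(E) = 1000, and it assumes that utility is linear in the measure of observers with m normalized to 1. The function name is hypothetical.

```python
from fractions import Fraction

U_R, U_P, U_E = 1, -1, 1000          # utilities per unit measure, from the quoted setup
P_X0_GIVEN_SEE0 = Fraction(99, 100)  # posterior that the printout is right
P_X1_GIVEN_SEE0 = 1 - P_X0_GIVEN_SEE0
MEASURE = 1                          # "1.0m" observers, normalized (assumption)


def eu_all_choose(choice: int) -> Fraction:
    """Expected utility, given a printout of 0, if every reasoner guesses `choice`."""
    if choice == 0:
        # X=0: everyone is rewarded; X=1: everyone is punished.
        return P_X0_GIVEN_SEE0 * MEASURE * U_R + P_X1_GIVEN_SEE0 * MEASURE * U_P
    # X=0: everyone is punished; X=1: everyone gets the extra reward.
    return P_X0_GIVEN_SEE0 * MEASURE * U_P + P_X1_GIVEN_SEE0 * MEASURE * U_E


print(float(eu_all_choose(0)))  # 0.98
print(float(eu_all_choose(1)))  # 9.01
```

Under these numbers, guessing 1 comes out ahead even after seeing 0.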

    > If you do the same computations without applying Bayes's rule first,
    > you'll see that EU(choose 1) > EU(choose 0), but I won't go through the
    > details. What's going on here is that by applying Bayes's rule, you're
    > discounting the extra reward twice, once through the probability, and
    > another time through the measure of observers, so the extra reward is
    > discounted by a factor of 10000 instead of 100. That's why I choose to
    > make the extra reward worth 1000 times the normal reward.

    You know, there are days when you want to just give up and say: "Don't
    tell ME how Bayes' Theorem works, foolish mortal." Anyway, for some
    strange reason you appear to be applying Bayes adjustments to the measure
    of observers *and* to those observers' subjective probabilities.
    That's where the double discount is coming from in your strange
    calculations. You're applying numbers to the measure of observers that
    just don't go there.
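
One way to read the "factor of 10000 instead of 100" arithmetic is sketched below. It assumes the double discount is the .01 posterior probability of a wrong printout multiplied by the .01 measure of observers holding a wrong printout; the variable names are illustrative and this is an interpretation, not a calculation taken from either poster.

```python
from fractions import Fraction

P_X1_GIVEN_SEE0 = Fraction(1, 100)  # posterior that the printout is wrong
MEASURE_WRONG = Fraction(1, 100)    # measure of observers with a wrong printout
EXTRA_REWARD = 1000                 # u(E) from the quoted setup

single_discount = P_X1_GIVEN_SEE0                  # probability applied once
double_discount = P_X1_GIVEN_SEE0 * MEASURE_WRONG  # probability and measure both applied

print(1 / single_discount)             # 100
print(1 / double_discount)             # 10000
print(EXTRA_REWARD * single_discount)  # 10
print(EXTRA_REWARD * double_discount)  # 1/10
```

With a single discount the extra reward outweighs a unit punishment; with the double discount it does not, which is why the two calculations disagree about which digit to guess.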

    Eliezer S. Yudkowsky
    Research Fellow, Singularity Institute for Artificial Intelligence
