Re: Parallel Universes

From: Eliezer S. Yudkowsky (sentience@pobox.com)
Date: Wed Feb 12 2003 - 00:10:41 MST

    Wei Dai wrote:
    > On Tue, Feb 11, 2003 at 11:13:32PM -0500, Eliezer S. Yudkowsky wrote:
    >
    >>If he chooses to apply Bayes's rule, then his expected *global* utility is
    >>p(1)*u(.99m rewarded) + p(1)*u(.01m punished), while his expected *local*
    >>utility is p(.99)*u(1 rewarded) + p(.01)*u(1 punished).
    >
    >
    > I assume an altruist-Platonist is supposed to try to maximize his global
    > utility. What role is the local utility supposed to play? BTW, the only
    > difference between your "global" utility and my utility is that I
    > subtracted out the parts of the multiverse that can't be causally affected
    > by the decision maker, so they're really equivalent. But for the argument
    > below I'll use your definition.
    >
    >
    >>If he sees X=0
    >>locally it doesn't change his estimate of the global truth that .99m
    >>observers see the true value of X and .01m observers see a false value of
    >>X. Roughly speaking, if a Bayesian altruist-Platonist sees X=0, his
    >>expected global utility is:
    >>
    >>p(.99)*u(.99m observers see 0 => .99m observers choose 0 => .99m observers
    >>are rewarded && .01m observers see 1 => .01m observers choose 1 => .01m
    >>observers are punished)
    >>+
    >>p(.01)*u(.99m observers see 1 => .99m observers choose 1 => .99m observers
    >>are rewarded && .01m observers see 0 => .01m observers choose 0 => .01m
    >>observers are punished)
    >
    > Remember in my last post I wrote:
    >
    > Note that if he did apply Bayes's rule, then his expected utility would
    > instead become .99*U(.99m people rewarded) + .01*U(.01m people punished)
    > which would weight the reward too heavily. It doesn't matter in this case
    > but would matter in other situations.

    No, Wei Dai, this does *not* happen. I see the mistake you're making but
    I honestly don't know why you're making it, which makes it hard to
    correct. If you see 1 it affects your Bayesian probability that the real
    digit is 1 or 0; it does not affect your estimate of the global outcome of
    people following a Bayesian strategy. You seem to be assuming that
    Bayesian reasoners think that all observers see the same digit... or
    something. Actually, I don't know what you're assuming. I can't make the
    math come out your way, period.
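    To make this concrete, here is the arithmetic in Python; a sketch only,
    using the conventions already on the table (u(R) = 1, u(P) = -1, measure
    counted in millions):

        # 1m observers; .99m see the true digit X, .01m see it flipped.
        # Everyone follows the strategy "guess the digit you see".
        p = 0.99  # P(X = 0 | you saw 0), by Bayes

        def u(rewarded_m, punished_m):
            # Global utility of a world, in millions of observers.
            return rewarded_m * 1 + punished_m * (-1)

        # Under EITHER hypothesis about X, the strategy produces the same
        # world: .99m observers right and rewarded, .01m wrong and punished.
        eu = p * u(0.99, 0.01) + (1 - p) * u(0.99, 0.01)
        print(eu)  # 0.98, whichever digit you personally saw

    Seeing 0 changes your probability for which digit is real; it leaves the
    .99m/.01m split, and hence the estimate of the global outcome, untouched.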

    > You're basically making exactly this mistake,

    Please actually read my numbers.

    > and I'll describe one of the
    > "other situations" where it's easy to see that it is a mistake. Suppose
    > in the thought experiment, the reward for guessing 1 if X=1 is 1000 times
    > as high as the reward for guessing 0 if X=0 (and the punishments stay the
    > same). Wouldn't you agree that in this case, you should guess 1 even if
    > the printout says 0?

    Yes.

    > The way you compute global utility, however, your
    > utility is still higher if you guess 0 when you see 0.

    No.

    > Here're the calculations. Let R stand for "observers rewarded", E stand
    > for "observers rewarded extra", i.e. rewarded 1000 times more, and P for
    > "observers punished". And let u(E) = 1000*u(R) = -1000*u(P) = 1000.
    > According to your utility function, the expected utilities of choosing 0
    > or 1 are:
    >
    > EU(choose 0)
    > .99*u(.99m R & .01m P) + .01*u(.99m E & .01m P)
    > =
    > .99*.98 + .01*(990-.01)
    > =
    > 10.8701

    Um... I just don't see where you get these numbers. Are you talking about
    the expected total utility of choosing 0 every time, or the expected
    utility of choosing 0 given that you saw 0?

    Given that you see 0, the expected utility of all reasoners choosing 0 is:

    .99*u(1.0m R) + .01*u(1.0m P)
    =
    .99 - .01
    =
    .98

    Given that you see 0, the expected utility of all reasoners choosing 1 is:

    .99*u(1.0m P) + .01*u(1.0m E)
    =
    -.99 + 10
    =
    9.01
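    In Python, with u(E) = 1000, u(R) = 1, u(P) = -1 as you defined them
    (again a sketch, measure in millions):

        # Both expected utilities above, given that you see 0:
        p_x0 = 0.99  # P(X = 0 | you saw 0)
        eu_all_choose_0 = p_x0 * (1.0 * 1) + (1 - p_x0) * (1.0 * -1)
        eu_all_choose_1 = p_x0 * (1.0 * -1) + (1 - p_x0) * (1.0 * 1000)
        print(eu_all_choose_0, eu_all_choose_1)  # 0.98 versus 9.01

    Choosing 1 wins, exactly as I agreed above; the extra reward is
    discounted once, through the probability, and no more.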

    > If you do the same computations without applying Bayes's rule first,
    > you'll see that EU(choose 1) > EU(choose 0), but I won't go through the
    > details. What's going on here is that by applying Bayes's rule, you're
    > discounting the extra reward twice, once through the probability, and
    > another time through the measure of observers, so the extra reward is
    > discounted by a factor of 10000 instead of 100. That's why I choose to
    > make the extra reward worth 1000 times the normal reward.

    You know, there are days when you want to just give up and say: "Don't
    tell ME how Bayes' Theorem works, foolish mortal." Anyway, for some
    strange reason you appear to be applying Bayes adjustments to the measure
    of observers *and* to those observers' subjective probabilities.
    That's where the double discount is coming from in your strange
    calculations. You're applying numbers to the measure of observers that
    just don't go there.
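    Here is where the double discount sits in your own 10.8701 figure; this
    is my decomposition, with the same u values as before:

        # Your EU(choose 0) = 10.8701 versus EU(choose 1) = 9.01 in the
        # same framing. Switching only the seen-0 observers from 0 to 1:
        loss = 0.99 * 0.99 * (1 - (-1))     # X=0 branch: R becomes P
        gain = 0.01 * 0.01 * (1000 - (-1))  # X=1 branch: P becomes E
        print(loss - gain)  # 1.8601 = 10.8701 - 9.01
        # The 1000x reward enters at weight .01 * .01 = 1/10000: once
        # through the probability, once through the measure of observers.

    That 1/10000 is what you get by putting Bayes-adjusted numbers into the
    measure of observers as well as into the probabilities.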

    -- 
    Eliezer S. Yudkowsky                          http://singinst.org/
    Research Fellow, Singularity Institute for Artificial Intelligence
    

