How You *Say* You Tell the Truth (a reply to Robin's paper)

From: Eliezer S. Yudkowsky
Date: Wed May 02 2001 - 18:00:18 MDT

Robin and Tyler wrote:
> How *You* Do Not Tell the Truth
> Yeah you. Not those other billions of poor souls who don't have the
> advantage of your intelligence and deep insight. Not in some abstract
> "I fix the imperfections I see but I must still be imperfect in ways
> I can't now see." And not just in your love life, or with your children,
> but in the heart of your professional life. You academics disagree with
> each other constantly, and such disagreement is just not rational for
> people concerned with knowing and telling the truth. Not only that,
> alerting you to this fact will not much change your behavior. So you
> either do not want to know the truth, do not want to tell the truth,
> or simply cannot be any other way.

The problem, Robin, is that, even under your theory, the observed data is
consistent with a world in which rational people do exist, but are so
sparsely distributed that they rarely run into each other. Not only does
this mean that the majority of observed cases will be persistent
disagreements; it further means that even two rational individuals, on
interacting for the first time, will assume that the other is irrational
by default. You say that *most* people cannot rationally believe that
they are more meta-rational than most of the population, but this does
not change the fact that - if, say, levels of meta-rationality are
distributed along a Gaussian curve - some people will *be* more
meta-rational than the rest of the population.

Actually, given the fact that highly intelligent groups such as scientific
conferences and Foresight Gatherings still exhibit lack of convergence,
the curve is either not Gaussian, or the curve for meta-rationality is not
associated with intelligence, or else the three-sigma level of a Foresight
Gathering is still not enough intelligence to eliminate self-deception.
I'd pick the third possibility. Foresight Gatherings do show *some*

But anyway, let me see if I can predict the general reaction to your
paper's abstract:

"I already knew that Other People are often highly silly in the way that
they argue and think. Since I expected this to be the case, your paper
does not provide new and unexpected information in the Bayesian sense, and
can therefore not cause a change to my underlying model, in which *I* am
one of the sparsely distributed rational people."

Of course, since the vast majority of Other People will read your paper's
title and *incorrectly* reply with the above counterargument, the above
must not be a sufficient counterargument. In fact, seeing someone emit
the above counterargument provides no Bayesian information at all about
her spiritual advancement. By an extension of this principle, the
fact of *observing yourself* to think up this particular counterargument
does not license you to conclude that you are rational.
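
The "no Bayesian information" claim above can be sketched in a few lines
of Python. This is only an illustration: the function name `update`, the
1-in-100 prior, and the 90% and 10% utterance rates are hypothetical
numbers chosen for the example, not figures from Robin's paper. The point
is that an utterance emitted equally often by rational and irrational
speakers leaves the listener's estimate exactly where it started.

```python
from fractions import Fraction


def update(prior, p_utter_if_rational, p_utter_if_not):
    """Return P(rational | utterance) via Bayes' theorem."""
    numerator = prior * p_utter_if_rational
    evidence = numerator + (1 - prior) * p_utter_if_not
    return numerator / evidence


# Hypothetical prior: 1 listener in 100 is actually rational.
prior = Fraction(1, 100)

# Everyone emits the counterargument at the same (hypothetical) rate,
# so the likelihood ratio is 1 and the posterior equals the prior.
uninformative = update(prior, Fraction(9, 10), Fraction(9, 10))

# For contrast: an utterance that irrational speakers rarely produce
# would shift the estimate (hypothetical rates again).
informative = update(prior, Fraction(9, 10), Fraction(1, 10))

print(uninformative == prior)   # the estimate has not moved
print(informative > prior)      # only a discriminating signal moves it
```

With these assumed rates the second case moves the estimate from 1/100
to 1/12, while the first leaves it untouched.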

Those individuals who are spiritually advanced enough to see this problem
will probably tack on some bit of information to the generic rejoinder
- e.g., "My experience in the Korean War enabled me to stop being
self-deceptive" - which is not shared by all parties. Since
this hypothesized underlying cause is unique to the individual in question
- since the thought, considered in isolation, is unique - it can be
internally processed as a sufficient cause to conclude rationality,
without implying that the Other People are licensed to conclude their own

Of course, all the Other People can do the same thing - tack on an extra
cause. (Although someone spiritually advanced enough to appreciate the
Bayesian need to tack on an additional cause *is* likely to be less
self-deceptive than the average.) An external observer, though, is likely
to abstract away the idea of an extra added underlying cause and see all
proffered excuses as identical, or belonging to the same class of
propositions: "I am *not* silly because of Cause X, which is unique to
me", and to regard the utterance of such a sentence as providing little or
no Bayesian information about the utterer.

In this way, we finally arrive at a situation in which some observers may
reason themselves into a corner from which *no* utterance allows you to
conclude that a party is not being silly, even if that party is really and
truly Not Silly. Even a genuinely non-silly, non-self-deceptive AI, who
truthfully says "I am not silly because I am an AI and I do not share the
evolved causes of silliness", will simply be heard by all observers as
saying "I am *not* silly because of Cause X, which is unique to me". The
observers will have to rely on their Bayesian priors to determine how
likely an AI is to be silly, or will need to observe the AI's behavior

I, of course, have been placed in the same situation by the title of Robin
Hanson's paper. I am in fact substantially less self-deceptive than even
Robin Hanson or Marc Stiegler, or so I evaluate. However, there
is no utterance on my part which can convince you of this, which is why
this particular reply to this particular paper is perhaps the only
possible context in which I could air my actual opinion of myself without
it being instantaneously dismissed. After all, if Robin Hanson knows that
I know as much as he does about Bayesian self-discounting of statements of
non-self-deception, and I state I'm non-self-deceptive *anyway*, even he,
an external observer, might consider that we appear to have the same

Obviously, this is a general problem for evolved entities trying to
convince each other of competence. An observer can only determine
competence by observing the first entity's actual work, and not by
reference to the first entity's statements of competence. There is no way
to speed that process up unless a third party intervenes to confirm the
first party's competence, and even then, the third party may be biased.
However, once the first party is known to be *mostly* rational or
meta-rational, further statements by the first party may be taken more at
face value.

On a first meeting, however:

Any direct statement about my own competence which I generate could have
been generated by a liar or self-deceiver, and so - no matter how hard I
try - I cannot directly provide Bayesian information about my own
competence. Indeed, statements about personal competence may be taken as
Bayesian information indicating *incompetence*. Suppose that the genius
level is taken as being 1,000,000:1. Suppose also that, ignoring
contextual information and intonation, the verbal utterances produced by
an actual genius saying "I am a genius" and an overconfident fool saying
"I am a genius" are identical. Even an actual genius, if she comes out
and says "I am a genius", will be plugged into a Bayesian prior that
estimates a million-to-one chance for genius and a ten-to-one chance for
self-overestimation, producing estimated odds of 100,000:1 that the
speaker is an overconfident fool. If we assume that most geniuses
realize this and either lie or avoid being forced into making unsupported
honest statements about their own intelligence, the odds are even worse.
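
The arithmetic in this paragraph can be checked with a short Python
sketch. It rests on two assumptions of interpretation: "million-to-one"
is read as one speaker in a million being a genius, and "ten-to-one" as
one speaker in ten being an overconfident fool. The function name and
the one-half figure in the second case are hypothetical, added only to
illustrate why the odds get worse when geniuses tend to stay silent.

```python
from fractions import Fraction


def odds_fool_to_genius(p_genius, p_fool,
                        p_say_if_genius=Fraction(1),
                        p_say_if_fool=Fraction(1)):
    """Odds (overconfident fool : genius) after hearing 'I am a genius'."""
    return (p_fool * p_say_if_fool) / (p_genius * p_say_if_genius)


p_genius = Fraction(1, 1_000_000)  # assumed reading of "million-to-one"
p_fool = Fraction(1, 10)           # assumed reading of "ten-to-one"

# Both kinds of speaker produce the identical utterance, so the
# likelihoods cancel and only the base rates matter.
odds = odds_fool_to_genius(p_genius, p_fool)
print(f"{int(odds):,}:1")  # 100,000:1

# Hypothetical: if only half of all geniuses would ever say it aloud,
# the utterance counts against the speaker even more strongly.
worse = odds_fool_to_genius(p_genius, p_fool,
                            p_say_if_genius=Fraction(1, 2))
print(f"{int(worse):,}:1")  # 200,000:1
```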

In other words, saying "I am a genius" proves that you are either
extremely smart or stupid, but the Bayesian priors indicate you are more
likely to be stupid. This is an emergent social pressure in genuinely
rational listeners which can force geniuses to either lie about their own
self-evaluation or avoid discussing it, depending on their commitment to


I <heart> the Bayesian Probability Theorem. More and more, I have come to
realize that the Bayesian Probability Theorem exceeds even Google as the
Source of All Truth. I also find that Robin Hanson's more recent papers
bear a remarkable resemblance to concepts that appear in "Friendly AI".
Since self-deception and stupidity generally allow for arbitrary factors
to creep in, the fact of convergence probably implies that one or both of
us is getting smarter and less self-deceptive.

-- -- -- -- --
Eliezer S. Yudkowsky
Research Fellow, Singularity Institute for Artificial Intelligence
