I think much of the supposed controversy over "Bayesian" approaches comes from the rhetoric of the Bayesians themselves, and could be smoothed by a better understanding of what they are really saying. First, Bayes' Theorem itself is a simple direct consequence of the normal "frequentist" view of probability, not a new definition at all.
Under either view, it makes no sense to talk about the "probability" of
an event except with respect to some set of possible events to which
the event in question belongs. Let's start with a simple example:
A bag contains 10 balls; 8 red, 2 white. We are to randomly select
a ball from the bag a number of times and bet on the outcome. The
universe of events we are concerned with is the set [draw ball 1,
draw ball 2, ... draw ball 10]. The subset we wish to know the
probability of is [draw ball 1, draw ball 2] (where balls 1 and 2
are the white ones). The proportion of the event in question
(drawing a white ball) to the universe of possible events is 1 in 5.
What this means is that 4-to-1 are the fair odds: we could rationally
bet $4 on drawing a red ball against an opponent betting $1 on a
white one.
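The arithmetic behind that bet can be checked in a few lines. This is only a sketch of the example as stated; the variable names and the "expected gain" bookkeeping are my own framing, not anything the argument depends on.

```python
from fractions import Fraction

# The bag from the example: 2 white balls (balls 1 and 2) and 8 red ones.
bag = ["white"] * 2 + ["red"] * 8

# Probability = size of the event subset / size of the universe of events.
p_white = Fraction(bag.count("white"), len(bag))
p_red = 1 - p_white

# Stakes from the text: we put up $4 on red, the opponent $1 on white.
our_stake, their_stake = 4, 1

# Expected gain per draw: we win their stake on red, lose ours on white.
expected_gain = p_red * their_stake - p_white * our_stake

print(p_white)        # 1/5
print(expected_gain)  # 0
```

An expected gain of zero is what makes these the break-even stakes: any better terms favor us, any worse favor the opponent.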
Now let's go up a level. Let's suppose we don't know the actual
distribution, but we've drawn a ball from the bag 20 times and
recorded the outcomes. We are asked to bet on the _distributions_
now given the outcomes. This is an exercise in algebra: we have
a formula for a quantity "P" that predicts outcomes given a fixed
distribution; how do we finagle that formula into predicting
distributions given outcomes? Enter Bayes' Theorem. We just treat
distributions as events from a larger set of possible distributions
(we must still specify this superset in order for the formula to
work), and apply normal frequentist conditional-probability math to
derive a number "B" which we will treat as the probability of a
distribution. Since "B" is a probability calculated the same way
as "P", we use it much the same way, in that we can make bets on it
and expect to do better than anyone using some other method.
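Written out, the step looks like this. The notation is mine, not standard to this discussion: D_i stands for one candidate distribution from the superset, O for the recorded outcomes, and P(D_i) for the weight we gave D_i within the superset before seeing any draws.

```latex
B(D_i) \;=\; P(D_i \mid O)
       \;=\; \frac{P(O \mid D_i)\, P(D_i)}{\sum_j P(O \mid D_j)\, P(D_j)}
```

This is nothing more than the definition of conditional probability applied twice, P(D_i and O) = P(O | D_i) P(D_i) = P(D_i | O) P(O), with P(O) expanded as a sum over the whole superset.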
To calculate "P", we needed to know the size of the event set and the size of the event in question. Similarly, to calculate "B" we need to know the size of the set of possible distributions and the relative size of each distribution in question. Same idea, same formulas, different point of view and interpretation.
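Here is a rough sketch of computing "B" for the bag. It rests on assumptions that are mine rather than part of the example: the bag still holds 10 balls, draws are made with replacement, every composition from 0 to 10 white balls starts out equally weighted, and the record of 20 draws happened to contain 4 white ones. The function name is made up for illustration.

```python
from fractions import Fraction
from math import comb

def posterior_over_compositions(n_balls, draws, white_seen, prior=None):
    """Return B[k] = P(bag holds k white balls | recorded draws), k = 0..n_balls.

    Assumes draws with replacement, and a uniform prior over compositions
    unless another list of weights is supplied.
    """
    if prior is None:
        prior = [Fraction(1, n_balls + 1)] * (n_balls + 1)
    weighted = []
    for k in range(n_balls + 1):
        p = Fraction(k, n_balls)
        # "P": probability of the recorded outcomes under a fixed composition k.
        outcome_prob = comb(draws, white_seen) * p**white_seen * (1 - p)**(draws - white_seen)
        # Weight by the relative size of this composition in the superset.
        weighted.append(outcome_prob * prior[k])
    total = sum(weighted)
    # Normalize so the values over all compositions sum to 1.
    return [w / total for w in weighted]

# Hypothetical record: 20 draws, 4 of them white.
B = posterior_over_compositions(n_balls=10, draws=20, white_seen=4)
best = max(range(len(B)), key=lambda k: B[k])
print(best)  # 2
```

Under those assumptions the composition with 2 white balls gets the largest "B", and the all-red and all-white bags get exactly zero, since neither could have produced a mixed record.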
Now it occurs to the Bayesians that there are lots of times in life where we see a number of outcomes but do not have any access to the underlying distributions from which they come. A Bayesian simply assumes that this isn't a problem and uses the formula to derive the likely distributions. Going even further, we need not assume that such a distribution exists at all, but can use "B" as a measure of our knowledge. What's interesting here are just the problems to which this math is applied and the interpretation of the results. It's really just a semantic/philosophical divide like the various interpretations of quantum mechanics, not any real disagreement on the underlying reality of anything.
-- Lee Daniel Crocker <lee@piclab.com> <http://www.piclab.com/lcrocker.html> "All inventions or works of authorship original to me, herein and past, are placed irrevocably in the public domain, and may be used or modified for any purpose, without permission, attribution, or notification."--LDC