Artificial Intelligence

Interpretation of Probability, Yet Again

In my previous post, I mentioned that my personalized "Bayesian frequentist" interpretation of probability had been struck down by the news that frequentism conflicts with the standard axioms of probability. The issue needs more consideration, so this post is devoted to it.

To that end, I've found a very interesting article on the subject.

The information I had found previously went something like this. The simplistic frequentist position is to say that the probability of an event is the fraction of times it occurs in some set of trials. This is problematic mainly because if we flip a coin five times and get heads 1/5 of the time and tails 4/5 of the time, we can attribute those fractions to chance. We don't automatically believe that the "real" probability of heads is 1/5.

Revision 1 of the frequentist view changes the definition to "limiting frequency": the probability is the fraction we would approach if we kept at it forever. The limiting frequency does not always exist, however. For example, consider a sequence containing As and Bs, starting with ABBAAAABBBBBBBB... Each run of identical letters is twice as long as the one before. The fraction of As will swing back and forth forever, never settling. So, by definition, probabilities apply only to sequences for which the limiting frequency exists.
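To see the oscillation concretely, here is a small Python sketch of the doubling-run sequence. The function name `block_end_fractions` is my own, and it only samples the running fraction of As at the end of each run, which is where the swings peak.

```python
from fractions import Fraction

def block_end_fractions(num_blocks):
    """Fraction of As seen so far, sampled at the end of each run.

    Runs have lengths 1, 2, 4, 8, ... and alternate A, B, A, B, ...
    """
    a_count = 0
    total = 0
    fractions_seen = []
    for k in range(num_blocks):
        run_length = 2 ** k
        if k % 2 == 0:          # even-numbered runs are runs of As
            a_count += run_length
        total += run_length
        fractions_seen.append(Fraction(a_count, total))
    return fractions_seen

for k, frac in enumerate(block_end_fractions(8)):
    print(k, frac, float(frac))
```

At the end of every B run the fraction is exactly 1/3, while at the end of the A runs it climbs toward 2/3, so the running fraction never converges to a single value.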

This is better, but it still isn't quite right. There are two problems. First, a coin could land on heads every even flip and tails every odd flip. The limiting fraction for both sides would be 1/2, but this is obviously not a random sequence. So the requirement that there is a limiting frequency is not enough to guarantee that the sequence is probabilistic. Second, it is possible to reorder an infinite sequence to make the limiting frequencies different. With the same alternating heads/tails sequence, we could reorder as follows: group heads together in pairs by pulling later heads forward, but keep tails isolated, giving HHTHHTHHT... Every flip of the original sequence still appears exactly once, yet the limiting frequency of heads is now 2/3. It seems odd that the probability would change when we're just counting in a different order.
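The reordering trick can be checked on finite prefixes. This is a rough Python sketch; the helper names (`alternating`, `reordered`, `heads_fraction`) are made up for illustration, and a finite prefix can only suggest the limit, not prove it.

```python
from fractions import Fraction
from itertools import count, islice

def alternating():
    """Tails on odd flips, heads on even flips: T, H, T, H, ..."""
    for i in count(1):
        yield "H" if i % 2 == 0 else "T"

def reordered():
    """The same flips rearranged: two heads, then one tail, repeated."""
    heads = (flip for flip in alternating() if flip == "H")
    tails = (flip for flip in alternating() if flip == "T")
    while True:
        yield next(heads)
        yield next(heads)
        yield next(tails)

def heads_fraction(flips, n):
    """Fraction of heads among the first n flips of an iterator."""
    prefix = list(islice(flips, n))
    return Fraction(prefix.count("H"), n)

print(heads_fraction(alternating(), 3000))
print(heads_fraction(reordered(), 3000))
```

Because both the heads and the tails are drawn from in their original order and neither supply ever runs out, every original flip eventually shows up in the reordered sequence; only the order of counting has changed.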

To fix this, von Mises defined something called a "collective". Before reading the paper, I knew that a collective had the additional property that any subsequence chosen without knowledge of where the heads were and where the tails were would have the same limiting frequencies. I had also read that the resulting theory was inconsistent with the standard axioms of probability. I wondered: if this definition is inconsistent with the standard axiomatization, what sort of alternative probability theory does it yield?
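As a small illustration of why that extra property matters, the sketch below applies a selection rule that looks only at positions, never at outcomes, to the strictly alternating heads/tails sequence. The names (`alternating_flips`, `select_by_position`) are my own, and this is only a finite-prefix check of one rule, not von Mises's formal definition.

```python
from fractions import Fraction

def alternating_flips(n):
    """First n flips: tails on odd positions, heads on even positions."""
    return ["H" if i % 2 == 0 else "T" for i in range(1, n + 1)]

def select_by_position(flips, keep):
    """Keep flips whose 1-based position satisfies keep(i).

    The rule sees only the position, never the outcome.
    """
    return [flip for i, flip in enumerate(flips, start=1) if keep(i)]

def heads_fraction(flips):
    return Fraction(flips.count("H"), len(flips))

flips = alternating_flips(1000)
every_second = select_by_position(flips, lambda i: i % 2 == 0)

print(heads_fraction(flips))
print(heads_fraction(every_second))
```

The full sequence has heads half the time, but the "every second flip" subsequence is all heads. A blind selection rule changes the frequency, so the alternating sequence does not qualify as a collective.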

What the paper immediately revealed was that the collective-based definition of probability was a competitor to the now-standard axiomatization, Kolmogorov's. It is not surprising, then, that the two are inconsistent with each other. Where the two differ, von Mises preferred the collective-based account. It is not hard to see why: his account is grounded in a mathematical concept of a random sequence, while Kolmogorov's axioms simply tell us how probabilities must be calculated, without any explanation of what a probability means.

The main difference is that the notion of a collective is weaker; it does not allow us to prove as much. For example, it is quite possible for a flipped coin to approach the correct frequency "from above": with only a finite number of exceptions, the ratio of heads at any finite time can be greater than 1/2, although it converges to 1/2 as we keep going. Perhaps a stronger notion of random sequence is needed, then. But I do think the von Mises approach of defining random sequences first, and then probabilities, is the better foundation. I wonder: is there any definition of randomness from which the Kolmogorov axioms automatically follow?

Tuesday, June 10, 2008
