## Saturday, June 07, 2008

Modal Probability Stuff

My personalized "Bayesian frequentist" interpretation of probability was struck down by the news that frequentism conflicts with the standard axioms of probability. In the wake of this disaster, I am forced to develop some new ideas about how to interpret probabilities.

I've been thinking of using modal logic to support probability theory. I looked around the internet somewhat, and it seems like there is some action in the opposite direction (using probability theory as a foundation for modal logic), but not so much in the direction I want to go. (I assume of course that my direction isn't totally unique... it seems natural enough, so someone's probably tried it. But anyway it is in the minority.)

Modal logic is essentially the logic of possibility and necessity. Because these words are ambiguous, there are many possible modal logics. The common picture behind all of them is this: we have a number of worlds, some of which can "see" each other. Different facts may be true and false in each world. In a given world, a fact is "necessary" if it is true in all worlds that world can see. It is "possible" if it is true in at least one of those worlds. Modal logic lets us assign definite meaning to larger stacks such as "necessarily necessary" (it is true in all worlds that the worlds we can see are able to see), "possibly necessary" (in some world we can see, it is true in all worlds that world can see), and "possibly possibly possible" (in some world we can see, in some world it can see, in some world it can see, the fact holds).
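To make the definitions concrete, here is a minimal Python sketch of evaluating stacked operators. Everything in it is made up for illustration: the world names, the `SEES` relation, and the fact `rain` are assumptions, not part of any standard library or model.

```python
from typing import Callable, Dict, Set

World = str
Fact = Callable[[World], bool]

# Hypothetical toy model: the set of worlds each world can "see".
SEES: Dict[World, Set[World]] = {
    "w0": {"w1", "w2"},
    "w1": {"w3"},
    "w2": {"w3", "w4"},
    "w3": set(),
    "w4": set(),
}


def necessary(fact: Fact) -> Fact:
    """Holds at w when fact is true in every world w can see."""
    return lambda w: all(fact(v) for v in SEES[w])


def possible(fact: Fact) -> Fact:
    """Holds at w when fact is true in at least one world w can see."""
    return lambda w: any(fact(v) for v in SEES[w])


def rain(w: World) -> bool:
    """Illustrative fact: it rains in w1 and w3 only."""
    return w in {"w1", "w3"}


# Operators stack by plain function composition.
possibly_necessary = possible(necessary(rain))
necessarily_necessary = necessary(necessary(rain))

print(possible(rain)("w0"))         # True: w1 is visible and rainy
print(necessary(rain)("w0"))        # False: w2 is visible and dry
print(possibly_necessary("w0"))     # True: from w1, every visible world is rainy
print(necessarily_necessary("w0"))  # False: w2 can see the dry world w4
```

One thing the sketch makes visible: a world that sees nothing (like `w3`) makes every fact "necessary" and no fact "possible", simply because "all" over an empty set is true and "some" is false.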

Perhaps the idea of "worlds we can see" seems ambiguous. My choice of wording is essentially arbitrary; a common choice is "accessibility". As with necessity and possibility, the exact meaning depends on the modal logic we're using. For example, we might want to talk about immediate future necessity and possibility. The worlds we can see are the possible next moments. "Possibly possible" refers to possibility two moments ahead, "possibly possibly possible" is three moments ahead, and so on. Another modal logic corresponds to possibility and necessity in the entire future. We can "see" any possible world we can eventually reach. We might also say that we can see the present moment. (We could choose not to do this, but it is convenient.) In this modal logic, "possibly possible" amounts to the same thing as "possible"; "necessarily necessary" amounts to "necessary". However, "necessarily possible" does not collapse (since it means that a fact remains possible no matter which path we go down in the future), and neither does "possibly necessary" (which means we can take some path to make a fact necessary). Which strings of modal operators collapse in a given modal logic is one of the basic questions to ask.
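The collapse claims can be checked mechanically on a small example. The sketch below is only an illustration under assumed names: `NEXT` is a made-up "next moment" relation, `future` builds the "entire future, including the present" relation from it, and `same` compares two formulas across every world of this one toy model (so it shows that some strings fail to collapse here, not a general proof that others always do).

```python
from typing import Callable, Dict, Set

World = str
Fact = Callable[[World], bool]

# Hypothetical "possible next moments" relation.
NEXT: Dict[World, Set[World]] = {
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"c"},
    "d": {"d"},
    "e": {"f"},
    "f": {"e"},
}


def future(step: Dict[World, Set[World]]) -> Dict[World, Set[World]]:
    """Every world eventually reachable from each world, plus the world itself."""
    closure: Dict[World, Set[World]] = {}
    for start in step:
        seen = {start}  # we choose to see the present moment
        frontier = [start]
        while frontier:
            w = frontier.pop()
            for v in step[w] - seen:
                seen.add(v)
                frontier.append(v)
        closure[start] = seen
    return closure


FUTURE = future(NEXT)


def nec(fact: Fact) -> Fact:
    return lambda w: all(fact(v) for v in FUTURE[w])


def pos(fact: Fact) -> Fact:
    return lambda w: any(fact(v) for v in FUTURE[w])


def fact(w: World) -> bool:
    """Illustrative fact, true in worlds d and e only."""
    return w in {"d", "e"}


def same(p: Fact, q: Fact) -> bool:
    """Do p and q agree in every world of this toy model?"""
    return all(p(w) == q(w) for w in FUTURE)


print(same(pos(pos(fact)), pos(fact)))  # True: "possibly possible" collapses
print(same(nec(nec(fact)), nec(fact)))  # True: "necessarily necessary" collapses
print(same(nec(pos(fact)), pos(fact)))  # False: differs at world a
print(same(nec(pos(fact)), nec(fact)))  # False: differs at world b
print(same(pos(nec(fact)), nec(fact)))  # False: differs at world a
print(same(pos(nec(fact)), pos(fact)))  # False: differs at world e
```

At world `a` the fact is possible (go to `b`, then `d`) but not necessarily possible, because the path through `c` closes it off forever. At `e` it is possible but never becomes necessary, since `e` and `f` alternate.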

Other modal logics might represent possibility given current knowledge (so we can only access worlds not ruled out by what we already know), moral necessity and possibility ("must" vs "may"), and so on. Each has a different "seeability" relationship between worlds, which dictates a different set of logical rules governing necessity and possibility.

Just to give an example of using probability theory as a foundation for modal logic, we might equate "possible" with "has probability greater than 0," and "necessary" with "has probability 1". I doubt this simple approach is very interesting, but I know very little about this. I'm more interested in the opposite possibility.

The idea would be something like this: we attach a weight to each world, and at any particular world, we talk about the probability of an event in the worlds we can see. The probability of a fact is the sum of the weights of the seeable worlds in which it is true, divided by the sum of the weights of all seeable worlds.
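As a rough sketch of that definition (with invented world names and weights, and assuming finitely many worlds with non-negative integer weights), the calculation might look like this in Python:

```python
from fractions import Fraction
from typing import Dict, Set

World = str


def probability(
    world: World,
    fact: Set[World],
    sees: Dict[World, Set[World]],
    weight: Dict[World, int],
) -> Fraction:
    """Weight of the seeable worlds where the fact holds, over the weight of all seeable worlds."""
    visible = sees[world]
    total = sum(weight[v] for v in visible)
    if total == 0:
        # No context of possible worlds, so no probability.
        raise ValueError(f"{world!r} sees no worlds with positive weight")
    favourable = sum(weight[v] for v in visible if v in fact)
    return Fraction(favourable, total)


# Hypothetical model: from "now" we can see three outcomes of a coin flip.
SEES = {"now": {"h1", "h2", "t1"}}
WEIGHT = {"h1": 1, "h2": 1, "t1": 2}
HEADS = {"h1", "h2"}

print(probability("now", HEADS, SEES, WEIGHT))                 # 1/2
print(probability("now", {"h1", "h2", "t1"}, SEES, WEIGHT))    # 1
print(probability("now", set(), SEES, WEIGHT))                 # 0
```

With strictly positive weights, a fact gets probability 1 exactly when it is "necessary" and probability greater than 0 exactly when it is "possible", so the simple correspondence mentioned earlier falls out of this definition for free.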

The idea here is that probability does not make sense without a context of possible worlds. Probability within a single world doesn't make sense; each thing is either true or false (or perhaps undefined or meaningless). Furthermore, there is no privileged scope of possibilities. Take a coin flip: we might at one moment talk about probabilities assuming the context of physical possibility (in which case the flip is nearly deterministic, we just don't have enough information to calculate the result), and at another the context of epistemic possibility (in which case it is just fine to assign a probability to the flip without sufficient knowledge of the physical conditions).

One advantage of this approach is that we get an automatic meaning for the troublesome notion of a "probability of a probability". We need this concept, for example, if we want to use Bayesian learning to learn the bias of a possibly-biased coin. We assign some probability to each possible ratio of heads to tails, and then start flipping the coin and updating our belief based on the outcomes. This particular example is a bit complicated to treat with the framework I've outlined (which is perhaps a point against it): we need to mix modal logics, taking one probability to be based in worlds in which the coin has different properties, and the other to range only over changes in the surrounding conditions of the coin flip. The probability-of-a-probability, then, is the weighted fraction of seeable worlds in which, among their own seeable worlds, the weight of the worlds where the event holds divided by the total weight equals a particular ratio. In the case of the coin, the two instances of "seeable" have different definitions: we jump from the current world to worlds in which the coin has different properties, but from them we jump to worlds in which the flip has different starting conditions. (Note: we know neither the coin's actual properties nor the actual exact starting conditions, so for us it is not much of a "jump". But sometimes we reason with possible worlds which are quite distinctly not our own.)
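Here is a small Python sketch of the two-level construction for the coin. All the names and numbers are hypothetical: `SEES_COIN` jumps from the current world to worlds where the coin has different properties, `SEES_FLIP` jumps from each of those to worlds with different starting conditions, and the weights are picked just to give round ratios.

```python
from fractions import Fraction
from typing import Callable, Dict, Set

World = str


def prob(
    world: World,
    holds: Callable[[World], bool],
    sees: Dict[World, Set[World]],
    weight: Dict[World, int],
) -> Fraction:
    """Weighted fraction of the worlds seen from `world` in which `holds` is true.

    Raises ZeroDivisionError if the seeable worlds have zero total weight.
    """
    visible = sees[world]
    total = sum(weight[v] for v in visible)
    good = sum(weight[v] for v in visible if holds(v))
    return Fraction(good, total)


# First notion of "seeable": the coin could have different properties.
SEES_COIN = {"here": {"fair", "loaded"}}

# Second notion of "seeable": the flip could start in different conditions.
SEES_FLIP = {
    "fair": {"fair/H", "fair/T"},
    "loaded": {"loaded/H1", "loaded/H2", "loaded/H3", "loaded/T"},
}

# Assumed weights: the fair-coin world counts three times as much as the loaded one.
WEIGHT = {
    "fair": 3, "loaded": 1,
    "fair/H": 1, "fair/T": 1,
    "loaded/H1": 1, "loaded/H2": 1, "loaded/H3": 1, "loaded/T": 1,
}


def is_heads(w: World) -> bool:
    return "/H" in w


def p_heads(coin_world: World) -> Fraction:
    """Inner probability: chance of heads as seen from one coin world."""
    return prob(coin_world, is_heads, SEES_FLIP, WEIGHT)


def prob_of_prob(ratio: Fraction) -> Fraction:
    """Outer probability that the inner probability of heads equals `ratio`."""
    return prob("here", lambda w: p_heads(w) == ratio, SEES_COIN, WEIGHT)


print(prob_of_prob(Fraction(1, 2)))  # 3/4
print(prob_of_prob(Fraction(3, 4)))  # 1/4
print(prob_of_prob(Fraction(1, 3)))  # 0
```

The outer call treats "the probability of heads is 1/2" as just another fact that is true or false in each coin world, which is what lets the same definition be reused at both levels. Bayesian updating on observed flips is not shown; that would amount to changing which coin worlds stay seeable or how they are weighted.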

Perhaps this is not especially intuitive. One problem I can think of is the lack of justification for picking particular ranges of worlds to reason in. But it is an interesting idea.