## Saturday, January 19, 2008

I've (almost) decided what I mean by completeness. The basic statement is this:

"If a system is Gödel-incomplete, it's as complete as it can be."

So if a system is rich enough to run into Gödel's Incompleteness Theorem, it's good enough.

This is probably not exactly true. Gödel's Theorem can apply to different systems with somewhat varying levels of expressiveness, so long as whole numbers are representable. But it's at least a fair goal.

Here are some more precise thoughts that followed from the idea.

Because I'm focusing on learning/induction, statements are essentially probabilistic rather than definitely true or false. Bayesian networks, Markov networks, factor graphs, and other such things (which generally fall under the heading "graphical models") constitute a well-researched, popular, and practical field of machine learning. These probabilistic models can be thought of as the Boolean algebra of probability theory (aka the propositional logic of probability). For those without a logic background, this means (very approximately) that we can make statements relating the truth of one idea to the truth of another, and can deduce more relationships that follow from these, but cannot reason about the internal structure of ideas.
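To make the "propositional" flavor concrete, here is a minimal sketch (plain Python, no graphical-model library, with made-up numbers): a two-node Bayesian network relating the truth of "Rain" to the truth of "WetGrass". We can deduce a new relationship between the two propositions, but neither proposition has any internal structure we can reason about.

```python
# Two-node Bayesian network: Rain -> WetGrass (hypothetical numbers).
P_rain = 0.3                        # P(Rain)
P_wet_given = {True: 0.9,           # P(WetGrass | Rain)
               False: 0.2}          # P(WetGrass | not Rain)

def joint(rain, wet):
    """P(Rain = rain, WetGrass = wet), factored along the network."""
    p_r = P_rain if rain else 1 - P_rain
    p_w = P_wet_given[rain] if wet else 1 - P_wet_given[rain]
    return p_r * p_w

# Deduce a new relationship between the propositions via Bayes' rule:
# P(Rain | WetGrass), by enumerating the truth assignments.
p_rain_given_wet = joint(True, True) / (joint(True, True) + joint(False, True))
print(round(p_rain_given_wet, 3))   # 0.659
```

The whole model lives at the level of whole propositions: "Rain" and "WetGrass" are opaque names, exactly like variables in propositional logic.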

The next step up is called 1st-order predicate calculus. Predicate calculus allows us to make statements about entities, rather than merely making statements. Statements about entities have some internal structure, because we can refer to different entities. We can also refer to all entities at once (stating universal properties).

If propositional calculus corresponds to the graphical models, what is the probabilistic counterpart to predicate calculus? The answer is Probabilistic Relational Models. (They can talk about the relationships of entities to each other!) These seem very much like a good thing, something I'd like to get my hands on and mess with.
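One way to picture the relational step (a toy illustration, not any real PRM library; the predicates Smokes and Friends and the numbers are invented for the example): a relational model is a template over predicates, and grounding it over a set of entities stamps out one propositional random variable per instantiated atom, with parameters shared across entities.

```python
# Grounding a relational template over a domain of entities.
entities = ["alice", "bob", "carol"]

p_smokes = 0.2    # one shared parameter for the unary predicate Smokes(x)
p_friends = 0.1   # one shared parameter for the binary predicate Friends(x, y)

# Each instantiated atom becomes an ordinary propositional variable,
# so the grounded model is just a (large) graphical model.
ground_atoms = {}
for x in entities:
    ground_atoms[("Smokes", x)] = p_smokes
for x in entities:
    for y in entities:
        if x != y:
            ground_atoms[("Friends", x, y)] = p_friends

print(len(ground_atoms))   # 3 Smokes atoms + 6 Friends atoms = 9
```

The internal structure the propositional level lacked is now visible: atoms mention entities, and the same predicate applies uniformly across all of them.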

But there's a bit of a catch. The "1st order" part of "1st order predicate calculus" means that the things we can talk about are divided into two categories: basic entities, and predicates. We can make statements about entities using predicates, but we are not allowed to make statements about predicates themselves; we can only use them.

This setup, along with the terminology "1st order", "2nd order", "3rd..." "4th..." and so on, is part of type theory. Type theory is a way of avoiding paradox. The 1st-order predicates talk about the base entities. There are 2nd-order predicates that can be used to talk about 1st-order predicates. The 3rd order talks about the 2nd. And so on. Logic was originally supposed to have all of the orders, but nowadays it's mostly restricted to a low number; 1st-order logic is highly typical, 2nd-order is occasionally used when 1st isn't enough, and on occasion 3rd is needed.
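The order hierarchy can be sketched in a few lines of Python (a hypothetical toy encoding; the predicates and their extensions are made up): an nth-order predicate only ever takes (n-1)th-order things as arguments.

```python
# Base entities (order 0) are plain values.
socrates = "socrates"
plato = "plato"

# A 1st-order predicate talks about entities.
def mortal(entity):
    return entity in ("socrates", "plato")   # invented extension

# A 2nd-order predicate talks about 1st-order predicates.
def nonempty(predicate):
    return any(predicate(e) for e in (socrates, plato))

print(mortal(socrates))     # True
print(nonempty(mortal))     # True: some entity satisfies it

# Type theory forbids the mixed cases -- mortal(mortal) or
# nonempty(socrates) -- because each predicate may only apply
# one level down the hierarchy.
```

Python itself won't stop you from writing the forbidden applications, of course; the point is just to show what the stratification rules in and out.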

1st-order logic by itself is not strong enough to be subject to Gödel's theorem. The theorem was originally proved for a system with all the orders, Russell and Whitehead's Principia Mathematica (but it also applies to other systems). Thus, extending probabilistic relational models to higher-order logic is an obvious step toward fulfilling my requirement for expressiveness. However, we can do even better.

Type theory has an obvious oddness to it. If we apply it to natural language, we get the following ideas:

- We have nouns.
- Nouns have properties, such as "white" and "heavy", and also relationships to other nouns, such as "under" and "attacking".
- There are special words that talk about properties and relationships; for example, a property can be a physical property or one of various other types, it can be a color/weight/etc.; similarly for relationships.
- So on for higher types.
- However, we must separate certain things that intuitively would be the same. For example, an entity can be "important". This is a 1st-order property. But if a property is "important", this is a 2nd-order property. These two cannot be merged. We must also have a 3rd-order importance for 2nd-order properties, a 4th-order importance, and so on. The same thing happens to many other intuitive concepts: they must be split into an infinite number of concepts, one for each level.

Another problem that makes me reject type theory is that it cannot be described in its own terms. Any description of type theory must make simultaneous reference to all of the levels, which is not allowed.

Luckily, there is an alternative. Type theory can be seen as a very restricted form of set theory. 1st-order properties can be seen as sets of entities; 2nd-order properties as sets of 1st-order properties; and so on. Relations are a bit harder to translate, but can be seen as sets of ordered tuples. So it's simple to drop the restrictions imposed by type theory, giving a more intuitive higher-order logic.

So can probabilistic models be made in this more general domain? I'm tempted to answer "sure". I don't know a whole lot about probabilistic relational models, and I don't know a whole lot about set theory, but (perhaps in my ignorance) I see no inherent contradiction between the two. It seems simple enough to extend the one into the other. In fact, I'd say it captures many of the ideas I've talked about in the past. (Shall we say, captures those ideas worth capturing?)

One problem remains: what prior? (My impression is that probabilistic relational models sidestep this issue by talking about working algorithms more than theoretical ideals... perhaps one reason why they are a matter of great practical interest! But I tend to think the ideal is important as well.)