Tuesday, July 29, 2008

What's Up?

Last time, I gave a fairly simple logic that satisfies me in its ability to refer to mathematical concepts that fall in the realm mathematicians call the Arithmetical Hierarchy. Fun. But perhaps you think something is fishy: it isn't too hard to design a deduction system that uses this logic I've described, but it doesn't seem like it would be able to do anything that a normal logic engine can't do. What's up?

Perhaps you're thinking that I'm being a bit philosophical, worrying about something that doesn't actually matter to people trying to implement an AI system. The issue sounds pretty abstract. "Is the logic grounded? Does it Refer?"

Well, anyway, here's why it actually matters.

The logic doesn't do much by itself. But it can be conveniently augmented by a nonmonotonic reasoning system like the one I talked about previously. What's nice is that the nonmonotonic theory is automatically specified by the structure of the first-order-logic-plus-lambda-calculus theory, which takes care of some issues that were troubling me. (If I wanted to make that nonmonotonic logic look like a normal nonmonotonic logic, I would need to add some strange control operators specifying a nesting of nonmonotonicness corresponding to the arithmetical levels. Maybe that is OK. But it gets worse if you ask what happens when someone comes along and arranges those nesting operators in a way that creates recursion.)

A computable predicate is a sentence built from unbound terms, halting lambda terms, and predicates that can be decided immediately (such as equality for numbers given in a standardized notation), along with boolean combinations of such predicates. When we add quantifiers to these statements, we can easily determine where they fall on the arithmetical hierarchy (although we cannot guarantee that this is the lowest level, since there could be a simpler way of writing the same statement that falls on a lower level). This allows us to use the procedure previously defined to nonmonotonically reason about all such statements.
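
To make this concrete, here is a minimal sketch (my own illustration, not part of the original scheme) of how a system might assign a hierarchy level to a statement in prenex form, given its quantifier prefix. As noted above, this gives an upper bound, not necessarily the lowest level:

    def classify(prefix):
        """prefix: a sequence like ['A', 'A', 'E'] standing for
        'forall x forall y exists z (computable matrix)'.
        Returns a (possibly non-optimal) hierarchy level."""
        # Adjacent quantifiers of the same kind collapse into one block:
        # 'forall x forall y' counts as a single universal block.
        blocks = []
        for q in prefix:
            if not blocks or blocks[-1] != q:
                blocks.append(q)
        if not blocks:
            return 'Delta_0'  # quantifier-free: a computable predicate
        name = 'Sigma' if blocks[0] == 'E' else 'Pi'
        return name + '_' + str(len(blocks))

    print(classify(['A', 'E']))   # Pi_2
    print(classify(['E', 'E']))   # Sigma_1: two existentials, one block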

Another way that this union of lambda calculus and first-order logic could matter would be in inductive systems. If a system learns models based on the combined logic, it might be able to usefully differentiate between models that would be equivalent otherwise. As I discussed in the last post, the difference is that the semantics of lambda calculus assumes a closed world in which no new rules or propositions will be added, while the first-order world is fundamentally open. A theory that uses lambdas is therefore strictly more specific than the pseudo-equivalent first-order theory that is able to make all the same positive deductions; the first-order theory fails to rule out some things that the theory with lambda can. How can we take advantage of this?

Well: via probability theory, a more specific theory should make the actual data more likely, by ruling out a greater number of alternative possibilities. To get this to work for us, we've got to actually enforce the open-world-ness of pure first-order logic when we assign probabilities. This is not typically something people worry much about. It is easy to see this as a bit philosophical; I might be fixing a problem that people do not have, because they go ahead and make closed-world assumptions where they need to in practice. But I am describing a formal way of looking at those closed-world assumptions, and that seems useful.
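
As a toy illustration (my own, with made-up numbers): model a theory as the set of observations it permits, with a uniform distribution over that set. The more specific theory, when it fits the data at all, makes the data more likely:

    open_theory = set(range(10))    # open-world-ish: rules little out
    closed_theory = {0, 1, 2, 3}    # strictly more specific, like the lambda version

    def likelihood(theory, data):
        # Uniform distribution over the outcomes the theory permits.
        if any(d not in theory for d in data):
            return 0.0
        return (1.0 / len(theory)) ** len(data)

    data = [1, 3, 2, 0]
    print(likelihood(open_theory, data))    # ~0.0001
    print(likelihood(closed_theory, data))  # ~0.0039: the specific theory wins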

Monday, July 21, 2008

New Grounding Results

In this post I detailed a different (and hopefully better) grounding requirement for an AI logic. I've now come up with a logic that satisfies the requirement, based on the speculations in that post.

This logic is nicely simple. All I do is add lambda calculus to first-order logic. The lambda calculus is used as a new way of notating terms, acting the way function symbols normally do. The difference is that lambda calculus is able to uniquely define computations, while normal function symbols cannot.
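
For instance (a made-up pair of sentences, purely to illustrate the notation): a theory containing Even(double(2)) constrains the ordinary function symbol 'double' only as far as its axioms reach, so nonstandard interpretations of 'double' remain possible. The corresponding sentence Even((λn. n + n) 2) leaves no such freedom; the lambda term denotes one specific computation, which reduces to 4 no matter what else the theory says.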

The reason first-order logic cannot uniquely define computations is that it cannot rule out nonstandard interpretations. Let's say we try to define some method of computation (such as lambda calculus, Turing machines, etc.). Determining the next step in the computation is always very simple; it wouldn't be a very useful definition of computation otherwise. And since a computation is nothing but a series of these simple steps, it might seem like all we need to do is define what happens in a single step. Once we do that, the logic knows what to do next at each stage, and can carry any particular computation from start to finish without error.
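
A sketch of the "simple step" point (my own example, not from the post): take the machine state to be a single number and let one step be the Collatz rule. Each step is trivial to specify; the computation as a whole is just the iteration of that step:

    def step(n):
        # One step of the computation: easy to define, easy to axiomatize.
        return n // 2 if n % 2 == 0 else 3 * n + 1

    def run(n):
        trace = [n]
        while n != 1:   # whether this terminates for every n is the hard question
            n = step(n)
            trace.append(n)
        return trace

    print(run(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]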

The problem here is that we are only specifying what the computation does do, not what it doesn't. In other words, the first-order logic will carry out the computation correctly, but it will not "know" that the steps it is taking are the only steps the computation takes, or that the result it gets is the only result of that computation. This may sound inane, but first-order logic makes no assumptions. It only knows what we tell it.

The problem is, with first-order logic alone, there is no way to say what we need to say here. If there were an "and that's all" operator, we could list the positive facts about computations, and say "that's all". The logic would then know that any computational step not derivable from those basic steps never happens. But there is no such operator. I can tell it various things about what the computation does not do, but I will never finish my list.
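
Here is the open-world problem in miniature (my own illustration). We can list positive step facts and chain through them, but nothing in the list says they are the only steps:

    # Positive facts only: 'the computation at 6 steps to 3', and so on.
    step_facts = {6: 3, 3: 10, 10: 5}

    def reaches(a, b):
        # Forward chaining through the positive facts.
        while a in step_facts:
            a = step_facts[a]
            if a == b:
                return True
        return False

    print(reaches(6, 5))    # True: follows from the positive facts
    print(reaches(6, 99))   # False here -- but first-order logic cannot
    # conclude 'the computation never reaches 99' from the facts alone: a
    # nonstandard model may contain extra steps we never listed. Only a
    # closed-world reading ("and that's all") licenses the negation.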

Another way of putting this is that we can positively characterize computation, but we cannot negatively characterize it. Why not? Well, first-order logic is complete; this means that the rules of deduction can find all implications of the statements we make. If we were able to state all the negative facts, the logic would be able to find all of their consequences. But this would go against fundamental facts of computation. In particular, we know there is no way of deciding which computations eventually halt. If computations could be totally negatively characterized in a complete logic like first-order logic, we could decide halting: run the computation to detect halting, while searching in parallel for a proof of "this computation never halts" from the negative facts. One of the two searches would always succeed. So, a total negative characterization must be impossible.

In a sense, the reason we can't characterize computations negatively is that first-order logic has an open-world assumption: if the deduction rules do not prove a statement true or false, it could be either. This is as opposed to a closed-world assumption, under which any statement not proven true must be false.

So how does inserting lambda calculus into the mess help?

Lambda calculus is sufficiently closed to solve the problem. While a first-order theory can be added to without changing the meaning of what's already been stated, a program (stated in lambda calculus) cannot be altered freely. If we modify it, we simply have a different program on our hands. This means we can completely characterize many things that first-order logic alone cannot.
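
To see the closedness concretely, here is a minimal beta-reduction sketch in Python (my own illustration; capture-avoidance is glossed over). The rules determine reduction completely; changing the term just gives a different program:

    # Terms: ('var', name) | ('lam', name, body) | ('app', fun, arg)

    def substitute(term, name, value):
        kind = term[0]
        if kind == 'var':
            return value if term[1] == name else term
        if kind == 'lam':
            # Assumes no variable capture, for brevity.
            if term[1] == name:
                return term
            return ('lam', term[1], substitute(term[2], name, value))
        return ('app', substitute(term[1], name, value),
                       substitute(term[2], name, value))

    def reduce_step(term):
        # One beta-reduction step at the top; None if no redex there.
        if term[0] == 'app' and term[1][0] == 'lam':
            return substitute(term[1][2], term[1][1], term[2])
        return None

    identity = ('lam', 'x', ('var', 'x'))
    print(reduce_step(('app', identity, ('var', 'y'))))  # ('var', 'y')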

Fleshing out some more details of the logic:

-The deduction system should be a mix of a regular first-order deduction system, and a set of rules for the normal manipulation of the lambda calculus. This is as opposed to axioms describing how to manipulate the lambda calculus. Axioms would be in first-order logic and would therefore do us no good in terms of my grounding requirement, because they would not negatively characterize. The rules negatively characterize because they cannot be added to. (This is a petty distinction in a way; in practice, it wouldn't matter much, since nobody would add more facts to a running system concerning the lambda-calculus manipulation. But they could, and the system would accept these facts, whereas they can't add new rules without changing the system.)

-There is a question about how much to tell the first-order logic about its embedded lambda calculus. We could tell it nothing. But it seems practical for the system to have explicit first-order knowledge about the lambda terms, so that it could deduce some facts about their manipulation (perhaps deducing facts about the results of certain calculations more quickly than could be explicitly checked by running all of those calculations). However, I have already said that we could keep telling the system more and more facts about computation, and yet never finish. So where do we stop? I have two ideas here. First, it seems reasonable to tell the system only the positive facts about the lambda calculus, since the negative ones are the ones that cause trouble (a sketch follows below). A few basic negative facts could be added at whim, if desired. This is very flexible. Second, we do not need to limit ourselves to first-order logic. We have lambda calculus at our disposal, and we can use it! In addition to placing the rules of manipulation outside of the system, as deduction rules, we could encode them within the system, using lambda calculus. This would allow the system to "know the deduction rules" despite it being impossible to completely characterize them in first-order logic.
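
A sketch of the first idea (my own illustration; names like 'Evaluates' are hypothetical): run each lambda term under a step budget and assert a first-order fact only when a normal form is reached. Non-halting terms simply never contribute a fact, so nothing negative is ever asserted:

    def positive_facts(terms, one_step, budget=100):
        """one_step(t) returns the next term, or None at normal form."""
        facts = []
        for t in terms:
            current = t
            for _ in range(budget):
                nxt = one_step(current)
                if nxt is None:                       # normal form reached
                    facts.append(('Evaluates', t, current))
                    break
                current = nxt
            # no else-clause: terms not yet halted yield no fact at all
        return facts

    # Demo with a toy reducer on numbers: 'count down to 0'.
    toy_step = lambda n: n - 1 if n > 0 else None
    print(positive_facts([3, 0], toy_step))
    # [('Evaluates', 3, 0), ('Evaluates', 0, 0)]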

Monday, July 14, 2008

New Grounding Criteria

In this post I talk about the need for a logic's semantics to be in some sense determined by its deduction rules. If this isn't the case, then the logic is not suitable for artificial intelligence: the system's behavior will not reflect the intended meaning of the symbols. In other words, the logic's meaning needs to be "grounded" by its use.

Unfortunately, although I presented some ideas (and added more in this post), I didn't come up with a real definition of the necessary relationship between syntax and semantics.

The general idea that I had was to ground a statement by specifying a sensible procedure for determining the truth of the statement. Due to Gödel's theorem, it is impossible for such a procedure to be always correct, so I set about attempting to define approximately correct methods.

But, now, I am questioning that approach. I've decided on a different (still somewhat ambiguous) grounding requirement. Rather than requiring that the system know what to do to get a statement, I could require that it know what to do with it once it has it. In other words, a statement that is not meaningful on its own (in terms of manipulating the world) may be meaningful in terms of other statements (it helps manipulate meaningful statements).

Again, this covers the arithmetical hierarchy (see previous). We start with the computable predicates, which are well-grounded. If I have a grounded predicate, I can make a new grounded predicate by adding a universal quantifier over one of the variables. If I knew some instance of the new predicate to be true, I would be able to conclude that an infinite number of instances of the old were true (namely, all instances obtainable by putting some number in the spot currently occupied by the universally quantified variable). Existential quantifications have a less direct grounding: if we knew an existential statement to be true, we could conclude the falsehood of a universal statement concerning the opposite predicate (meaning "there exists X for which P(X)" is grounded in "it is not the case that for all X, not P(X)").
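
In symbols (my own rendering of the idea):

    $\forall x\, P(x) \vdash P(0),\ P(1),\ P(2),\ \ldots$
    $\exists x\, P(x) \equiv \neg \forall x\, \neg P(x)$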

Because we know what would be true if some statement were true, we can attempt to use the scientific method to test the truth of statements. This is essentially what I was doing with my previous attempts. However, as I mentioned then, such methods will not even eventually converge to the right answer (for the tough cases); they will keep flipping back and forth, and even the frequencies of such flipping are meaningless (otherwise we could use them to decide). Nonetheless, human nature makes us want to try, I think...
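
For the Pi_1 (universal) case, the testing procedure at least has a nice one-sided property: a counterexample settles things for good, even though no run of successes ever confirms the statement. Higher in the hierarchy, even this one-sidedness is lost. A sketch (my own illustration):

    def test_universal(predicate, instances):
        # A counterexample refutes the universal statement permanently...
        for x in instances:
            if not predicate(x):
                return ('refuted', x)
        # ...but no finite number of successes ever confirms it.
        return ('unrefuted so far', None)

    print(test_universal(lambda n: n + 1 > n, range(1000)))  # unrefuted so far
    print(test_universal(lambda n: n < 500, range(1000)))    # refuted at 500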

Obviously I have a bit of work to do. For example... while this provides some grounding for arithmetical statements, does it provide enough to really fix their desired meaning? (This is particularly unclear for the existential statements.) Also, what else can be characterized like this? Is this method only useful for the arithmetical truths?