Friday, March 28, 2008

Godel incompleteness is not a sufficient measure of completeness.

To begin this argument, I need to talk about Skolem's Paradox, an intriguing result concerning the foundations of mathematics.

Skolem's Paradox shows that the logical constructions of set theory will always be insufficient, in a way slightly worse than Godel's Incompleteness. The problem is that any logical characterization of set theory fails to characterize sets completely. The same rule applies to many entities. For example, any attempt at a logical description of the notion "Turing Machine" will similarly fail. Also "number", "connectedness", "higher-order predicate", and "logic" itself, to name a few. All of these cannot be described logically.

The basis for this claim lies in the fact that these entities cannot be described in first-order logic, which (if you haven't heard of it) is a restricted logic that doesn't fall prey to Godel's theorem (because it is not sufficiently powerful to represent its own rules, and so doesn't contain self-reference). 1st-order logic is often taught as if it were Logic, period. In fact, some have argued that this is the truth: after all, what is logic if there isn't a complete set of deduction rules? However, 1st-order logic is too weak to describe the majority of mathematical entities.

Nonetheless, mathematicians try. What Skolem did was essentially call them on the error, using a nice theorem he and Lowenheim had developed. In fact, the same error is committed *whenever* a logic attempts to be anything more than 1st-order: all you're really doing is trying to make 1st-order logic talk about higher-order entities. (In a sense, this is *why* Godel's Theorem is true; that is, it's *why* the deduction rules are incomplete: they can't ever really do anything more than 1st-order logic.)

But this leaves a big gaping hole. How DO we reason about mathematical objects? If logical constructions fail, what are we doing? Math SEEMS logical. The easy way out is to act like humans have a magical ability to talk about these things, while any formal construction is doomed to fail. Luckily, this is not the only way out.

But the other way out is *much* more complicated! Sorry.

The other way begins with the notion of "limit computation". This is an idea in theoretical computer science, which (sort of) allows a computer to calculate things that it cannot normally calculate. The prime example of something a computer cannot calculate is the halting problem.

The halting problem is very much like Godel's theorem. The two seem to me to be the same theorem developed in different settings. Godel's theorem says that there will be truths about formal logic that formal logic cannot deduce, while the halting problem shows that there will be facts about computer programs that computer programs will be unable to calculate.

Perhaps it seems odd that we can have a mathematically well-defined value but be unable to compute it, just as it seems odd that we have well-defined mathematical entities with no proper logical characterization. The intriguing thing, though, is that the halting problem is "limit-computable".

A limit-computer is a computer that is allowed to revise its output. The program never halts; instead, the output "converges" to the correct answer.

This convergence occurs in a finite amount of time, but the problem is that we don't know for sure *when* it has occurred; so to get guaranteed results, we've got to wait forever. But giving up the sureness of finite-time computation allows us to construct programs that do what others cannot.
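Here is a rough sketch of the idea in Python, under some loud assumptions: a "program" is modeled as a generator that yields once per step, the names `limit_halts`, `halts_after_three`, and `loops_forever` are made up for illustration, and `max_steps` exists only so the demo stops (a real limit-computer would keep going forever). The guess starts at "does not halt" and is revised at most once.

```python
from collections.abc import Callable, Iterator


def limit_halts(program: Callable[[], Iterator[None]], max_steps: int) -> Iterator[bool]:
    """Yield a revisable guess, one per step, about whether `program` halts.

    Assumption: `program` returns a generator that yields once per
    computation step and finishes when the program halts.
    `max_steps` is a demo-only cutoff; the idealized version never stops.
    """
    run = program()
    halted = False
    for _ in range(max_steps):
        if not halted:
            try:
                next(run)          # simulate one more step
            except StopIteration:
                halted = True      # revise the guess: it did halt
        yield halted               # current best guess


def halts_after_three() -> Iterator[None]:
    # Hypothetical program that takes three steps and then halts.
    for _ in range(3):
        yield


def loops_forever() -> Iterator[None]:
    # Hypothetical program that never halts.
    while True:
        yield


if __name__ == "__main__":
    print(list(limit_halts(halts_after_three, 6)))
    print(list(limit_halts(loops_forever, 6)))
```

For the halting program, the guesses flip from False to True and stay there. For the looping program, the guess is already correct (False) from the start, but no finite prefix of the output tells us that it will never be revised.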

So, translate this back to logic. We can increase the number of entities we can reason about by allowing facts to be revised: we may make a wrong inference, so long as any wrong inferences will be corrected along the way. This is called a trial-and-error predicate.

But there's more! :)

Just as no halting program can solve the halting problem, no converging program can solve the "convergence problem". Just as we can ask if a normal program halts, we can ask if a limit-program converges to an answer or keeps revising its answer forever.

But just as we can solve the halting problem by resorting to limit-computers, we can solve the convergence problem by resorting to an augmented limit-computer that has access to another computer (meaning it can run as many limit-computations as it likes). Equivalently, we can give the computer as many infinities as it needs, rather than just one, to converge (which again amounts to it being able to run as many limit computations as it likes). In fact, we can give a computer larger and larger infinities of computation time, resulting in the ability to compute more and more things.

The question arises: if we give the computer "as large an infinity as it likes", can we compute any mathematically well-defined value? I do not know the answer. But it *sounds* reasonable...

If we're willing to grant this wild assumption, then we can again transfer this development to the logical domain. Essentially, we allow trial-and-error predicates to be defined in terms of other trial-and-error predicates. This gives up the guarantee that all fallible inferences will be eventually corrected (unless by "eventually" we mean "after arbitrarily large infinities of time have passed").

Why is all this in a blog about AI?

Well, if I'm right, then the "magical" quality that humans possess and formal systems do not is the ability to make fallible inferences. Any AI based in infallible logic would be unable to understand mathematics, but an AI that included a good fallible reasoning system would be able to. Perhaps this comes automatically with any good learning algorithm, but perhaps not; perhaps only learning systems with very specific properties are sufficient. This needs further research! One avenue is "nonmonotonic logic", which is very similar to the logic I'm proposing.

However, standard nonmonotonic logic doesn't have quite as much machinery as I want... I think it is equal to normal limit-computation, rather than the forms of computation involving larger infinities.

But that's enough speculation for today.

Friday, March 07, 2008

I recently read an article called "Complex Systems, Artificial Intelligence and Theoretical Psychology" by Richard Loosemore. The argument it makes goes something like this:

1. A "complex system" (referring to complex systems science) is one that displays behavior that cannot be predicted analytically using the system's defining rules. (This is called, in the paper, a global-local disconnect: the local rules that play the role of the "physical laws" of the system are disconnected analytically from the global behavior displayed by the system.)

2. The mind seems to be a complex system, and intelligence seems to be a global phenomenon that is somewhat disconnected from the local processes that create it. (Richard Loosemore argues that no real proof of this can be given, even in principle; such proofs are blocked by the global-local disconnect. However, he thinks it is the case, partly because no analytical solution for the mind's behavior has been found so far.)

3. The mind therefore has global-local disconnect. Richard Loosemore argues that, therefore, artificial intelligence cannot be achieved by a logical approach that attempts to derive the local rules from the list of desired global properties. Instead, he proposes an experimental approach to artificial intelligence: researchers should produce and test a large number of systems based on intuitions about what will produce results, rather than devoting years to the development of single systems based on mathematical proofs of what will produce results.

I agree with some points and disagree with others, so I'll try to go through the argument approximately in order.

First, I do take the idea of a complex system seriously. Perhaps the idea that the global behavior of some systems cannot be mathematically predicted seems a bit surprising. It IS surprising. But it seems to be true.

My acceptance of this idea is due in part to an eye-opening book I recently read, called "Meta Math!: The Quest for Omega", by Gregory Chaitin.

Chaitin's motivation is to determine why Godel's Theorem, proving the incompleteness of mathematics, is true. He found Godel's proof convincing, but not very revealing: it only gives a single counterexample, a single meaningful theorem that is true but mathematically unprovable. But this theorem is a very strange-sounding one, one that nobody would ever really want. Perhaps it was the only place where math failed, or perhaps math would only fail in similarly contrived cases. Chaitin wanted some indication of how bad the situation was. So he found an infinite class of very practical-sounding, useful theorems, all but a handful of which are unreachable by any formal logic! Terrible!

Perhaps I'll go through the proof in another post.

Chaitin shows us, then, that there really are global properties that are analytically unreachable. In fact, in addition to his infinite class of unreachable-but-useful theorems, he talks about a global property that any given programming language will have, but which is analytically unreachable: the probability that a randomly generated program will ever produce any output. This probability has some very interesting properties, but I won't go into that.

I think the term Chaitin used for math's failure is somewhat more evocative than "global-local disconnect". He uses the term "irreducible mathematical fact", a fact of mathematics that is "true for no reason", at least no logical reason. Notice that he still refers to them as mathematical facts, because they are true statements about mathematical entities. As with other mathematical facts, it still seems as if their truth would be unchanged in any possible world. Yet, they are "irreducible": logically disconnected from the body of provable facts of mathematics.

So, math sometimes cannot tell us what we want to know, even when we are asking seemingly reasonable questions about mathematically defined entities. Does this mean anything about artificial intelligence?

Another term related to global-local disconnect, this one mentioned in the paper by Loosemore, is "computational irreducibility". This term was introduced by Stephen Wolfram. The idea is that physics is able to predict the orbits of the planets and the stress on a beam in an architectural design because these physical systems are computationally reducible; we can come up with fairly simple equations that abbreviate (with high accuracy) a huge number of physical interactions. If they were not reducible, we would be forced to simulate each atom to get from the initial state to the final result. This is the situation that occurs in complex systems. Unlike the "mathematically irreducible" facts just discussed, there IS a way to get the answer: run the system. But there are no shortcuts. This form of global-local disconnect is considerably easier to deal with, but it's still bad enough.
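To make "run the system" concrete, here is a small Python sketch using the Rule 30 cellular automaton. This is my own choice of illustration, not something taken from Loosemore's paper: Rule 30 is a system Wolfram often uses as an example, and no general shortcut is known for predicting its rows other than computing them one after another. The function names and the wrap-around boundary are assumptions made for the demo.

```python
def rule30_step(cells: list[int]) -> list[int]:
    """Apply one step of Rule 30: new cell = left XOR (center OR right).

    Assumption: the row wraps around at the edges, purely to keep the
    example short.
    """
    n = len(cells)
    return [
        cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
        for i in range(n)
    ]


def simulate(cells: list[int], steps: int) -> list[int]:
    """Get the row after `steps` steps the only way we know how: step by step."""
    for _ in range(steps):
        cells = rule30_step(cells)
    return cells


if __name__ == "__main__":
    row = [0] * 15
    row[7] = 1  # a single live cell in the middle
    for _ in range(7):
        print("".join("#" if c else "." for c in row))
        row = rule30_step(row)
```

The local rule is one line long, yet the only way the sketch can answer "what does row 1000 look like?" is to compute the 999 rows before it.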

It's this second kind of irreducibility, computational irreducibility, that I see as more relevant to AI. Take the example of a planning AI. To find the best plan, it must search through a great number of possibilities. Smarter AIs will try to reduce the computation by ruling out some possibilities, but significant gains can be made only if we're willing to take a chance and rule out possibilities we're not sure are bad. The computation, in other words, is irreducible: we'll have a sort of global-local disconnect simply because if we could predict the result from the basic rules, we wouldn't need the AI to find it for us.
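The trade-off in the planning example can be sketched with a toy problem in Python. Everything here is hypothetical and chosen only to show the effect: the two actions, the three-step horizon, and the scoring rule (action "a" pays off a little right away, while only the all-"b" plan earns a large bonus at the end). Beam search stands in for "ruling out possibilities we're not sure are bad"; it is one pruning method among many, not a claim about how any particular planner works.

```python
from itertools import product

ACTIONS = ("a", "b")   # hypothetical action set
HORIZON = 3            # hypothetical plan length


def score(plan: tuple[str, ...]) -> int:
    """Toy scoring rule (an assumption for this demo).

    Each "a" is worth 1 immediately; a complete plan made only of "b"
    earns a bonus of 10 that no partial plan reveals.
    """
    total = plan.count("a")
    if len(plan) == HORIZON and all(step == "b" for step in plan):
        total += 10
    return total


def exhaustive_search() -> tuple[str, ...]:
    """Check every complete plan: always finds the best, at full cost."""
    return max(product(ACTIONS, repeat=HORIZON), key=score)


def beam_search(width: int) -> tuple[str, ...]:
    """Keep only the `width` best partial plans at each step.

    Cheaper, but it may discard a prefix that would have led to the
    best plan.
    """
    beam: list[tuple[str, ...]] = [()]
    for _ in range(HORIZON):
        candidates = [plan + (a,) for plan in beam for a in ACTIONS]
        candidates.sort(key=score, reverse=True)
        beam = candidates[:width]
    return beam[0]


if __name__ == "__main__":
    print(exhaustive_search(), beam_search(1), beam_search(8))
```

With a beam of width 1 the search commits to "a" at the first step and never sees the all-"b" plan; with a beam wide enough to hold every partial plan it behaves like the exhaustive search and saves nothing.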

So it seems Loosemore was right: intelligence does seem to involve complexity, and unavoidably so. But this sort of irreducibility clearly doesn't support the conclusion that Loosemore draws! The global-local disconnect seems caused by taking a logical approach to AI, rather than somehow negating it.

But there is a way to salvage Loosemore's position, at least to an extent. I mentioned briefly the idea of shortcutting an irreducible computation by compromising, allowing the system to produce less-than-perfect results. In the planning example, this meant the search didn't check some plans that may have been good. But for more complicated problems, the situation can be worse; as we tackle harder problems, the methods must become increasingly approximate.

When Loosemore says AI, he means AGI: artificial general intelligence, the branch of AI devoted to making intelligent machines that can deal with every aspect of the human world, rather than merely working on specialized problems like planning. In other words, he's talking about a really complicated problem, one where approximation will be involved in every step.

Whereas methods of calculating the answer to a problem often seem to follow logically from the problem description, approximations usually do not. Approximation is hard. It's in this arena that I'm willing to grant that, maybe, the "logical approach" fails, or at least becomes subservient to the experimental approach Loosemore argues for.

So, I think there is a sort of split: the "logical" approach applies to the broad problem descriptions (issues like defining the prior for a universal agent), and to narrow AI applications, but the "messy" approach must be used in practice on difficult problems, especially AGI.