Friday, March 07, 2008

I recently read an article called "Complex Systems, Artificial Intelligence and Theoretical Psychology" by Richard Loosemore. The argument it makes goes something like this:

1. A "complex system" (referring to complex systems science) is one that displays behavior that cannot be predicted analytically using the system's defining rules. (This is called, in the paper, a global-local disconnect: the local rules that play the role of the "physical laws" of the system are disconnected analytically from the global behavior displayed by the system.)

2. The mind seems to be a complex system, and intelligence seems to be a global phenomenon that is somewhat disconnected from the local processes that create it. (Richard Loosemore argues that no real proof of this can be given, even in principle; such proofs are blocked by the global-local disconnect. However, he thinks it is the case, partly because no analytical solution for the mind's behavior has been found so far.)

3. The mind therefore has a global-local disconnect. Richard Loosemore argues that, therefore, artificial intelligence cannot be achieved by a logical approach that attempts to derive the local rules from the list of desired global properties. Instead, he proposes an experimental approach to artificial intelligence: researchers should produce and test a large number of systems based on intuitions about what will produce results, rather than devoting years to the development of single systems based on mathematical proofs of what will produce results.

I agree with some points and disagree with others, so I'll try to go through the argument approximately in order.

First, I do take the idea of a complex system seriously. Perhaps the idea that the global behavior of some systems cannot be mathematically predicted seems a bit surprising. It IS surprising. But it seems to be true.

My acceptance of this idea is due in part to an eye-opening book I recently read, called "Meta Math!: The Quest for Omega", by Gregory Chaitin.

Chaitin's motivation is to determine why Gödel's Theorem, proving the incompleteness of mathematics, is true. He found Gödel's proof convincing, but not very revealing: it gives only a single counterexample, a single meaningful theorem that is true but mathematically unprovable. But this theorem is a very strange-sounding one, one that nobody would ever really want. Perhaps it was the only place where math failed, or perhaps math would fail only in similarly contrived cases. Chaitin wanted some indication of how bad the situation was. So he found an infinite class of very practical-sounding, useful theorems, all but a handful of which are unreachable by any formal logic! Terrible!

Perhaps I'll go through the proof in another post.

Chaitin shows us, then, that there really are global properties that are analytically unreachable. In fact, in addition to his infinite class of unreachable-but-useful theorems, he talks about a global property that any given programming language will have, but which is analytically unreachable: the probability that a randomly generated program will ever halt and produce output (his famous Ω). This probability has some very interesting properties, but I won't go into that.
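Ω itself can never be computed, but it can be approximated from below, and the shape of that approximation is instructive. Here is a toy sketch in Python; the miniature "language" and its jump-based instruction set are invented purely for illustration (for a real, Turing-complete language, raising the step bound still converges from below, but no algorithm can ever tell you how close you are):

```python
import random

# Toy "language": a program is a list of integers; semantics:
#   0 -> halt
#   k -> jump to position (pc + k) mod len(program)
# This machine is hypothetical, chosen only so that halting is nontrivial.

def halts(program, max_steps):
    pc = 0
    for _ in range(max_steps):
        op = program[pc]
        if op == 0:
            return True
        pc = (pc + op) % len(program)
    return False  # unknown at this bound; treated as non-halting

def omega_lower_bound(n_programs, length, max_steps, seed=0):
    """Estimate the halting probability of a random program."""
    rng = random.Random(seed)
    halted = sum(
        halts([rng.randrange(0, length) for _ in range(length)], max_steps)
        for _ in range(n_programs)
    )
    return halted / n_programs

# Raising max_steps can only raise the estimate: we approach the true
# halting probability from below.
est = omega_lower_bound(10_000, 8, 100)
assert 0.0 < est < 1.0
```

For this toy language halting happens to be decidable (the machine has finitely many states), which is exactly why the sketch can only gesture at the real phenomenon.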

I think the term Chaitin used for math's failure is somewhat more evocative than "global-local disconnect". He uses the term "irreducible mathematical fact", a fact of mathematics that is "true for no reason", at least no logical reason. Notice that he still refers to them as mathematical facts, because they are true statements about mathematical entities. As with other mathematical facts, it still seems as if their truth would be unchanged in any possible world. Yet, they are "irreducible": logically disconnected from the body of provable facts of mathematics.

So, math sometimes cannot tell us what we want to know, even when we are asking seemingly reasonable questions about mathematically defined entities. Does this mean anything about artificial intelligence?

Another term related to global-local disconnect, this one mentioned in the paper by Loosemore, is "computational irreducibility". This term was introduced by Stephen Wolfram. The idea is that physics is able to predict the orbits of the planets and the stress on a beam in an architectural design because these physical systems are computationally reducible; we can come up with fairly simple equations that abbreviate (with high accuracy) a huge number of physical interactions. If they were not reducible, we would be forced to simulate each atom to get from the initial state to the final result. This is the situation that occurs in complex systems. Unlike the "mathematically irreducible" facts just discussed, there IS a way to get the answer: run the system. But there are no shortcuts. This form of global-local disconnect is considerably easier to deal with, but it's still bad enough.
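Wolfram's stock example is the elementary cellular automaton: as far as anyone knows, the only way to learn the state after n steps is to compute all n steps. A minimal sketch (the wrap-around boundary is chosen just for convenience):

```python
# Elementary cellular automaton, Wolfram's running example of
# computational irreducibility: no known shortcut to the state
# after n steps except simulating every step.

def step(cells, rule=110):
    n = len(cells)
    # Each cell's next state is bit (4*left + 2*center + right) of the rule number.
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def run(cells, steps, rule=110):
    for _ in range(steps):
        cells = step(cells, rule)
    return cells

# A single live cell on a ring of 31 cells, evolved for 100 steps.
state = run([0] * 15 + [1] + [0] * 15, 100)
```

Rule 110 is a fitting choice here: it is known to be Turing-complete, so a general shortcut for predicting it would amount to solving the halting problem.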

It's this second kind of irreducibility, computational irreducibility, that I see as more relevant to AI. Take the example of a planning AI. To find the best plan, it must search through a great number of possibilities. Smarter AIs will try to reduce the computation by ruling out some possibilities, but significant gains can be made only if we're willing to take a chance and rule out possibilities we're not sure are bad. The computation, in other words, is irreducible-- we'll have a sort of global-local disconnect simply because if we could predict the result from the basic rules, we wouldn't need the AI to find it for us.
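To make that concrete, here is a toy sketch of the trade-off (the planning problem and its scoring function are invented for illustration): beam search prunes most of the exponential plan space, and in exchange gives up any guarantee of finding the best plan.

```python
import itertools
import random

# Toy planning problem: pick one of 4 actions at each of 8 steps.
# The scoring function is made up purely for illustration.

rng = random.Random(0)
ACTIONS = range(4)
DEPTH = 8
VALUES = [[rng.random() for _ in ACTIONS] for _ in range(DEPTH)]

def plan_value(plan):
    # An action is worth less if it repeats the previous action, so
    # steps interact and a locally best choice can be globally wrong.
    total = 0.0
    for depth, action in enumerate(plan):
        v = VALUES[depth][action]
        if depth > 0 and plan[depth - 1] == action:
            v *= 0.5
        total += v
    return total

def exhaustive():
    # Irreducible version: examine all 4**8 = 65536 complete plans.
    return max(itertools.product(ACTIONS, repeat=DEPTH), key=plan_value)

def beam_search(width=2):
    # Pruned version: keep only the `width` best partial plans per step,
    # gambling that the discarded ones could not have led to the optimum.
    beams = [()]
    for _ in range(DEPTH):
        candidates = [b + (a,) for b in beams for a in ACTIONS]
        beams = sorted(candidates, key=plan_value, reverse=True)[:width]
    return beams[0]

best = plan_value(exhaustive())
approx = plan_value(beam_search(2))
assert approx <= best  # pruning can never beat the exhaustive optimum
```

The interaction term is what makes the pruning a genuine gamble: a locally attractive action can poison the following step, and a narrow beam may never notice.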

So it seems Loosemore was right: intelligence does seem to involve complexity, and unavoidably so. But this sort of irreducibility clearly doesn't support the conclusion that Loosemore draws! The global-local disconnect seems caused by taking a logical approach to AI, rather than somehow negating it.

But there is a way to salvage Loosemore's position, at least to an extent. I mentioned briefly the idea of shortcutting an irreducible computation by compromising, allowing the system to produce less-than-perfect results. In the planning example, this meant the search didn't check some plans that may have been good. But for more complicated problems, the situation can be worse; as we tackle harder problems, the methods must become increasingly approximate.

When Loosemore says AI, he means AGI: artificial general intelligence, the branch of AI devoted to making intelligent machines that can deal with every aspect of the human world, rather than merely working on specialized problems like planning. In other words, he's talking about a really complicated problem, one where approximation will be involved in every step.

Whereas methods of calculating the answer to a problem often seem to follow logically from the problem description, approximations usually do not. Approximation is hard. It's in this arena that I'm willing to grant that, maybe, the "logical approach" fails, or at least becomes subservient to the experimental approach Loosemore argues for.

So, I think there is a sort of split: the "logical" approach applies to the broad problem descriptions (issues like defining the prior for a universal agent), and to narrow AI applications, but the "messy" approach must be used in practice on difficult problems, especially AGI.


  1. And in other breaking news, a perpetual motion machine is impossible! Experts convene to discuss implications for energy industry.

    Seriously, we've known since Turing that computation in general is a complex system - no way to analytically predict the behavior of the whole from description of the parts. This is not an obscure piece of knowledge - the standard way to prove irreducibility for systems like cellular automata is to show that they support universal computation. Nor is it without practical significance - it's well known that we can't mathematically prove software correct (subroutines, occasionally; whole systems, no). Programmers are already aware of this, but it doesn't stop us writing programs that work in the real world; there's no reason it should stop us writing that particular subset of programs that fall under the heading of AI.

  2. I think part of the problem is that since this blog has no audience, I make no effort to decide whether I'm writing to the specialist or the man on the street.

    But more relevantly, proving properties of an algorithm is a standard everyday practice in computer science. The conflict between "logical" and "messy" should be understood as the mathematical approach, in which the error bounds and convergence speed are carefully proven, vs the experimental approach, in which we find these by using test data.

  3. Well, even semi-technical writing about AI is perforce at least somewhat for the specialist.

    But on the matter of program proving, as I said: "subroutines, occasionally; whole systems, no" - we don't have any disagreement there, I think? It's inevitably the case that by the time you've deployed software for real use, you've had to make extensive use of test data, whatever you may or may not have also done in terms of proving particular algorithms.

  4. Sure, we are in agreement there. I'm just thinking that proving properties of an abstract algorithm is more important and of more theoretical interest. (And easier, since fewer implementation issues need to be considered.)

  5. Your response to my "Complex Systems, Artificial Intelligence and Theoretical Psychology" paper is interesting and thoughtful (thank you for that); however, your interpretation of the paper drifts away from the core message that I was trying to convey. The result is that you come to some conclusions that don't seem to really confront the disaster that I was warning about in my text.

    The first point that I want to make is about my claim that "intelligent systems are complex systems". I mean something very specific when I say that, but it is not really the same as the version you attribute to me. So let me try to explain what I mean.

    I mean that if we were to take a census of all the different kinds of system that people have described as "complex", we would find that these systems have something in common at the level of their driving mechanisms - not at the behavior level, notice, but at the mechanism level. What they have in common is that their driving mechanisms tend to have a cluster of properties that I summarize with the phrase "tangled and nonlinear" - the low level units of the system do nasty, pathological things like depending on one another in ways that cannot be written down with clean equations that look solvable to a mathematician.

    I would be the first to admit that this commonality in the low level mechanisms is very hard to define, and that there is no fixed set of features that always give rise to complexity. However, just as a matter of purely empirical, down and dirty observation, there is something that we can point to and say "If a system has a lot of these features in its local mechanisms, we would very strongly expect the overall behavior of the system to show some features that are regular-but-unanalyzable".

    And, of course, what I claim is that intelligent systems are complex because they have these low level features.

    Why is it important for me to emphasize this? Because there are many other ways to interpret the statement that "intelligent systems are complex systems", and these other interpretations often lead to discussions that I do not want to get drawn into. For example, I do not want to argue that specific high level behaviors (like AI planning) involve complicated, non-provably-correct algorithms, and that THEREFORE intelligence is complex. I don't believe in that 'therefore' link at all, and I don't want to get caught trying to defend it. Apart from anything else, I would then have to face a particular kind of response, which goes something like "Okay, so it might seem to Loosemore that it is too complicated for us to understand such mechanisms today, but we are gradually making progress at improving our designs - for example, by building better and better AI planning systems - so the argument that there is some kind of fundamental Complex Systems Problem is nothing more than Loosemore's unjustified pessimism about our rate of progress".

    Instead, my point of attack is at a much deeper level. If you look at the overall features of all the proposals that anyone has ever made about how to build an intelligent system, you will see, buried somewhere in the details, an inevitable drift toward features that are on the "Complexity Hitlist" - the low level mechanisms that tend to give rise to complexity at the high level. Some people do try to claim that there are none of these features in their designs, but usually this is only because they are avoiding all discussion of how their systems develop: as soon as development mechanisms are included, the complexity-inducing features turn up with a vengeance. Most AGI researchers (Ben Goertzel, for example) don't have any trouble with this - they are quite happy to agree that these complexity-inducing features are present, and that some kind of complexity is to be expected.

    So my claim is pitched at a very general level. We should expect complexity because the most fundamental aspects of all intelligent systems seem to involve features that almost always give rise to complexity. That is all.

    The second (and, in fact, final) stage of my argument is that if we think about how the complexity might manifest itself in intelligent systems, we can come to some interesting and important conclusions about the methodology we use for Artificial Intelligence research. Putting it in nontechnical terms, my claim goes like this: the complexity problem is NOT going to manifest itself by walking right through the front door and saying "Hi, I am a specific problem caused by complexity, and I am more-or-less compartmentalized from the rest of your design, so you had better assign some of your sharpest minds to deal with me!". Rather, the problem will manifest itself in a subtle way that is simply never going to be noticed if we continue to use the methodology that dominates AI (or AGI) research today.

    What do I mean by it manifesting itself in a 'subtle way'? I mean that some AGI design decisions can trap you in a corner of the AGI Design Space (so to speak) where there are no solutions that lead to full, human level intelligence. Further, it may well be the case that all of the feasible AGI design solutions (like, all the feasible AGI solutions in the entire universe) involve systems in which the components interact using a sprinkling of mechanisms that make no earthly sense whatsoever, at the high level.

    To put this last idea in concrete terms, we may find that the only way to build an intelligence is to have basic units representing objects and actions, with these units interacting in ways that seem roughly understandable (e.g. they may have relationship links, and they may obey some plausible-looking learning mechanisms), but in addition to all this stuff that makes sense to us, there may be some mechanisms that maintain certain funky parameters that have absolutely no meaning at the high level. Crucially, we could never have deduced the existence of these funky mechanisms, and we will never be able to explain why they work, but they act as a kind of glue that makes the system work, and without them there is no way to tweak the system to make it get up to the human intelligence level. Significantly, there may not be very many of these mechanisms, but they may be crucial nonetheless.

    My point is this: if we look in an objective way at complex systems in general, we would expect the existence of such funky mechanisms (there are plenty of complex systems that are nothing but funky, incomprehensible mechanisms!), so why is it that all of Artificial Intelligence is predicated on the idea that we can start with a design that seems to work in narrow circumstances, and then generalize it and tweak it to work in the harder, more general circumstances? Why do we use that type of approach when everything we know about other complex systems indicates that it would never work in those cases?

    The nature of my attack, then, is not at the specific level of (e.g.) mathematical incomputability, but at a very much deeper, empirical level. I point to deep reasons to believe in the complexity of intelligence, and then I point out the absurdity of using the regular AI methodology to engage in the "design" of complex systems. And then, for my conclusion, I ask: if that methodology is so absurd in the regular complex-system case, why on earth are we using it in the case of AI or AGI research?

    So in the end this is an argument about the practical issues of a scientific and engineering methodology.

    The final kick in the argument comes when we ask what would happen if my interpretation of the situation turns out to be correct, but we do nothing about it. If we continue on our present course and ignore the thing that I have called the 'Complex Systems Problem', would we eventually come across some clear evidence that Loosemore's little complaint was right after all? The horrible truth is that we would never come across any clear evidence. We would always find it easier to convince ourselves that all difficulties we are experiencing are due to the fact that we have not yet done enough work on the problem, and that the AI problem is just harder than we thought. We would run around in circles, starting new approaches and then abandoning them after they got bogged down, making small amounts of progress in specific areas but never quite being able to knit those separate approaches together, never making much progress in the general learning field, and so on. There might be entire conferences and journals devoted to vast amounts of theorem proving, and there might be people proposing abstract idealizations of the AI problem defined over infinite sets of possible universes, but with all of this not actually leading to much real world progress toward AGI.

    In short, we would expect exactly what we have been experiencing for the last 50 years of AI research.

    I think this is a disaster.