Friday, August 17, 2007

Over the summer, I've been working on a more rigorous mathematical underpinning for the system. In particular, I wanted a way to measure the significance of aggregate objects in a totally correct way, rather than with the many ad-hoc methods I'd been coming up with previously. I've come up with something fairly satisfying, but the details involve integrating over multiple variables, so I won't try to describe them here.

Anyway, that explains the neglect of this blog.

But now I've come up with a much more satisfying way of extending the theory to learn Turing-complete patterns.

The system can learn the behavior of the "internals" of a Turing machine, because the inside of a Turing machine is a simply-implemented thing. The power of computation comes from applying simple rules over and over to get possibly complicated results. As I've said before, the problem comes when the internals of such a complex process are hidden from view, and the system must learn the resulting behavior without being able to see the causes. In this case, my system by itself may not be strong enough to learn the general pattern-- it could only memorize special cases.
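To make the "simple rules applied over and over" point concrete, here's a toy Turing-machine stepper in Python. The rule table and tape below are invented for illustration; this isn't code from my system.

    # Toy Turing machine: a table of transition rules applied over and over.
    # (Rules and tape are made up for illustration.)
    def run_turing_machine(rules, tape, state="start", max_steps=100):
        """rules maps (state, symbol) -> (new_state, new_symbol, move)."""
        cells = dict(enumerate(tape))      # sparse tape: position -> symbol
        head = 0
        for _ in range(max_steps):
            symbol = cells.get(head)       # None stands for a blank cell
            if (state, symbol) not in rules:
                break                      # halt when no rule applies
            state, cells[head], move = rules[(state, symbol)]
            head += move                   # move is +1 (right) or -1 (left)
        return [cells[i] for i in sorted(cells)]

    # A machine that flips every bit it scans while moving right:
    flip_right = {("start", 0): ("start", 1, +1),
                  ("start", 1): ("start", 0, +1)}
    print(run_turing_machine(flip_right, [1, 0, 1, 1]))   # -> [0, 1, 0, 0]

From the outside, all an observer sees is the finished tape; the simple rule table that produced it is exactly the kind of hidden cause I'm talking about.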

The new solution to this is based on the idea of a "hidden variable". Hidden variables come into play in what's called a "hidden Markov model", sometimes in Bayes Net learning, and (I imagine) in other places. A hidden variable is one that is assumed to exist, but whose value cannot be directly observed through the senses, only inferred from them.
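A minimal sketch of what "inferred, not observed" looks like: a single binary hidden variable updated by Bayes' rule from a stream of noisy observations. The sensor probabilities here are made up.

    # A single binary hidden variable H, observed only through a noisy sensor O.
    # (All probabilities here are invented.)
    def update_belief(p_h, observed, p_o_given_h=0.9, p_o_given_not_h=0.2):
        """One step of Bayes' rule: returns P(H=1 | O) from the prior P(H=1)."""
        if observed:
            numerator = p_o_given_h * p_h
            denominator = numerator + p_o_given_not_h * (1 - p_h)
        else:
            numerator = (1 - p_o_given_h) * p_h
            denominator = numerator + (1 - p_o_given_not_h) * (1 - p_h)
        return numerator / denominator

    belief = 0.5                        # start agnostic about the hidden value
    for obs in [1, 1, 0, 1]:            # a short stream of noisy observations
        belief = update_belief(belief, obs)
        print(round(belief, 3))         # belief drifts toward H=1 despite the miss

The hidden variable never appears in the data; the system only ever manipulates its belief about it.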

Hidden variables are obviously what's needed to make my system representationally powerful enough to define any pattern. The question is, when should a hidden variable be introduced to explain an observation? The system should be able to learn not just individual hidden variables, but infinite spaces of them; this corresponds to the theoretically infinite tape of a Turing machine, and also to the very human tendency to reason about space as an essentially infinite thing. When thinking about the world, we don't assume that only the areas we've been to exist: we freely imagine that there is some area at any distance, in any direction. This corresponds to an infinitely extendable system of hidden variables.

Actually, the "infinitely extendable" part isn't what's hard (although it leads to some obvious hard computational issues). Once I figure out what conditions lead the system to consider the existence of a hidden variable, infinite systems of hidden variables can be inferred from the already-existing ones whenever a pattern behind the finitely many hidden variables implies such a system. So the important question is when the system should create hidden variables.
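One way to picture an infinitely extendable system of hidden variables is a store where a variable comes into existence the first time anything refers to its location. The names below are illustrative, not the system's actual representation.

    # Hidden variables that exist only implicitly: a concrete one is created
    # the first time something refers to its location.
    class HiddenSpace:
        def __init__(self, default="unknown"):
            self.cells = {}             # location -> value of its hidden variable
            self.default = default

        def get(self, location):
            # Referring to any location at all brings its hidden variable into being.
            return self.cells.setdefault(location, self.default)

        def set(self, location, value):
            self.cells[location] = value

    space = HiddenSpace()
    space.set((0, 0), "red")            # something observed here earlier
    print(space.get((0, 0)))            # "red" -- remembered while unobserved
    print(space.get((100, -42)))        # "unknown" -- yet the location still "exists"

Only finitely many variables are ever instantiated, but nothing stops the space from extending in any direction, which is the same trick the infinite tape relies on.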

Ideally, the system might first create hidden variables to represent phenomena such as an object still existing after a robot moves its camera away and then back; then create hidden variables to represent things staying the same when it moves around the room and back, or to a different room and back... and when it is the world that moves, hidden variables might represent what's literally hidden behind other objects. That last one is interesting, because it *could* be done with correlations between non-hidden structures, but it obviously *shouldn't* be. A pixel being in a particular state before the leading edge of an object covers it correlates with it returning to that state when the trailing edge of the object uncovers it; however, it seems far better to correlate it with a hidden variable representing the color of a particular object that can only sometimes be seen. This is particularly important if hidden objects are interacting; if they are represented as hidden objects, the system has a chance of figuring out what the result of the interaction will be.
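Here's a cartoon of that contrast: instead of correlating the pixel's state before occlusion with its state after, keep a hidden variable for the covered color and read it back out once the object passes. Again, this is a toy, not the actual mechanism.

    # Toy occlusion model: the visible pixel is explained either by the moving
    # object's color or by a hidden variable holding the background's color.
    class PixelModel:
        def __init__(self, background_color):
            self.hidden_background = background_color   # persists while covered

        def observe(self, covered, object_color=None):
            """What the camera reports for this pixel."""
            if covered:
                return object_color         # leading edge has covered the pixel
            return self.hidden_background   # trailing edge reveals it again

    pixel = PixelModel("blue")
    print(pixel.observe(covered=False))                        # "blue"
    print(pixel.observe(covered=True, object_color="green"))   # "green" while occluded
    print(pixel.observe(covered=False))                        # "blue", predicted from the hidden variable

The prediction for the uncovered pixel comes from the hidden variable, not from a memorized before/after correlation, so the same variable could in principle change while hidden if occluded things interact.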