Wednesday, April 25, 2007

OK. Forget the old argument I mentioned (but never posted). I now have a new, more exact/formal/rigorous/not-wrong argument which shows the exact relationship of my method to other, known/loved methods of the pattern recognition field: markov models and hidden markov models.

(In a way, this is silly. Every time I get ahold of some real information about the field, I realize how much I've merely reinvented in primitive, approximate form. The current realization is from a book Dr. Albee lent me.)

A Markov model is a table of transition probabilities between states. This is a somewhat simplistic method of patternfinding, because it is making entries in a static table. Essentially, it's a probabilistic way of recording rote repetition. In fact, if we record the probability of each sequence of some set size, we can easily use these numbers to calculate the table entries in the markov model. These sequences are what I called "aggregates" before- so finding aggregates is precisely equivalent to finding markov models, except that certain things are easier to calculate if you've been collecting aggregates while other things are easier if you've been calculating markov models.

But, somewhat more exciting:

Whereas in a markov model, one assumes that the next state of the system depends completely on the previous states, in a hidden markov model, one instead assumes that the current state depends soley on the state of a "hidden" variable. The hidden variable follows the markov assumption: it's next state is determined by it's previous states in some probabilistic manner. But the observable variables are assumed to be dependant on this hidden variable, rather than on their own past states. It's like watching a markov-system through a frosted window; we can only determine the state of the hidden variable by observing what we can see. We must learn both the markov-model that governs the state of the hidden variable, and the probabilities of different visible-variable states given each hidden-variable state. We can't actually learn one without already having some guess about the other, so learning hidden markov models is pretty difficult.

What's cool is that "variable patterns" are equivalent to hidden markov models. A variable pattern is characterized by an unknown (but guessed-at) distribution of observable state probabilities. The system must attempt to learn transitions between variable patterns at the same time as trying to learn the variable pattern's definitions. (This is reflected by the fact that aggregates can be formed completely of variable patterns.) My system tries to learn hidden markov models automatically.

My system, therefore, has different elements that are equivalent to two related patternfinding methods. It serves as an interesting go-between; it learns regular markov models and then attepts to learn hidden markov models on top of 'em. I should research further into the issue to see if there are similar things out there. However, this doesn't completely characterize the system... at least I think it doesn't. In relating the system to markov and hidden markov models, I didn't mention all of the five points from the conclusion in the last post. I'll have to think about it.

No comments:

Post a Comment