I ran into some interesting parallels...
This paper gives a method for finding programs that could have generated a given piece of data, by simulating a universal Turing machine running backwards from the data to the possible programs that generated it. A Turing machine can destroy information while running forward: for example, if it writes a 0 to a square of the tape, we no longer know whether that square previously held a 0 or a 1. Running it backwards therefore involves making arbitrary decisions, such as what to put on a square that is reverse-written-to. These different decisions lead us to different possible programs that might have created the data. We should look for elegant programs, because these explain the pattern behind the data. But of course we do not know in advance which paths lead to the most elegant programs, so we need to search.
The search space is still rather large, but it is a huge improvement over trying to search the entire space of programs. Contrast this with Jürgen Schmidhuber's earlier approach, which throws out a program as soon as it produces a wrong output. The new approach avoids the incorrect programs altogether, so the search only ever visits programs that produce the desired output.
Backtracking is also easy: to undo a backward step, we just run the simulation forward again. Since the Turing machine is deterministic in the forward direction, this requires no decisions. We can move forward and backward as many times as we like and still remain in the space of programs that produce the desired output.
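To make the backward-simulation idea concrete, here is a minimal sketch in Python (my own illustrative toy, not the paper's actual construction): a forward step is deterministic, while a backward step must enumerate every configuration that could have led to the current one, since writing can destroy information.

```python
# Toy Turing machine sketch (illustrative, not the paper's construction).
# A configuration is (state, tape, head_position), with tape as a tuple.
# delta[(state, read_symbol)] = (new_state, write_symbol, move),
# where move is -1 or +1.

def forward_step(delta, config):
    """Deterministic forward step; returns the next configuration, or None on halt."""
    state, tape, pos = config
    key = (state, tape[pos])
    if key not in delta:
        return None
    new_state, write, move = delta[key]
    new_tape = tape[:pos] + (write,) + tape[pos + 1:]
    return (new_state, new_tape, pos + move)

def backward_steps(delta, config):
    """Enumerate every configuration whose forward step yields `config`.

    Because writing destroys the overwritten symbol, several transitions
    may explain the symbol now sitting under the previous head position,
    so a single backward step can have multiple predecessors."""
    state, tape, pos = config
    preds = []
    for (prev_state, read), (next_state, write, move) in delta.items():
        prev_pos = pos - move
        if next_state != state or not (0 <= prev_pos < len(tape)):
            continue
        if tape[prev_pos] != write:
            continue  # this transition could not have written what we see
        # Restore the overwritten square to the symbol that was read.
        prev_tape = tape[:prev_pos] + (read,) + tape[prev_pos + 1:]
        preds.append((prev_state, prev_tape, prev_pos))
    return preds

# Two transitions that both write 1, so the backward step is ambiguous:
delta = {('A', 0): ('B', 1, 1), ('A', 1): ('B', 1, 1)}
config = ('B', (1, 0), 1)
predecessors = backward_steps(delta, config)  # two candidate predecessors
```

Running `forward_step` on any of the enumerated predecessors returns the original configuration, which is exactly why backtracking is cheap: the forward direction needs no decisions.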
My interest in all of this is adapting it to learning grammars. Since grammars are Turing-complete representations, this is not difficult. What is interesting, though, is that once translated, there is a strong analogy between this new method and old methods for grammar learning. The old methods are restricted to a special case, context-free grammars, so the new method is essentially a generalization. To use a grammar, we apply its rules successively until we get the final product. The old methods for finding context-free grammars are interesting because what they do is essentially make reverse rule applications, constructing a grammar from the end result. The analogy with running a Turing machine backwards is quite strong. Those methods for learning context-free grammars are also quite analogous to the crazy ideas in my earlier posts on this blog.
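As an illustration of what "reverse rule application" means for a context-free grammar, here is a small Python sketch (my own, not any particular published algorithm; in a real learner the rules themselves are being hypothesized during the search, but the core move looks like this): a forward derivation expands a nonterminal into its right-hand side, so a reverse application finds an occurrence of some right-hand side in the string and folds it back into the nonterminal.

```python
# Illustrative sketch of reverse rule application for a context-free grammar.
# `rules` maps a nonterminal to a list of right-hand sides (tuples of symbols);
# a sentence is a tuple of symbols.

def reverse_applications(rules, sentence):
    """Enumerate every sentence reachable by one reverse rule application,
    i.e. by replacing one occurrence of some right-hand side with its
    nonterminal."""
    results = []
    for lhs, rhss in rules.items():
        for rhs in rhss:
            n = len(rhs)
            for i in range(len(sentence) - n + 1):
                if sentence[i:i + n] == rhs:
                    results.append(sentence[:i] + (lhs,) + sentence[i + n:])
    return results

# Toy grammar S -> a S b | a b, with the data "aabb":
rules = {'S': [('a', 'S', 'b'), ('a', 'b')]}
step1 = reverse_applications(rules, ('a', 'a', 'b', 'b'))  # fold inner "ab"
step2 = reverse_applications(rules, step1[0])              # fold "aSb" into S
```

Repeated reverse applications collapse the string "aabb" back to the start symbol S, mirroring the way reverse Turing-machine steps collapse data back toward a program; as with the Turing machine, there may be several applicable reverse steps, which is where the search comes in.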
I am interested in examining how these methods can be adapted in various ways, including taking some ideas from SEQUITUR to attempt an incremental version of the algorithm, and ideas from this paper to better guide the search (at least for context-free cases). That is, if I find the time...