BIG, but still finite!
This is my "second bite" at the idea of splitting some more complex instructions into multiple variations each with their own row and column in the matrix, when building a matrix for a self-interpreter.
I mentioned previously that during my attempt to build the matrix corresponding to um.um, I had initially thought it would be possible to handle the conditional branching (for example in the um.um code when emulating opcodes 1 and 2) by having two columns/rows instead of just one for each of those opcodes. For opcode 1, one case would be for when register B was zero and one for when it was not. But for some reason I ended up convincing myself that this idea didn't work. I'm not sure why now. Perhaps I was focussing too much on being able to manually build the matrix from a straightforward static analysis of the code? Anyhow, a few days later, I've now convinced myself that I was just plain wrong back then, that it can work after all, and as a bonus the same basic idea can be extended to solve some other problems also. Does this sound convincing? :-)
Apologies to "dherbert" who also suggested this in an email to the ICFP contest discussion list after I'd described my attempts to construct the matrix in the same forum. My response to that suggestion was simply to repeat what I'd already concluded was true - that the idea didn't work - instead of thinking it through again more carefully!
My original description of building a matrix and obtaining an "eigenratio" for a particular self-interpreter and a simple instruction set assumed that each instruction had some fixed cost (e.g. execution time) and also that there was a single execution path in the interpreter to emulate each instruction regardless of operands, etc. That allowed each instruction to map directly onto a single column in the matrix.
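To make that concrete, here is a minimal sketch of the original fixed-cost idea (using NumPy, and with completely made-up counts for an imaginary three-instruction machine, not the real um.um figures): entry (i, j) counts how many instructions of type i the interpreter executes while emulating one instruction of type j, and the eigenratio is just the dominant eigenvalue of that matrix.

```python
# A minimal sketch of the fixed-cost-per-instruction idea. The counts below
# are hypothetical placeholders for a made-up 3-instruction machine.
import numpy as np

# M[i, j] = number of instructions of type i that the self-interpreter
# executes while emulating a single instruction of type j.
M = np.array([
    [4.0, 3.0, 5.0],
    [2.0, 6.0, 1.0],
    [1.0, 2.0, 3.0],
])

# The "eigenratio" is the dominant eigenvalue of M: the factor by which the
# total cost grows with each extra level of self-interpretation.
eigenvalues = np.linalg.eigvals(M)
eigenratio = max(abs(ev) for ev in eigenvalues)
print(f"eigenratio ~= {eigenratio:.4f}")
```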
The UM machine and um.um caused problems with this simplistic approach in two ways: different execution paths (when emulating opcodes 1, 2, and 12), plus variable costs (for example, when emulating opcode 12 and register B is not zero, um.um has a loop copying a variable amount of data from one array to another). Even if we only wanted to count the instructions being executed, the looping when emulating opcode 12 can't be correctly handled with a single column in the matrix.
However, if we can find a way to build a matrix that has a separate row and column for each possible "variation" of these kinds of more complex instructions, then we can broaden the types of instruction sets and associated self-interpreters for which we are confident a corresponding matrix does exist (and can, in theory, be constructed). It may not always be practical to actually construct it, as was possible for the small and simple cases in my earlier experiments. But if we know a matrix does exist, then we can also be confident that the interpreter has some fixed eigenratio. (Possibly subject to other requirements also, so that we can be sure the Power Method converges, etc.)
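For reference, here is a rough sketch of the Power Method itself, applied to the same made-up matrix as before. The growth factor per iteration settles down to the dominant eigenvalue, provided the usual convergence conditions hold (which is exactly the caveat above).

```python
# A rough sketch of the Power Method: repeatedly apply the matrix to a
# positive starting vector and watch the growth factor converge towards the
# dominant eigenvalue (assuming the usual conditions for convergence hold).
import numpy as np

def power_method(M, iterations=100):
    v = np.ones(M.shape[0])
    estimate = 0.0
    for _ in range(iterations):
        w = M @ v
        estimate = np.linalg.norm(w) / np.linalg.norm(v)
        v = w / np.linalg.norm(w)
    return estimate, v

M = np.array([[4.0, 3.0, 5.0],
              [2.0, 6.0, 1.0],
              [1.0, 2.0, 3.0]])
ratio, _ = power_method(M)
print(f"dominant eigenvalue estimate: {ratio:.4f}")
```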
If we are looking at instruction sets that can only address a finite amount of memory, and the range of the operands for the appropriate instructions is also bounded, then I now believe each opcode (for a more complex machine such as the UM) that has variations in its execution cost, or other variable behaviour that requires conditional branching to emulate, can be treated as a finite (albeit potentially very large) number of discrete "variations", each of which has a fixed cost and simple behaviour. This lets us (in a theoretical sense) describe how to build a corresponding matrix in which each such variation has its own row and column. Taken to an extreme, one way to think about this is to encode the opcode, operands, and any other "significant information" into one large string of bits, and then have a unique row and column in the matrix for each such unique bitstring.
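Here is a toy illustration of that bitstring idea. The particular fields (an opcode plus one operand-dependent value) are purely hypothetical; a real encoding for the UM would need to pack in whatever information actually determines the emulation path and cost.

```python
# A toy version of the "one row/column per variation" encoding. The fields
# used here are hypothetical, chosen only to show the shape of the idea.
def variation_index(opcode, extra=0, extra_bits=32):
    """Map (opcode, extra) to a unique non-negative matrix index.

    `opcode` selects a block of 2**extra_bits indices and `extra` (e.g. a
    copy length, or 0/1 for a branch taken or not) selects the variation
    within that block.
    """
    return (opcode << extra_bits) | (extra & ((1 << extra_bits) - 1))

# Opcode 1 splits into two variations depending on whether register B is zero:
idx_b_zero = variation_index(1, 0)
idx_b_nonzero = variation_index(1, 1)

# Opcode 12 with B != 0 gets one variation per possible source-array size:
idx_copy_1000 = variation_index(12, 1000)

print(idx_b_zero, idx_b_nonzero, idx_copy_1000)
```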
Consider opcode 12 when register B is not zero. In um.um this appears to be emulated via some code that includes a loop to copy the contents of the appropriate source array. The size can (in theory) range from 0 to (2^32 - 1) platters. So it seems our full matrix for um.um will probably need (2^32 + 1) rows and columns to handle all "variations" of opcode 12! One for when register B is zero, and 2^32 for all the other cases. (I would list the final matrix here but my monitor isn't wide enough. :-)
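Just to sketch what one of those 2^32 columns might look like: if (and this is only an assumption about the shape of the cost, not something I've measured from um.um) the emulation cost is affine in the number of platters copied, then the whole family of columns can be described by a single formula in n, even though the matrix itself is absurdly large.

```python
# A hedged sketch of the column for "opcode 12, B != 0, copy length n",
# assuming the cost is affine in n. The fixed and per-platter counts below
# are invented placeholders, not figures taken from um.um.
def opcode12_column(n, fixed_counts, per_platter_counts):
    """Instruction counts for emulating one opcode 12 that copies n platters."""
    return [fixed + n * per for fixed, per in zip(fixed_counts, per_platter_counts)]

fixed = [10, 4, 2]        # hypothetical overhead outside the copy loop
per_platter = [3, 1, 0]   # hypothetical counts executed once per platter copied
print(opcode12_column(1000, fixed, per_platter))
```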
Knowing that a specific matrix and eigenratio exists for each member of some broad class of self-interpreters and associated languages and machine architectures opens the possibility of applying established results from matrix theory to our thinking about self-interpreters, because we will have established a kind of isomorphism. For example, although I've yet to explore very far down this path, if we can build a matrix for a particular self-interpreter, then running another program with that interpreter can in some sense be described as an Affine Transformation on the appropriate space. One thing that I've learned from that link is the technique of combining the "translation vector" with the main "transformation" matrix by adding an extra column (and row). That seems a very nice way to directly include the effect of any start-up code that your self-interpreter might have. (Any code outside of the main interpreter loop really.)
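Here is a small sketch of that trick with invented numbers: fold the one-off start-up cost into the matrix as an extra column, add a trailing row of zeros ending in 1, and append a 1 to the program's instruction-count vector (homogeneous coordinates, essentially).

```python
# A sketch of the augmented-matrix trick for folding a fixed start-up cost
# (the "translation vector") into the main cost matrix. All numbers here
# are hypothetical.
import numpy as np

M = np.array([[4.0, 3.0],
              [2.0, 5.0]])          # per-instruction emulation costs
startup = np.array([100.0, 40.0])   # one-off cost of code outside the main loop

# Augmented matrix in block form: [[M, startup], [0, 1]]
A = np.zeros((3, 3))
A[:2, :2] = M
A[:2, 2] = startup
A[2, 2] = 1.0

# A program's instruction counts, with a trailing 1 for the homogeneous coordinate:
program = np.array([1000.0, 500.0, 1.0])

# Counts executed by the interpreter running that program = M @ counts + startup:
print(A @ program)
```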
Working through a real example is probably a good idea at this stage to make this more explicit. Perhaps it is time to revisit the um.um matrix. Hopefully I won't hit the wall again in the same place that I did previously!