The Mark II Brainf*ck Self-Interpreter
I obsess, therefore I am...
This will be short and sweet, because all I really want to do is to "release" my second version of a faster BF self-interpreter to the world at large.
The commented version of cgbfi2.b is probably easier to understand but here's a lean and mean version anyway!
Isn't it lovely? :-)
In a nutshell, cgbfi2.b is a derivative of cgbfi1.b that replaces "++", ">>", "<<", "[-]", "[>]", "[<]", "[>>]" and "[<<]" sequences in its input program with some new single value codes internally. This means those sequences are treated as single instructions in the main interpreter loop, thus requiring only one return trip between the emulated code and data in each case instead of (often) many more. In addition this strategy reduces the size of memory footprint of the emulated code, which also helps.
Of course, cgbfi2.b is a lot larger than cgbfi1.b (1758 instructions versus 950 once the comments are stripped out) but cgbfi2.b effectively sees itself as a 1178 instruction program after those common sequences described above are "compiled" to single instructions in memory. 1178 is still larger than 950 but the other gains made by the reduced trips to the data and back at runtime generally make this new version much faster. The other negative is that this new interpreter does more work when loading a program (it was hard work recognising those few short sequences!) so it can be slower than other interpreters when running some shorter/simpler programs. But for larger and more complex programs (including itself!) it is definitely much faster.
In terms of finding an eigenratio, it appears that this new interpreter might have an eigenratio of somewhere in the "low thousands". This guesstimate is again only based on results from running a few very small programs on a three level deep stack of the interpreter. But if this is not too far from the real value then it seems a fast PC might manage to run a four level stack in a few days. I'll have to find a faster machine than mine to test this with!
However I did manage to run a 4 level deep stack with a slight "cheat" about 10 days ago when I was experimenting with different strategies. The "cheat" was to create a version of the interpreter that expects its input to already be converted to binary values 1-8 instead of the original "][><.-,+" BF instructions, and values 9-16 for the "virtual instructions", plus 0 for the end of input marker. This means the loader routine was extremely short and simple (something like ">>>>>>+[-<,[>>>+<]>]>+<<<<[<<]>>"), but the main interpreter loop was unchanged. In that experiment, the four level stack of the "binary BF self-interpreter" appeared to be converging towards an eigenratio of about 600. Once again, on a faster machine this would seem to indicate that a 5 level stack might be able to be run (with a very simple program "on top") in a few days or so.
More later hopefully, once I find a faster machine, or someone else produces a faster faster version! For now, I think I'm just about out of ideas, energy and enthusiasm in that department - not to mention skill. Even finding a way to do basically the same thing as cgbfi2.b but in a more compact implementation should make it a lot faster.
This will be short and sweet, because all I really want to do is to "release" my second version of a faster BF self-interpreter to the world at large.
The commented version of cgbfi2.b is probably easier to understand but here's a lean and mean version anyway!
>>>>+[->>>++>+>+++++++[<++++>>++<-]++>>+>+>+++++[>++>++
++++<<-]+>>>,<++[[>[->>]<[>>]<<-]<[<]<+>>[>]>[<+>-[[<+>
-]>]<[[[-]<]++<-[<+++++++++>[<->-]>>]>>]]<<]>[-]+<<[--[
[-]>>-<<<+>]>>[-<<<<[>+<-]+<<+[->[<+>>>>>>+<<<<<-]<[>+<
-]>>>>>>>+<[-[-[-[-[-[-[-[-[[-]>-<]>[-<<+++++++>>]<]]]>
[-]<]>[-<<+++>>]<]>[-<<+>>]<]>[-]<]>[-<<<<<<<+>>>>>>>]<
]>[-]<<<<<]>>>[<<+>>-]<+<[-[-[-[-[-[-[-[-[-[-[[-]>-<<<[
-]<<+>>]]]]>[-]<]>[-<<<[-]<<+++++++>>>]<]]]>[-]<]>[-<<<
[-]<<+++++++>>>]<]]>[-]<<<<<<[-]>>[-<<<[>+>>+<<<-]>[<+>
-]>>[-[-[[-]>>]>[<<[<+>>+<-]>[<+>-]+<<-[-[-[-[-[-[-[-[-
[-[-<+>]<+++++++++++++>>[-]>->-<<]]]>>[->>>>>]<<]>>[-<<
<++++++++++++>>[-]>>-]<<]>>[->>>>>]<<]>>[-<<<++++++++++
+>>[-]>>-]<<]>>[-<<<++++++++++>>[-]>>-]<<]]>>[->>>>>]<<
]<]]>[>>]<<<]>>+>>>]<<]>>[->+>]<<<]<<[<<]>>[[<+>>+<-]+<
-[-[-[-[-[-[-[-[-[-[-[-[-[-[-[->->>[>>]>>[>>]<[<-<<-<]<
[<<]<<[<<]<]>[->>[>>]>>[>>]<[>+>>+>]<[<<]<<[<<]]<]>[->>
[>>]>>[>>]<[-]<[<<]<<[<<]]<]>[->>[>>]>>[>>]<[<-<]<[<<]<
<[<<]]<]>[->>[>>]>>[>>]<[>+>]<[<<]<<[<<]]<]>[->>[>>]>>[
>>]<<-<<-<<[<<]<<[<<]]<]>[->>[>>]>>[>>]+>>+[<<]<<[<<]]<
]>[->>[>>]>>[>>]<++<[<<]<<[<<]]<]>[->>[>>]>>[>>]<+<[<<]
<<[<<]]<]>[->>[>>]>>[>>]<,<[<<]<<[<<]]<]>[->>[>>]>>[>>]
<-<[<<]<<[<<]]<]>[->>[>>]>>[>>]<.<[<<]<<[<<]]<]>[->>[>>
]>>[>>]<<-<<[<<]<<[<<]]<]>[->>[>>]>>[>>]+[<<]<<[<<]]<]>
[->>[>>]>>[>>]<[>+>>+<<<-]>[<+>-]>>[<<+>>[-]]+<<[>>-<<-
]>>[<<+>>>>+<<-]>>[<<+>>-]<<[>>+<<-]+>>[<<->>-]<<<<[-<<
[<<]<<[<<]<<<<<++>>]>>[-<<<<<<[<<]<<[<<]<]>]<]>[->>[>>]
>>[>>]<[>+>>+<<<-]>[[<+>-]>>[-]+<<]>>[<<+>>>>+<<-]>>[<<
+>>-]<<[>>+<<-]+>>[<<->>-]<<<<[-<<[<<]<<[<<]<<<<<+>>]>>
[-<<<<<<[<<]<<[<<]<]>]>[<+>-]<<<<<<[>>+<<-[->>->>+[>>>[
-<+>>+<]+<-[-[[-]>-<]>[-<<<+>>>]<]>[-<<<->>>]>[-<+>]<<<
<[>>+<<-]>>]<<<<<<]>>[-<<+[>>>[-<+>>+<]+<-[-[[-]>-<]>[-
<<<->>>]<]>[-<<<+>>>]>[-<+>]<<<<[<<+>>-]<<]]]>>>>>>>]
Isn't it lovely? :-)
In a nutshell, cgbfi2.b is a derivative of cgbfi1.b that replaces "++", ">>", "<<", "[-]", "[>]", "[<]", "[>>]" and "[<<]" sequences in its input program with some new single value codes internally. This means those sequences are treated as single instructions in the main interpreter loop, thus requiring only one return trip between the emulated code and data in each case instead of (often) many more. In addition this strategy reduces the size of memory footprint of the emulated code, which also helps.
Of course, cgbfi2.b is a lot larger than cgbfi1.b (1758 instructions versus 950 once the comments are stripped out) but cgbfi2.b effectively sees itself as a 1178 instruction program after those common sequences described above are "compiled" to single instructions in memory. 1178 is still larger than 950 but the other gains made by the reduced trips to the data and back at runtime generally make this new version much faster. The other negative is that this new interpreter does more work when loading a program (it was hard work recognising those few short sequences!) so it can be slower than other interpreters when running some shorter/simpler programs. But for larger and more complex programs (including itself!) it is definitely much faster.
In terms of finding an eigenratio, it appears that this new interpreter might have an eigenratio of somewhere in the "low thousands". This guesstimate is again only based on results from running a few very small programs on a three level deep stack of the interpreter. But if this is not too far from the real value then it seems a fast PC might manage to run a four level stack in a few days. I'll have to find a faster machine than mine to test this with!
However I did manage to run a 4 level deep stack with a slight "cheat" about 10 days ago when I was experimenting with different strategies. The "cheat" was to create a version of the interpreter that expects its input to already be converted to binary values 1-8 instead of the original "][><.-,+" BF instructions, and values 9-16 for the "virtual instructions", plus 0 for the end of input marker. This means the loader routine was extremely short and simple (something like ">>>>>>+[-<,[>>>+<]>]>+<<<<[<<]>>"), but the main interpreter loop was unchanged. In that experiment, the four level stack of the "binary BF self-interpreter" appeared to be converging towards an eigenratio of about 600. Once again, on a faster machine this would seem to indicate that a 5 level stack might be able to be run (with a very simple program "on top") in a few days or so.
More later hopefully, once I find a faster machine, or someone else produces a faster faster version! For now, I think I'm just about out of ideas, energy and enthusiasm in that department - not to mention skill. Even finding a way to do basically the same thing as cgbfi2.b but in a more compact implementation should make it a lot faster.
0 Comments:
Post a Comment
<< Home