Autopsy is a debugging tool for RETRO. This is a fresh implementation, written for RETRO 12.
In implementing this, I identified the core elements I wanted:
- the ability to study memory
  - dumps
  - disassembly
- the ability to edit memory
  - provided by RETRO already via fetch/store and the assembler
- the ability to run a word in a sandbox
- the ability to single step through a word
- the ability to profile instruction frequency
- the ability to watch specific variables and memory locations
This is more ambitious than my prior debuggers. But as I intend to use RETRO 12 for years to come, it'll be a necessary and worthwhile tool.
So some background on the internals.
RETRO runs on a virtual machine called Nga. The instruction set is MISC inspired, consisting of just 27 instructions:
     0  nop        7  jump      14  gt        21  and
     1  lit <v>    8  call      15  fetch     22  or
     2  dup        9  ccall     16  store     23  xor
     3  drop      10  return    17  add       24  shift
     4  swap      11  eq        18  sub       25  zret
     5  push      12  neq       19  mul       26  end
     6  pop       13  lt        20  divmod
Four instructions are packed per 32-bit memory location. The assembler allows the instructions to be specified like:
    'lica.... i
    #100 d
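To make the packing concrete, here is a minimal model in Python (not RETRO, and not part of Autopsy). The helper name `pack` is hypothetical, and the sketch assumes the first instruction of a bundle occupies the low byte of the cell, which matches the order in which `unpack` reads the bytes back.

```python
# Illustrative model only; `pack` is a hypothetical name, not a RETRO word.
def pack(a, b=0, c=0, d=0):
    """Pack up to four opcodes into one 32-bit cell, first opcode in the low byte."""
    for op in (a, b, c, d):
        if not 0 <= op <= 255:
            raise ValueError("each opcode must fit in one byte")
    return a | (b << 8) | (c << 16) | (d << 24)

# 'lica....' is lit (1), call (8), nop (0), nop (0)
cell = pack(1, 8, 0, 0)
print(cell)  # 2049
```

Under that assumption, the `#100 d` line that follows the bundle is the cell that `lit` pushes before `call` runs.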
I shorten the instructions to two-letter abbreviations, with `..` for `nop`, and then construct a string containing all of them. This string will be used to resolve names. The `??` at the end will be used for unidentified instructions.
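The lookup this describes can be sketched in Python as follows. The abbreviations are an assumption: they follow the usual Nga assembler spellings in opcode order, and only `li`, `ca`, `ad` and `..` are confirmed by the examples in this text. `NAMES` and `name_for` here are stand-ins for illustration, not the RETRO definitions.

```python
# Assumed abbreviations, one per opcode (0..26), with '??' as the final entry.
ABBREVIATIONS = (
    ".. li du dr sw pu po ju ca cc re eq ne lt gt "
    "fe st ad su mu di an or xo sh zr en ??"
).split()

# One flat string, two characters per instruction, as the text describes.
NAMES = "".join(ABBREVIATIONS)

def name_for(opcode):
    """Return the two-letter abbreviation for an opcode, or '??' if unknown."""
    if not 0 <= opcode <= 26:
        opcode = 27  # index of the trailing '??' entry
    return NAMES[opcode * 2 : opcode * 2 + 2]

print(name_for(1), name_for(17), name_for(99))  # li ad ??
```

Keeping everything in one string means a name is just a two-character slice at `opcode * 2`, so no separate table of strings is needed.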
Since instructions are packed, I need to unpack them before I can run the individual instructions. I implement `unpack` for this.
~~~
{{
  :mask #255 and ;
  :next #8 shift ;
---reveal---
  :unpack (n-dcba)
    dup mask swap next
    dup mask swap next
    dup mask swap next
    'abcd 'dcba reorder ;
}}
~~~
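For readers less familiar with the stack shuffling, the same unpacking can be modelled in Python. This is a sketch rather than a translation: it returns a list in execution order (low byte first), whereas the RETRO word leaves the first instruction on top of the stack.

```python
def unpack(cell):
    """Split a cell into four opcodes, in execution order (low byte first)."""
    a = cell & 255
    b = (cell >> 8) & 255
    c = (cell >> 16) & 255
    d = cell >> 24  # like the RETRO word, the last value is not masked
    return [a, b, c, d]

print(unpack(2049))  # [1, 8, 0, 0]
```

Because the final value is only shifted, not masked, a negative cell yields a negative fourth value, which a later validity check can reject.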
Now it's possible to write words to display instruction bundles. The formats are kept simple. For a bundle with `lit / lit / add / lit`, this will display either the opcodes (`1,1,17,1`) or a string with the abbreviations (`liliadli`).
If the value corresponds to a word in the `Dictionary`, the disassembler will display a message indicating the possible name that corresponds to the value.
To begin, I'll add a variable to track the number of `li` instructions. (These require special handling, as each one pushes a value stored in a following cell to the stack.)
~~~
'LitCount var
~~~
I then wrap `name-for` with a simple check that increments `LitCount` as needed.
~~~
:name-for<counting-li> (n-cc)
  dup #1 eq? [ &LitCount v:inc ] if name-for ;
~~~
To actually display a bundle, I need to decide what it is. So I have a `valid?` word to look at each instruction and make sure all are actual instructions.
~~~
:valid? (n-f)
  unpack
  [ #0 #26 n:between? ] bi@ and
  [ [ #0 #26 n:between? ] bi@ and ] dip and ;
~~~
With this and the `LitCount`, I can determine how to render a bundle.
I split out each type (instruction, reference/raw, and data) into a separate handler.
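One reading of how the validity check and the `li` count combine is sketched below in Python. The handlers themselves are not shown here, so this is an assumption about their logic: the function `classify` and the labels `inst`, `raw` and `data` are hypothetical, and the sketch does not separate references from other raw values.

```python
def unpack(cell):
    """Split a cell into four opcodes, low byte first (last value unmasked)."""
    return [cell & 255, (cell >> 8) & 255, (cell >> 16) & 255, cell >> 24]

def valid(cell):
    """True when all four unpacked values are real opcodes (0..26)."""
    return all(0 <= op <= 26 for op in unpack(cell))

def classify(cells):
    """Label each cell 'inst', 'raw' or 'data' for rendering."""
    lit_count = 0
    labels = []
    for cell in cells:
        if lit_count > 0:
            labels.append("data")  # operand of an earlier li
            lit_count -= 1
        elif valid(cell):
            labels.append("inst")
            lit_count += unpack(cell).count(1)  # each li uses one later cell
        else:
            labels.append("raw")
    return labels

# lica.... / 100 / re......
print(classify([2049, 100, 10]))  # ['inst', 'data', 'inst']
```

Checking the pending `li` count first matters: a literal such as 10 would otherwise pass the validity test and be shown as a `return` bundle.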
Ok, now on to the fun bit: execution trace and single stepping through a word.
This entails writing an implementation of Nga in RETRO. So to start, set up space for the data and address ("return") stacks, as well as variables for the stack pointers and instruction pointer.
~~~
'DataStack d:create #1024 allot
'ReturnStack d:create #1024 allot
'SP var
'RP var
'IP var
~~~
Next, helpers to push values from the real stacks to the simulated ones. The stack pointer will point to the next available cell, not the actual top element.
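The pointer convention can be illustrated with a small Python model. This is not the RETRO code: the class `SimStack` and its method names are hypothetical, and the only thing taken from the text is that the pointer addresses the next free cell.

```python
class SimStack:
    """Fixed-size stack whose pointer indexes the next free cell."""

    def __init__(self, size=1024):
        self.cells = [0] * size
        self.sp = 0  # next available cell, not the top element

    def push(self, value):
        if self.sp >= len(self.cells):
            raise IndexError("simulated stack overflow")
        self.cells[self.sp] = value  # store first...
        self.sp += 1                 # ...then advance

    def pop(self):
        if self.sp == 0:
            raise IndexError("simulated stack underflow")
        self.sp -= 1                 # step back first...
        return self.cells[self.sp]   # ...then read

stack = SimStack()
stack.push(10)
stack.push(20)
print(stack.pop(), stack.sp)  # 20 1
```

With this convention the pointer value doubles as the stack depth, which is convenient when the tracer needs to know whether the address stack is empty.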
One more helper, `[IP]`, returns the value in memory at the location `IP` points to.
~~~
:[IP] @IP fetch ;
~~~
Now for the instructions. Taking a cue from the C implementation, I have a separate word for each instruction and then a jump table of addresses that point to these.
With the populated table of instructions, implementing a `process-single-opcode` is easy. This will check the instruction to make sure it's valid, then call the corresponding handler in the instruction table. If not valid, this will report an error.
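The table-driven dispatch can be sketched in Python as below. This is an illustrative model, not the RETRO source: only five of the 27 handlers are filled in, the `VM` class and `op_*` names are hypothetical, and raising `ValueError` stands in for whatever error report the real word produces.

```python
# Illustrative model only: a handful of Nga instructions, one handler each.
class VM:
    def __init__(self, memory):
        self.memory = memory  # list of cells
        self.ip = 0           # instruction pointer
        self.data = []        # data stack (a Python list, for brevity)

def op_nop(vm):
    pass

def op_lit(vm):
    vm.ip += 1                        # the literal lives in the next cell
    vm.data.append(vm.memory[vm.ip])

def op_dup(vm):
    vm.data.append(vm.data[-1])

def op_drop(vm):
    vm.data.pop()

def op_add(vm):
    b = vm.data.pop()
    a = vm.data.pop()
    vm.data.append(a + b)

def op_todo(vm):
    raise NotImplementedError("handler left out of this sketch")

# Jump table indexed by opcode number (0..26).
TABLE = [op_todo] * 27
TABLE[0], TABLE[1], TABLE[2], TABLE[3], TABLE[17] = (
    op_nop, op_lit, op_dup, op_drop, op_add)

def process_single_opcode(vm, opcode):
    """Validate the opcode, then dispatch through the table."""
    if not 0 <= opcode <= 26:
        raise ValueError(f"invalid instruction {opcode} at {vm.ip}")
    TABLE[opcode](vm)

# The bundle li / li / ad / .. followed by its two literal cells.
vm = VM([0, 2, 3])
for opcode in (1, 1, 17, 0):
    process_single_opcode(vm, opcode)
print(vm.data)  # [5]
```

Indexing a table by opcode keeps the validity check to a single range test and avoids a long chain of comparisons.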
And then wrap `step` with `times` to run multiple steps.
~~~
:steps (n-)
&step times ;
~~~
Then on to the tracer. This will `step` through execution until the word returns. I use a similar approach to how I handle this in the interface layers for RETRO (word execution ends when the address stack depth reaches zero).
The `trace` word will reset the step counter and display the number of steps used.
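The stopping rule can be shown with a deliberately abstract Python sketch. It assumes nothing about the instruction set: `step` and `depth` are caller-supplied functions (hypothetical names), and the toy machine below only simulates calls and returns.

```python
def trace(step, depth):
    """Call step() until depth() reports an empty address stack.

    Returns the number of steps taken.
    """
    count = 0
    while depth() > 0:
        step()
        count += 1
    return count

# Toy machine: entering the traced word leaves one entry on the address stack.
address_stack = [0]
script = iter(["call", "return", "return"])

def toy_step():
    action = next(script)
    if action == "call":
        address_stack.append(0)  # nested word: depth grows
    elif action == "return":
        address_stack.pop()      # leaving a word: depth shrinks

print(trace(toy_step, lambda: len(address_stack)))  # 3
```

Nested calls are handled for free: the loop only ends when the return that matches the original entry has executed.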