2018-12-28 05:58:29 +01:00
|
|
|
_
|
|
|
|
__ _ _ _| |_ ___ _ __ ___ _ _
|
|
|
|
/ _` | | | | __/ _ \| '_ \/ __| | | |
|
|
|
|
| (_| | |_| | || (_) | |_) \__ | |_| |
|
|
|
|
\__,_|\__,_|\__\___/| .__/|___/\__, |
|
|
|
|
|_| |___/
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
Autopsy is a set of debugging tools for RETRO.
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
# Background
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
RETRO runs on a virtual machine called Nga. The instruction set
|
|
|
|
is MISC inspired, consisting of just 30 instructions:
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
0 .. 5 pu 10 re 15 fe 20 di 25 zr
|
|
|
|
1 li 6 po 11 eq 16 st 21 an 26 en
|
|
|
|
2 du 7 ju 12 ne 17 ad 22 or 27 ie
|
|
|
|
3 dr 8 ca 13 lt 18 su 23 xo 28 iq
|
|
|
|
4 sw 9 cc 14 gt 19 mu 24 sh 29 ii
|
2018-12-28 05:58:29 +01:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
The first two characters of each instruction name are sufficient
|
|
|
|
to identify the instruction.
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Nga exposes memory as an array of 32-bit signed integers. Each
|
|
|
|
memory location can store four instructions. The assembler
|
|
|
|
expects the instructions to be named using their two character
|
|
|
|
identifiers. E.g.,
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2018-04-12 18:19:32 +02:00
|
|
|
'lica.... i
|
|
|
|
#100 d
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
# Disassembly
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
I use '..' for 'no(p)' and then construct a string with all of
|
|
|
|
these. This will be used to resolve names. The ?? at the end
|
|
|
|
will be used for unidentified instructions.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
2018-12-28 05:22:19 +01:00
|
|
|
'..lidudrswpupojucaccreeqneltgtfestadsumudianorxoshzrenieiqii??
|
2018-04-12 18:19:32 +02:00
|
|
|
'INST s:const
|
|
|
|
~~~
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Since instructions are packed, I need to unpack them before I
|
|
|
|
can run or display the individual instructions. I implement
|
|
|
|
`unpack` for this.
|
2017-10-16 18:09:39 +02:00
|
|
|
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
2017-10-16 18:09:39 +02:00
|
|
|
{{
|
|
|
|
:mask #255 and ;
|
|
|
|
:next #8 shift ;
|
|
|
|
---reveal---
|
|
|
|
:unpack (n-dcba)
|
|
|
|
dup mask swap next
|
|
|
|
dup mask swap next
|
|
|
|
dup mask swap next
|
2018-04-12 18:19:32 +02:00
|
|
|
'abcd 'dcba reorder ;
|
2017-10-16 18:09:39 +02:00
|
|
|
}}
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Now it's possible to write words to display instruction
|
|
|
|
bundles. The formats are kept simple. For a bundle with
|
|
|
|
`lit / lit / add / lit`, this will display either the
|
|
|
|
opcodes (`1,1,17,1`) or a string with the abbreviations
|
|
|
|
(`liliadli`).
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:name-for (n-cc)
|
2018-12-28 05:22:19 +01:00
|
|
|
#30 n:min #2 * &INST + fetch-next swap fetch swap ;
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
:display:bundle<raw> (n-)
|
2018-05-07 18:36:37 +02:00
|
|
|
unpack '%n,%n,%n,%n s:format s:put ;
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
:display:bundle<named> (n-)
|
2018-05-07 18:36:37 +02:00
|
|
|
unpack #4 [ name-for c:put c:put ] times ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
So now I'm ready to write an actual disassembler. I'll
|
|
|
|
provide an output formatted like this:
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
(address) 'instructionbundle i
|
|
|
|
(address) #value d (possibly_`reference`)
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
If the value corresponds to a word in the `Dictionary`, the
|
|
|
|
disassembler will display a message indicating the possible
|
|
|
|
name that corresponds to the value.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
To begin, I'll add a variable to track the number of `li`
|
|
|
|
instructions. (These require special handling as they push a
|
|
|
|
value in the following cells to the stack).
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
'LitCount var
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
I then wrap `name-for` with a simple check that increments
|
|
|
|
`LitCount` as needed.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:name-for<counting-li> (n-cc)
|
|
|
|
dup #1 eq? [ &LitCount v:inc ] if name-for ;
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
To actually display a bundle, I need to decide on what it is. So
|
|
|
|
I have a `validate` word to look at each instruction and make
|
|
|
|
sure all are actual instructions.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:valid? (n-f)
|
|
|
|
unpack
|
2018-12-28 05:22:19 +01:00
|
|
|
[ #0 #29 n:between? ] bi@ and
|
|
|
|
[ [ #0 #29 n:between? ] bi@ and ] dip and ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
With this and the `LitCount`, I can determine how to render
|
|
|
|
a bundle.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
I split out each type (instruction, reference/raw, and data) into
|
|
|
|
a separate handler.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:render-inst (n-)
|
2018-05-07 18:36:37 +02:00
|
|
|
$' c:put unpack #4 [ name-for<counting-li> c:put c:put ] times sp $i c:put ;
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
:render-data (n-)
|
2018-05-07 18:36:37 +02:00
|
|
|
$# c:put n:to-string s:put sp $d c:put ;
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
:render-ref (n-)
|
|
|
|
dup d:lookup-xt n:-zero?
|
2018-12-28 05:40:04 +01:00
|
|
|
[ dup render-data
|
|
|
|
d:lookup-xt d:name '\t\t(possibly\_`%s`) s:format s:put ]
|
2018-04-12 18:19:32 +02:00
|
|
|
[ render-data ] choose ;
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Then I use these and my `valid?` checker to implement a single
|
|
|
|
word to render the packed cell in a meaningful manner.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
2017-10-16 18:09:39 +02:00
|
|
|
:render-packed (n-)
|
2018-04-12 18:19:32 +02:00
|
|
|
@LitCount n:zero?
|
|
|
|
[ dup valid?
|
|
|
|
[ render-inst ]
|
|
|
|
[ render-ref ] choose ]
|
|
|
|
[ render-ref &LitCount v:dec ] choose ;
|
|
|
|
~~~
|
|
|
|
|
|
|
|
And now to tie it all together:
|
|
|
|
|
|
|
|
~~~
|
|
|
|
'TryToIdentifyWords var
|
2017-10-16 18:09:39 +02:00
|
|
|
|
|
|
|
:disassemble (an-)
|
2018-12-28 05:22:19 +01:00
|
|
|
#0 !LitCount
|
2018-04-12 18:19:32 +02:00
|
|
|
[
|
|
|
|
@TryToIdentifyWords
|
|
|
|
[ dup d:lookup-xt n:-zero?
|
2018-05-07 18:36:37 +02:00
|
|
|
[ dup d:lookup-xt d:name nl s:put nl ] if ] if
|
2018-04-12 18:19:32 +02:00
|
|
|
fetch-next
|
2019-04-10 14:07:01 +02:00
|
|
|
over $( c:put n:dec n:put $) c:put sp (address)
|
2018-04-12 18:19:32 +02:00
|
|
|
render-packed nl (inst_or_data)
|
2017-10-16 18:09:39 +02:00
|
|
|
] times drop ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
# Execution Trace and Single Stepper
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Ok, now on to the fun bit: execution trace and single stepping
|
|
|
|
through a word.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
This entails writing an implementation of Nga in RETRO. So to
|
|
|
|
start, setup space for the data and address ("return") stacks,
|
|
|
|
as well as variables for the stack pointers and instruction
|
|
|
|
pointer.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
2019-03-08 18:39:38 +01:00
|
|
|
'DataStack d:create #128 allot
|
2018-12-28 05:58:29 +01:00
|
|
|
'ReturnStack d:create #768 allot
|
2018-04-12 18:19:32 +02:00
|
|
|
'SP var
|
|
|
|
'RP var
|
|
|
|
'IP var
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
I also set up space for the actual RAM. When single stepping
|
|
|
|
or tracing, Autopsy will copy the actual RAM to this before
|
|
|
|
proceeding. Note that this is limited to 128K cells.
|
|
|
|
|
|
|
|
~~~
|
2019-03-08 22:49:32 +01:00
|
|
|
FREE #3 / dup 'IMAGE const
|
|
|
|
FREE swap - 'IMAGE-SIZE const
|
2019-03-08 21:32:42 +01:00
|
|
|
~~~
|
|
|
|
|
|
|
|
Next, helpers to push values from the real stacks to the
|
|
|
|
simulated ones. The stack pointer will point to the next
|
|
|
|
available cell, not the actual top element.
|
2017-10-25 14:06:01 +02:00
|
|
|
|
|
|
|
~~~
|
2019-02-15 19:51:35 +01:00
|
|
|
:>s (n-) &DataStack @SP + store &SP v:inc ;
|
|
|
|
:s> (-n) &SP v:dec &DataStack @SP + fetch ;
|
|
|
|
:>r (n-) &ReturnStack @RP + store &RP v:inc ;
|
|
|
|
:r> (-n) &RP v:dec &ReturnStack @RP + fetch ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
One more helper, `[IP]` will return the value in memory at the
|
|
|
|
location `IP` points to.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:[IP] @IP fetch ;
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Now for the instructions. I have a separate word for each
|
|
|
|
instruction and then a jump table of addresses that point to
|
|
|
|
these.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
2019-02-15 19:51:35 +01:00
|
|
|
:i:no ;
|
|
|
|
:i:li &IP v:inc [IP] >s ;
|
|
|
|
:i:du s> dup >s >s ;
|
|
|
|
:i:dr s> drop ;
|
2019-03-08 18:39:38 +01:00
|
|
|
:i:sw s> s> swap >s >s ;
|
2019-02-15 19:51:35 +01:00
|
|
|
:i:pu s> >r ;
|
|
|
|
:i:po r> >s ;
|
|
|
|
:i:ju s> n:dec !IP ;
|
|
|
|
:i:ca @IP >r i:ju ;
|
2019-03-08 18:39:38 +01:00
|
|
|
:i:cc s> s> [ >s i:ca ] [ drop ] choose ;
|
2019-02-15 19:51:35 +01:00
|
|
|
:i:re r> !IP ;
|
|
|
|
:i:eq s> s> eq? >s ;
|
|
|
|
:i:ne s> s> -eq? >s ;
|
|
|
|
:i:lt s> s> swap lt? >s ;
|
|
|
|
:i:gt s> s> swap gt? >s ;
|
2019-03-08 21:32:42 +01:00
|
|
|
:i:fe s> #-1 [ @SP >s ] case
|
|
|
|
#-2 [ @RP >s ] case
|
2019-03-08 22:49:32 +01:00
|
|
|
#-3 [ IMAGE-SIZE >s ] case
|
|
|
|
IMAGE + fetch >s ;
|
|
|
|
:i:st s> s> swap IMAGE + store ;
|
2019-02-15 19:51:35 +01:00
|
|
|
:i:ad s> s> + >s ;
|
|
|
|
:i:su s> s> swap - >s ;
|
|
|
|
:i:mu s> s> * >s ;
|
|
|
|
:i:di s> s> swap /mod swap >s >s ;
|
|
|
|
:i:an s> s> and >s ;
|
|
|
|
:i:or s> s> or >s ;
|
|
|
|
:i:xo s> s> xor >s ;
|
|
|
|
:i:sh s> s> swap shift >s ;
|
|
|
|
:i:zr s> dup n:zero? [ drop i:re ] [ >s ] choose ;
|
|
|
|
:i:en #0 !RP ;
|
|
|
|
:i:ie #1 >s ;
|
|
|
|
:i:iq #0 dup >s >s ;
|
|
|
|
:i:ii s> s> nip c:put ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
With the instructions defined, populate the jump table. The
|
|
|
|
order is crucial as the opcode number will be the index into
|
|
|
|
this table.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
'Instructions d:create
|
2018-12-28 05:58:29 +01:00
|
|
|
&i:no , &i:li , &i:du , &i:dr , &i:sw , &i:pu ,
|
|
|
|
&i:po , &i:ju , &i:ca , &i:cc , &i:re , &i:eq ,
|
|
|
|
&i:ne , &i:lt , &i:gt , &i:fe , &i:st , &i:ad ,
|
|
|
|
&i:su , &i:mu , &i:di , &i:an , &i:or , &i:xo ,
|
|
|
|
&i:sh , &i:zr , &i:en , &i:ie , &i:iq , &i:ii ,
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
With the populated table of instructions, implementing a
|
|
|
|
`process-single-opcode` is easy. This will check the
|
|
|
|
instruction to make sure it's valid, then call the
|
|
|
|
corresponding handler in the instruction table. If not
|
|
|
|
valid, this will report an error.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:process-single-opcode (n-)
|
2018-12-28 05:28:39 +01:00
|
|
|
dup #0 #29 n:between?
|
2018-04-12 18:19:32 +02:00
|
|
|
[ &Instructions + fetch call ]
|
2018-05-07 18:36:37 +02:00
|
|
|
[ 'Invalid_Instruction:_%n_! s:format s:put nl ] choose ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Next is to unpack an instruction bundle and process each
|
|
|
|
instruction.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
:process-packed-opcode (n-)
|
|
|
|
unpack
|
|
|
|
process-single-opcode
|
|
|
|
process-single-opcode
|
|
|
|
process-single-opcode
|
|
|
|
process-single-opcode ;
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
So the guts of the Nga-in-RETRO are done at this point. Next
|
|
|
|
is a method of stepping through execution of a word.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
This will display output indicating state. It'll provide:
|
|
|
|
|
|
|
|
- current memory location
|
|
|
|
- values in instruction bundle
|
|
|
|
- stack depths
|
|
|
|
- data stack before execution
|
|
|
|
- data stack after exection
|
|
|
|
|
|
|
|
E.g.,
|
|
|
|
|
|
|
|
IP:13966 SP:3 RP:1
|
|
|
|
[19,0,0,0] - mu......
|
|
|
|
Stack: 9 3 3 -> 9 9
|
|
|
|
|
|
|
|
So helpers for displaying things:
|
|
|
|
|
|
|
|
~~~
|
|
|
|
:display-data-stack
|
2019-03-08 21:32:42 +01:00
|
|
|
'DS:_ s:put
|
2018-05-07 18:36:37 +02:00
|
|
|
#0 @SP [ &DataStack over + fetch n:put sp n:inc ] times drop ;
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
:display-return-stack
|
2019-03-08 21:32:42 +01:00
|
|
|
'RS:_ s:put
|
2018-05-07 18:36:37 +02:00
|
|
|
#0 @RP [ &ReturnStack over + fetch n:put sp n:inc ] times drop ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
And then using the display helpers and instruction processor, a
|
|
|
|
single stepper. (This also updates a `Steps` counter)
|
2018-04-12 18:19:32 +02:00
|
|
|
|
|
|
|
~~~
|
|
|
|
'Steps var
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
:take-step
|
2019-02-15 19:51:35 +01:00
|
|
|
[IP] process-packed-opcode &IP v:inc
|
|
|
|
&Steps v:inc ;
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
{{
|
|
|
|
:sep #63 [ $- c:put ] times nl ;
|
|
|
|
:named-word? @IP d:lookup-xt n:-zero? ;
|
|
|
|
:show-name 0; drop @IP d:lookup-xt d:name s:put nl ;
|
|
|
|
:instruction [IP] [ unpack ] sip ;
|
|
|
|
:names #4 [ name-for<counting-li> c:put c:put ] times ;
|
|
|
|
:format '%n,%n,%n,%n s:format ;
|
|
|
|
:pad s:length #16 swap - #0 n:max [ sp ] times ;
|
|
|
|
:opcodes format dup s:put pad ;
|
|
|
|
:pad dup n:to-string s:length #6 swap - #0 n:max [ sp ] times ;
|
|
|
|
:header 'IP: s:put @IP pad n:put
|
|
|
|
@RP @SP '\tSP:%n_RP:%n\t s:format s:put ;
|
|
|
|
---reveal---
|
|
|
|
:details instruction drop names tab instruction [ opcodes ] dip sp n:put nl ;
|
|
|
|
:stacks display-data-stack nl display-return-stack nl ;
|
|
|
|
:step sep named-word? show-name header details take-step stacks ;
|
|
|
|
}}
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
|
|
|
And then wrap it with `times` to run multiple steps.
|
|
|
|
|
|
|
|
~~~
|
|
|
|
:steps (n-)
|
|
|
|
&step times ;
|
|
|
|
~~~
|
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
Then on to the tracer. This will `step` through execution until
|
|
|
|
the word returns. I use a similar approach to how I handle this
|
|
|
|
in the interface layers for RETRO (word execution ends when the
|
|
|
|
address stack depth reaches zero).
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 21:32:42 +01:00
|
|
|
The `trace` will empty the step counter and display the number
|
|
|
|
of steps used.
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-03-08 18:39:38 +01:00
|
|
|
~~~
|
|
|
|
:copy-image
|
2019-03-08 22:49:32 +01:00
|
|
|
#0 IMAGE IMAGE-SIZE here n:min copy ;
|
2019-03-08 18:39:38 +01:00
|
|
|
~~~
|
|
|
|
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
:trace (a-)
|
2019-03-08 18:39:38 +01:00
|
|
|
copy-image
|
2019-03-08 21:32:42 +01:00
|
|
|
#0 !Steps #0 !SP
|
2019-02-15 19:51:35 +01:00
|
|
|
!IP #0 >r
|
|
|
|
[ step @RP n:zero? @IP n:negative? or ] until
|
|
|
|
nl @Steps '%n_steps_taken\n s:format s:put ;
|
2018-04-12 18:19:32 +02:00
|
|
|
~~~
|
|
|
|
|
2018-12-28 05:58:29 +01:00
|
|
|
# Tests
|
2018-04-12 18:19:32 +02:00
|
|
|
|
2019-02-15 19:51:35 +01:00
|
|
|
```
|
2018-04-12 18:19:32 +02:00
|
|
|
:test
|
|
|
|
as{ 'liliaddu i #22 d #33 d }as
|
|
|
|
#3 #4 gt? [ #1 ] if ;
|
|
|
|
|
|
|
|
#0 #100 disassemble
|
2018-05-07 18:36:37 +02:00
|
|
|
nl '-------------------------- s:put nl
|
2018-04-12 18:19:32 +02:00
|
|
|
&TryToIdentifyWords v:on
|
|
|
|
#0 #100 disassemble
|
2019-02-15 19:51:35 +01:00
|
|
|
```
|
2019-03-08 21:32:42 +01:00
|
|
|
|