add specification document

This commit is contained in:
crc 2024-05-23 19:45:26 +02:00
parent 764eb83a13
commit e92150f1f2

465
spec.txt Normal file
View file

@ -0,0 +1,465 @@
================================================================
crc's
_ _ _
(_) | ___ ___ ___ _ __ ___ _ __ _ _| |_ ___ _ __
| | |/ _ \ / __/ _ \| '_ ` _ \| '_ \| | | | __/ _ \ '__|
| | | (_) | | (_| (_) | | | | | | |_) | |_| | || __/ |
|_|_|\___/ \___\___/|_| |_| |_| .__/ \__,_|\__\___|_|
|_|
================================================================
ilo is a virtual computer implementing a dual stack processor
with a simple instruction set and a few i/o devices. It's small
and can be easily implemented and/or extended by an individual
programmer.
# limitations
+---------------+-----------------------+
| Item | Size |
+===============+=======================+
| memory | 65,536 cells |
| data stack | 32 numbers |
| address stack | 256 numbers |
| cell | 32-bit signed integer |
| min value | -2,147,483,647 |
| max value | 2,147,483,646 |
+---------------+-----------------------+
# memory model
ilo provides a memory space consisting of 32-bit, signed
integer values. The first address is mapped to zero, with
subsequent values following in a strictly linear fashion.
Each addressable 32-bit unit is called a *cell*. There is no
support in the instruction set for accessing values larger or
smaller than a single cell.
# registers
ilo maintains an internal `instruction pointer` register. This
document will refer to it as IP. There may also be internal
stack pointers or other registers.
No registers are exposed via the instruction set.
# image file
On startup, ilo will load an *image file* ("rom"). This is a
flat, linear sequence of signed integer values. On disk images
are stored in little endian format.
The first value loaded is mapped to address zero, and subsequent
values are loaded to sequential addresses in memory.
An image file does not contain copies of the stacks or any
internal registers.
# endian
The image and block files are stored in little endian format.
# stacks
There are two stacks, one for general data and one for holding
return addresses for calls. The return (or address) stack can
be used to temporarily hold values as well.
Stacks are LIFO (last in, first out).
# instruction set
ilo has 30 instructions. In short, these are:
+----+----+--------+-------------------------------------------+
| Op | Nm | Stack | Description |
+====+====+========+===========================================+
| 00 | .. | - | non-op |
| 01 | li | -n | push value in following cell to stack |
| 02 | du | n-nn | duplicate top stack item |
| 03 | dr | n- | discard top stack item |
| 04 | sw | ab-ba | swap top two stack items |
| 05 | pu | n- | move top stack item to address stack |
| 06 | po | -n | move top address stack item to data stack |
| 07 | ju | a- | jump to an address |
| 08 | ca | a- | call a function |
| 09 | cc | fa- | call a function if the flag is non-zero |
| 10 | cj | fa- | jump to a function if the flag is non-zero|
| 11 | re | - | return from a call or conditional call |
| 12 | eq | ab-f | compare two values for equality. a == b |
| 13 | ne | ab-f | compare two values for inequality. a != b |
| 14 | lt | ab-f | compare two values for less than. a < b |
| 15 | gt | ab-f | compare two values for greater than. a > b|
| 16 | fe | a-n | fetch a stored value in memory |
| 17 | st | na- | store a value into memory |
| 18 | ad | ab-c | add two numbers. a + b |
| 19 | su | ab-c | subtract two numbers. a - b |
| 20 | mu | ab-c | multiply two numbers. a * b |
| 21 | di | ab-cd | divide and get remainder. a % b, a / b |
| 22 | an | ab-c | bitwise and |
| 23 | or | ab-c | bitwise or |
| 24 | xo | ab-c | bitwise xor |
| 25 | sl | ab-c | shift left. a << b |
| 26 | sr | ab-c | shift right. a >> b |
| 27 | cp | sdn-f | compare two memory regions |
| 28 | cy | sdn- | copy memory |
| 29 | io | n- | perform i/o operation |
+----+----+--------+-------------------------------------------+
A condensed summary table:
Opode Instruction Names Data Stack Effects
===== ================= ====================================
00-05 .. li du dr sw pu - -n n-nn n- nm-mn n-
06-11 po ju ca cc cj re -n a- a- fa- fa- -
12-17 eq ne lt gt fe st nn-f nn-f nn-f nn-f a-n na-
18-23 ad su mu di an or nn-n nn-n nn-n nn-nn nn-n nn-n
24-29 xo sl sr cp cy io nn-n nn-n nn-n nnn- nnn- n-
===== ================= ====================================
And in detail:
+--------+------+----------------------------------------------+
| Opcode | Name | Action |
+========+======+==============================================+
| 00 | .. | Non-op. Does nothing; used for padding in |
| | | instruction bundles |
+--------+------+----------------------------------------------+
| 01 | li | Push value in the following cell to the stack|
| | | |
| | | Increment IP, then fetch the value in memory |
| | | at IP and place on the stack. |
| | | |
| | | ip += 1; |
| | | push(memory[ip]); |
+--------+------+----------------------------------------------+
| 02 | du | Duplicate the top value on the stack. |
| | | |
| | | sp += 1; |
| | | data[sp] = data[sp - 1]; |
| | | |
| | | If there is not a value on the stack an |
| | | underflow occurs. If there are 32 an |
| | | overflow occurs. |
+--------+------+----------------------------------------------+
| 03 | dr | Discard the top stack item. |
| | | |
| | | sp -= 1; |
| | | |
| | | If no items are on the stack, an underflow |
| | | occurs. |
+--------+------+----------------------------------------------+
| 04 | sw | Swap the top two items on the stack. |
| | | |
| | | a = data[sp]; |
| | | b = data[sp - 1]; |
| | | data[sp - 1] = a; |
| | | data[sp] = b; |
| | | |
| | | If there are not at least two values on the |
| | | stack, and underflow occurs. |
+--------+------+----------------------------------------------+
| 05 | pu | Move the top stack item to the address stack.|
| | | |
| | | rp += 1; |
| | | addresses[rp] = data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 06 | po | Move the top address stack item to the data |
| | | stack. |
| | | |
| | | sp += 1; |
| | | data[sp] = addresses[rp]; |
| | | rp -= 1; |
+--------+------+----------------------------------------------+
| 07 |ju | jump to an address |
| | | modifies the instruction pointer |
| | | |
| | | Due to how the execution cycle works |
| | | (IP is incremented at the end of the |
| | | instruction bundle processing), the |
| | | address needs to be ajusted to account |
| | | for this. |
| | | |
| | | ip = data[sp] - 1; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 08 | ca | call a function |
| | | modifies the instruction pointer |
| | | |
| | | A call is like the `ju`mp instruction, but |
| | | adds the original IP to the address stack. |
| | | |
| | | rp += 1; |
| | | address[rp] = ip; |
| | | ip = data[sp] - 1; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 09 | cc | call a function if the flag is non-zero |
| | | modifies the instruction pointer |
| | | |
| | | Conditional calls are like `ca`lls, but |
| | | factor in a flag. |
| | | |
| | | target = pop(); |
| | | if (pop() != 0) { |
| | | rp += 1; |
| | | address[rp] = ip; |
| | | ip = target - 1; |
| | | } |
+--------+------+----------------------------------------------+
| 10 | cj | jump to a function if the flag is non-zero |
| | | modifies the instruction pointer |
| | | |
| | | Conditional jumps are like `ju`mp, but |
| | | factor in a flag. |
| | | |
| | | target = pop(); |
| | | if (pop() != 0) { |
| | | ip = target - 1; |
| | | } |
+--------+------+----------------------------------------------+
| 11 | re | return from a call or conditional call |
| | | modifies the instruction pointer |
| | | |
| | | ip = address[rp]; |
| | | rp -= 1; |
+--------+------+----------------------------------------------+
| 12 | eq | compare two values for equality |
| | | |
| | | if (data[sp - 1] == data[sp]) |
| | | data[sp - 1] = -1; |
| | | else |
| | | data[sp - 1] = 0; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 13 | ne | compare two values for inequality. |
| | | |
| | | if (data[sp - 1] != data[sp]) |
| | | data[sp - 1] = -1; |
| | | else |
| | | data[sp - 1] = 0; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 14 | lt | compare two values for less than. |
| | | |
| | | if (data[sp - 1] < data[sp]) |
| | | data[sp - 1] = -1; |
| | | else |
| | | data[sp - 1] = 0; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 15 | gt | compare two values for greater than. |
| | | |
| | | if (data[sp - 1] > data[sp]) |
| | | data[sp - 1] = -1; |
| | | else |
| | | data[sp - 1] = 0; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 16 | fe | fetch a stored value in memory |
| | | |
| | | data[sp] = memory[data[sp]]; |
+--------+------+----------------------------------------------+
| 17 | st | store a value into memory |
| | | |
| | | memory[data[sp]] = data[sp - 1]; |
| | | sp -= 2; |
+--------+------+----------------------------------------------+
| 18 | ad | add two numbers |
| | | |
| | | data[sp - 1] += data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 19 | su | subtract two numbers |
| | | |
| | | data[sp - 1] -= data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 20 | mu | multiply two numbers |
| | | |
| | | data[sp - 1] *= data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 21 | di | divide and get remainder. |
| | | |
| | | Take two values from the data stack. The top |
| | | item is the dividend, and the second item is |
| | | the divisor. Perform the division, and push |
| | | the quotient and remainder to the stack. |
| | | |
| | | After execution the quotient will be on top, |
| | | with the remainder below it. |
| | | |
| | | *Division is symmetric, not floored*. |
| | | |
| | | a = data[sp]; |
| | | b = data[sp - 1]; |
| | | data[sp] = b / a; |
| | | data[sp - 1] = b % a; |
+--------+------+----------------------------------------------+
| 22 | an | bitwise and |
| | | |
| | | data[sp - 1] = data[sp - 1] & data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 23 | or | bitwise or |
| | | |
| | | data[sp - 1] = data[sp - 1] | data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 24 | xo | bitwise xor |
| | | |
| | | data[sp - 1] = data[sp - 1] ^ data[sp]; |
| | | sp -= 1; |
+--------+------+----------------------------------------------+
| 25 | sl | shift bits left |
| | | |
| | | data[sp - 1] = data[sp - 1] << data[sp]; |
| | | sp -= 1; |
| | | |
| | | The results of a negative shift are |
| | | implementation specific. |
| | | |
| | | The results of large shifts are implementaion|
| | | specific. |
+--------+------+----------------------------------------------+
| 26 | sr | shift bits right |
| | | |
| | | data[sp - 1] = data[sp - 1] >> data[sp]; |
| | | sp -= 1; |
| | | |
| | | The results of a negative shift are |
| | | implementation specific. |
| | | |
| | | The results of large shifts are implementaion|
| | | specific. |
+--------+------+----------------------------------------------+
| 27 | cp | compare two memory regions |
| | | |
| | | len = data[sp]; |
| | | dest = data[sp - 1]; |
| | | src = data[sp - 2]; |
| | | flag = -1; |
| | | while (len) { |
| | | if (memory[dest] != memory[src]) flag = 0; |
| | | len -= 1; |
| | | src += 1; |
| | | dest += 1; |
| | | }; |
| | | sp -= 2; |
| | | data[sp] = flag; |
+--------+------+----------------------------------------------+
| 28 | cy | copy memory |
| | | |
| | | len = data[sp]; |
| | | dest = data[sp - 1]; |
| | | src = data[sp - 2]; |
| | | while (len) { |
| | | memory[dest] = memory[src]; |
| | | len -= 1; |
| | | src += 1; |
| | | dest += 1; |
| | | }; |
| | | sp -= 3; |
+--------+------+----------------------------------------------+
| 29 | io | perform i/o operation |
| | | |
| | | An ilo system must support a few specific |
| | | i/o devices. These are discussed in the i/o |
| | | section of this specification. |
+--------+------+----------------------------------------------+
# instruction bundling
ilo allows up to four instructions to be packed into a single
memory location. These are processed in order. Instructions that
modify the instruction pointer (other than `li`) can not be
followed by anything other than a non-op to avoid unpredictable
behavior.
Instructions are unpacked from right to left. E.g., in C, the
instruction processor can be written like:
void process_opcode_bundle(int opcode) {
process(opcode & 0xFF);
process((opcode >> 8) & 0xFF);
process((opcode >> 16) & 0xFF);
process((opcode >> 24) & 0xFF);
}
Where `process()` takes and executes a single instruction.
# instruction processing
IP is set to zero, and execution begins. For each cycle, the
opcode bundle at IP in memory is executed. At the end of each
cycle, IP is incremented. Execution ends if IP exceeds the
length of memory. In C:
while(ip < 65536) { process_opcode_bundle(memory[ip++]); }
# I/O
A standard ilo has a limited set of I/O devices. These are:
+------------+--------------------------------------+----------+
| I/O Device | Action | Required |
+============+======================================+==========+
| 0 | Display a character. Consumes a | Yes |
| | character from the stack. | |
| 1 | Read a character from the input | Yes |
| | device. The character is pushed to | |
| | the stack. | |
| 2 | Read a block into memory. | Yes |
| 3 | Write memory to a block. | Yes |
| 4 | Write all memory to an image/rom. | No |
| 5 | Reload the image/rom, and jump to | Yes |
| | address 0. Also empty all stacks. | |
| 6 | End execution. On a hosted system, | No |
| | exit ilo. If native, suspend | |
| | execution. | |
| 7 | Obtain stack depths. Pushes the data | Yes |
| | depth then the address depth. | |
+------------+-------------------------------------------------+
I/O numbers below 12 are reserved. For custom I/O extensions,
use a number 12 or above.
# Block Storage
A generic block storage device is provided.
Each block is 1,024 memory cells (4,096 bytes) in length.
Reading a block:
- push the block number
- push an address of the buffer in memory that will hold the
block contents after the read completes
- push the value 2 (read block contents) to the stack
- call I/O operation via the `io` instruction
Writing a block:
- push the block number
- push an address of the buffer in memory that holds the
data to write to the block
- push the value 3 (write block contents) to the stack
- call I/O operation via the `io` instruction
# block file format
An implementation of ilo is free to choose the best approach to
implementing the actual block storage. For the reference model,
a flat file is used, with each block being stored sequentially.
To locate a block, multiply the block number by 4096 and seek
that offset into the file. Then read or write 4096 bytes,
packing or unpacking from memory as needed.
As with the image, the block file in the reference model is
stored in little endian format.