From e92150f1f2b08be52f5e4c21953dc9407155d92c Mon Sep 17 00:00:00 2001 From: crc Date: Thu, 23 May 2024 19:45:26 +0200 Subject: [PATCH] add specification document --- spec.txt | 465 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 465 insertions(+) create mode 100644 spec.txt diff --git a/spec.txt b/spec.txt new file mode 100644 index 0000000..3e89c90 --- /dev/null +++ b/spec.txt @@ -0,0 +1,465 @@ +================================================================ + +crc's + _ _ _ + (_) | ___ ___ ___ _ __ ___ _ __ _ _| |_ ___ _ __ + | | |/ _ \ / __/ _ \| '_ ` _ \| '_ \| | | | __/ _ \ '__| + | | | (_) | | (_| (_) | | | | | | |_) | |_| | || __/ | + |_|_|\___/ \___\___/|_| |_| |_| .__/ \__,_|\__\___|_| + |_| + +================================================================ + +ilo is a virtual computer implementing a dual stack processor +with a simple instruction set and a few i/o devices. It's small +and can be easily implemented and/or extended by an individual +programmer. + +# limitations + ++---------------+-----------------------+ +| Item | Size | ++===============+=======================+ +| memory | 65,536 cells | +| data stack | 32 numbers | +| address stack | 256 numbers | +| cell | 32-bit signed integer | +| min value | -2,147,483,647 | +| max value | 2,147,483,646 | ++---------------+-----------------------+ + +# memory model + +ilo provides a memory space consisting of 32-bit, signed +integer values. The first address is mapped to zero, with +subsequent values following in a strictly linear fashion. + +Each addressable 32-bit unit is called a *cell*. There is no +support in the instruction set for accessing values larger or +smaller than a single cell. + +# registers + +ilo maintains an internal `instruction pointer` register. This +document will refer to it as IP. There may also be internal +stack pointers or other registers. + +No registers are exposed via the instruction set. + +# image file + +On startup, ilo will load an *image file* ("rom"). This is a +flat, linear sequence of signed integer values. On disk images +are stored in little endian format. + +The first value loaded is mapped to address zero, and subsequent +values are loaded to sequential addresses in memory. + +An image file does not contain copies of the stacks or any +internal registers. + +# endian + +The image and block files are stored in little endian format. + +# stacks + +There are two stacks, one for general data and one for holding +return addresses for calls. The return (or address) stack can +be used to temporarily hold values as well. + +Stacks are LIFO (last in, first out). + +# instruction set + +ilo has 30 instructions. In short, these are: + ++----+----+--------+-------------------------------------------+ +| Op | Nm | Stack | Description | ++====+====+========+===========================================+ +| 00 | .. | - | non-op | +| 01 | li | -n | push value in following cell to stack | +| 02 | du | n-nn | duplicate top stack item | +| 03 | dr | n- | discard top stack item | +| 04 | sw | ab-ba | swap top two stack items | +| 05 | pu | n- | move top stack item to address stack | +| 06 | po | -n | move top address stack item to data stack | +| 07 | ju | a- | jump to an address | +| 08 | ca | a- | call a function | +| 09 | cc | fa- | call a function if the flag is non-zero | +| 10 | cj | fa- | jump to a function if the flag is non-zero| +| 11 | re | - | return from a call or conditional call | +| 12 | eq | ab-f | compare two values for equality. a == b | +| 13 | ne | ab-f | compare two values for inequality. a != b | +| 14 | lt | ab-f | compare two values for less than. a < b | +| 15 | gt | ab-f | compare two values for greater than. a > b| +| 16 | fe | a-n | fetch a stored value in memory | +| 17 | st | na- | store a value into memory | +| 18 | ad | ab-c | add two numbers. a + b | +| 19 | su | ab-c | subtract two numbers. a - b | +| 20 | mu | ab-c | multiply two numbers. a * b | +| 21 | di | ab-cd | divide and get remainder. a % b, a / b | +| 22 | an | ab-c | bitwise and | +| 23 | or | ab-c | bitwise or | +| 24 | xo | ab-c | bitwise xor | +| 25 | sl | ab-c | shift left. a << b | +| 26 | sr | ab-c | shift right. a >> b | +| 27 | cp | sdn-f | compare two memory regions | +| 28 | cy | sdn- | copy memory | +| 29 | io | n- | perform i/o operation | ++----+----+--------+-------------------------------------------+ + +A condensed summary table: + + Opode Instruction Names Data Stack Effects + ===== ================= ==================================== + 00-05 .. li du dr sw pu - -n n-nn n- nm-mn n- + 06-11 po ju ca cc cj re -n a- a- fa- fa- - + 12-17 eq ne lt gt fe st nn-f nn-f nn-f nn-f a-n na- + 18-23 ad su mu di an or nn-n nn-n nn-n nn-nn nn-n nn-n + 24-29 xo sl sr cp cy io nn-n nn-n nn-n nnn- nnn- n- + ===== ================= ==================================== + +And in detail: + ++--------+------+----------------------------------------------+ +| Opcode | Name | Action | ++========+======+==============================================+ +| 00 | .. | Non-op. Does nothing; used for padding in | +| | | instruction bundles | ++--------+------+----------------------------------------------+ +| 01 | li | Push value in the following cell to the stack| +| | | | +| | | Increment IP, then fetch the value in memory | +| | | at IP and place on the stack. | +| | | | +| | | ip += 1; | +| | | push(memory[ip]); | ++--------+------+----------------------------------------------+ +| 02 | du | Duplicate the top value on the stack. | +| | | | +| | | sp += 1; | +| | | data[sp] = data[sp - 1]; | +| | | | +| | | If there is not a value on the stack an | +| | | underflow occurs. If there are 32 an | +| | | overflow occurs. | ++--------+------+----------------------------------------------+ +| 03 | dr | Discard the top stack item. | +| | | | +| | | sp -= 1; | +| | | | +| | | If no items are on the stack, an underflow | +| | | occurs. | ++--------+------+----------------------------------------------+ +| 04 | sw | Swap the top two items on the stack. | +| | | | +| | | a = data[sp]; | +| | | b = data[sp - 1]; | +| | | data[sp - 1] = a; | +| | | data[sp] = b; | +| | | | +| | | If there are not at least two values on the | +| | | stack, and underflow occurs. | ++--------+------+----------------------------------------------+ +| 05 | pu | Move the top stack item to the address stack.| +| | | | +| | | rp += 1; | +| | | addresses[rp] = data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 06 | po | Move the top address stack item to the data | +| | | stack. | +| | | | +| | | sp += 1; | +| | | data[sp] = addresses[rp]; | +| | | rp -= 1; | ++--------+------+----------------------------------------------+ +| 07 |ju | jump to an address | +| | | modifies the instruction pointer | +| | | | +| | | Due to how the execution cycle works | +| | | (IP is incremented at the end of the | +| | | instruction bundle processing), the | +| | | address needs to be ajusted to account | +| | | for this. | +| | | | +| | | ip = data[sp] - 1; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 08 | ca | call a function | +| | | modifies the instruction pointer | +| | | | +| | | A call is like the `ju`mp instruction, but | +| | | adds the original IP to the address stack. | +| | | | +| | | rp += 1; | +| | | address[rp] = ip; | +| | | ip = data[sp] - 1; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 09 | cc | call a function if the flag is non-zero | +| | | modifies the instruction pointer | +| | | | +| | | Conditional calls are like `ca`lls, but | +| | | factor in a flag. | +| | | | +| | | target = pop(); | +| | | if (pop() != 0) { | +| | | rp += 1; | +| | | address[rp] = ip; | +| | | ip = target - 1; | +| | | } | ++--------+------+----------------------------------------------+ +| 10 | cj | jump to a function if the flag is non-zero | +| | | modifies the instruction pointer | +| | | | +| | | Conditional jumps are like `ju`mp, but | +| | | factor in a flag. | +| | | | +| | | target = pop(); | +| | | if (pop() != 0) { | +| | | ip = target - 1; | +| | | } | ++--------+------+----------------------------------------------+ +| 11 | re | return from a call or conditional call | +| | | modifies the instruction pointer | +| | | | +| | | ip = address[rp]; | +| | | rp -= 1; | ++--------+------+----------------------------------------------+ +| 12 | eq | compare two values for equality | +| | | | +| | | if (data[sp - 1] == data[sp]) | +| | | data[sp - 1] = -1; | +| | | else | +| | | data[sp - 1] = 0; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 13 | ne | compare two values for inequality. | +| | | | +| | | if (data[sp - 1] != data[sp]) | +| | | data[sp - 1] = -1; | +| | | else | +| | | data[sp - 1] = 0; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 14 | lt | compare two values for less than. | +| | | | +| | | if (data[sp - 1] < data[sp]) | +| | | data[sp - 1] = -1; | +| | | else | +| | | data[sp - 1] = 0; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 15 | gt | compare two values for greater than. | +| | | | +| | | if (data[sp - 1] > data[sp]) | +| | | data[sp - 1] = -1; | +| | | else | +| | | data[sp - 1] = 0; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 16 | fe | fetch a stored value in memory | +| | | | +| | | data[sp] = memory[data[sp]]; | ++--------+------+----------------------------------------------+ +| 17 | st | store a value into memory | +| | | | +| | | memory[data[sp]] = data[sp - 1]; | +| | | sp -= 2; | ++--------+------+----------------------------------------------+ +| 18 | ad | add two numbers | +| | | | +| | | data[sp - 1] += data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 19 | su | subtract two numbers | +| | | | +| | | data[sp - 1] -= data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 20 | mu | multiply two numbers | +| | | | +| | | data[sp - 1] *= data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 21 | di | divide and get remainder. | +| | | | +| | | Take two values from the data stack. The top | +| | | item is the dividend, and the second item is | +| | | the divisor. Perform the division, and push | +| | | the quotient and remainder to the stack. | +| | | | +| | | After execution the quotient will be on top, | +| | | with the remainder below it. | +| | | | +| | | *Division is symmetric, not floored*. | +| | | | +| | | a = data[sp]; | +| | | b = data[sp - 1]; | +| | | data[sp] = b / a; | +| | | data[sp - 1] = b % a; | ++--------+------+----------------------------------------------+ +| 22 | an | bitwise and | +| | | | +| | | data[sp - 1] = data[sp - 1] & data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 23 | or | bitwise or | +| | | | +| | | data[sp - 1] = data[sp - 1] | data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 24 | xo | bitwise xor | +| | | | +| | | data[sp - 1] = data[sp - 1] ^ data[sp]; | +| | | sp -= 1; | ++--------+------+----------------------------------------------+ +| 25 | sl | shift bits left | +| | | | +| | | data[sp - 1] = data[sp - 1] << data[sp]; | +| | | sp -= 1; | +| | | | +| | | The results of a negative shift are | +| | | implementation specific. | +| | | | +| | | The results of large shifts are implementaion| +| | | specific. | ++--------+------+----------------------------------------------+ +| 26 | sr | shift bits right | +| | | | +| | | data[sp - 1] = data[sp - 1] >> data[sp]; | +| | | sp -= 1; | +| | | | +| | | The results of a negative shift are | +| | | implementation specific. | +| | | | +| | | The results of large shifts are implementaion| +| | | specific. | ++--------+------+----------------------------------------------+ +| 27 | cp | compare two memory regions | +| | | | +| | | len = data[sp]; | +| | | dest = data[sp - 1]; | +| | | src = data[sp - 2]; | +| | | flag = -1; | +| | | while (len) { | +| | | if (memory[dest] != memory[src]) flag = 0; | +| | | len -= 1; | +| | | src += 1; | +| | | dest += 1; | +| | | }; | +| | | sp -= 2; | +| | | data[sp] = flag; | ++--------+------+----------------------------------------------+ +| 28 | cy | copy memory | +| | | | +| | | len = data[sp]; | +| | | dest = data[sp - 1]; | +| | | src = data[sp - 2]; | +| | | while (len) { | +| | | memory[dest] = memory[src]; | +| | | len -= 1; | +| | | src += 1; | +| | | dest += 1; | +| | | }; | +| | | sp -= 3; | ++--------+------+----------------------------------------------+ +| 29 | io | perform i/o operation | +| | | | +| | | An ilo system must support a few specific | +| | | i/o devices. These are discussed in the i/o | +| | | section of this specification. | ++--------+------+----------------------------------------------+ + +# instruction bundling + +ilo allows up to four instructions to be packed into a single +memory location. These are processed in order. Instructions that +modify the instruction pointer (other than `li`) can not be +followed by anything other than a non-op to avoid unpredictable +behavior. + +Instructions are unpacked from right to left. E.g., in C, the +instruction processor can be written like: + + void process_opcode_bundle(int opcode) { + process(opcode & 0xFF); + process((opcode >> 8) & 0xFF); + process((opcode >> 16) & 0xFF); + process((opcode >> 24) & 0xFF); + } + +Where `process()` takes and executes a single instruction. + +# instruction processing + +IP is set to zero, and execution begins. For each cycle, the +opcode bundle at IP in memory is executed. At the end of each +cycle, IP is incremented. Execution ends if IP exceeds the +length of memory. In C: + + while(ip < 65536) { process_opcode_bundle(memory[ip++]); } + +# I/O + +A standard ilo has a limited set of I/O devices. These are: + ++------------+--------------------------------------+----------+ +| I/O Device | Action | Required | ++============+======================================+==========+ +| 0 | Display a character. Consumes a | Yes | +| | character from the stack. | | +| 1 | Read a character from the input | Yes | +| | device. The character is pushed to | | +| | the stack. | | +| 2 | Read a block into memory. | Yes | +| 3 | Write memory to a block. | Yes | +| 4 | Write all memory to an image/rom. | No | +| 5 | Reload the image/rom, and jump to | Yes | +| | address 0. Also empty all stacks. | | +| 6 | End execution. On a hosted system, | No | +| | exit ilo. If native, suspend | | +| | execution. | | +| 7 | Obtain stack depths. Pushes the data | Yes | +| | depth then the address depth. | | ++------------+-------------------------------------------------+ + +I/O numbers below 12 are reserved. For custom I/O extensions, +use a number 12 or above. + +# Block Storage + +A generic block storage device is provided. + +Each block is 1,024 memory cells (4,096 bytes) in length. + +Reading a block: + + - push the block number + - push an address of the buffer in memory that will hold the + block contents after the read completes + - push the value 2 (read block contents) to the stack + - call I/O operation via the `io` instruction + +Writing a block: + + - push the block number + - push an address of the buffer in memory that holds the + data to write to the block + - push the value 3 (write block contents) to the stack + - call I/O operation via the `io` instruction + +# block file format + +An implementation of ilo is free to choose the best approach to +implementing the actual block storage. For the reference model, +a flat file is used, with each block being stored sequentially. + +To locate a block, multiply the block number by 4096 and seek +that offset into the file. Then read or write 4096 bytes, +packing or unpacking from memory as needed. + +As with the image, the block file in the reference model is +stored in little endian format.