# RETRO: a Modern, Pragmatic Forth

Welcome to RETRO, my personal take on the Forth language. This
is a modern system primarily targeting desktop, mobile, and
servers, though it can also be used on some larger (ARM, MIPS32)
embedded systems.

The language is Forth. It is untyped, uses a stack to pass data
between functions called words, and has a dictionary which tracks
the word names and data structures.

But it's not a traditional Forth. RETRO draws influences from
many sources and takes a unique approach to the language.

RETRO has a large vocabulary of words. Keeping a copy of the
Glossary on hand is highly recommended as you learn to use RETRO.

This book will hopefully help you develop a better understanding
of RETRO and how it works.

# Building RETRO on BSD

## Requirements

- C compiler (tested: clang, tcc, gcc)
- make
- standard unix shell

## Process

Run `make`.

This will build the toolchain and then the main `retro`
executable.

## Executables

In the `bin/` directory:

    retro
    retro-unu
    retro-muri
    retro-extend
    retro-embedimage

# Building RETRO on Linux

## Requirements

- C compiler (tested: clang, tcc, gcc)
- make
- standard unix shell

## Process

Run `make -f Makefile.linux`.

This will build the toolchain and then the main `retro`
executable.

## Executables

In the `bin/` directory:

    retro
    retro-unu
    retro-muri
    retro-extend
    retro-embedimage

# Building RETRO on macOS

## Requirements

- C compiler (tested: clang, tcc, gcc)
- make
- standard unix shell

## Process

Run `make`.

This will build the toolchain and then the main `retro`
executable.

## Executables

In the `bin/` directory:

    retro
    retro-unu
    retro-muri
    retro-extend
    retro-embedimage

# Building RETRO on Windows

## C#: retro.cs

This is an implementation of `retro-repl` in C#. As with
`retro-repl`, it requires the `ngaImage` in the current
directory when starting.

Building:

    csc retro.cs

You'll need to make sure `CSC.EXE` is on your path, or
provide a full path to it. Something like this should
reveal the path to use:

    dir /s %WINDIR%\CSC.EXE

I've only tested building this using Microsoft's .NET tools.
It should also build and run under Mono.

# Starting RETRO

RETRO can be run for scripting or interactive use. To start it
interactively, run: `retro -i` or `retro -c`.

A summary of the available command line arguments:

    Scripting Usage:

        retro filename [script arguments...]

    Interactive Usage:

        retro [-h] [-i] [-c] [-s] [-f filename] [-t]

      -h           Display this help text
      -i           Interactive mode (line buffered)
      -c           Interactive mode (character buffered)
      -s           Suppress the 'ok' prompt and keyboard
                   echo in interactive mode
      -f filename  Run the contents of the specified file
      -t           Run tests (in ``` blocks) in any loaded files

# Basic Interactions

Start RETRO in interactive mode:

```
retro -i
```

You should see something similar to this:

    RETRO 12 (rx-2019.6)
    8388608 MAX, TIB @ 1025, Heap @ 9374

    Ok

At this point you are at the *listener*, which reads and
processes your input. You are now set to begin exploring
RETRO.

To exit, run `bye`:

```
bye
```

# Syntax

RETRO has more syntax than a traditional Forth due to ideas
borrowed from ColorForth and some design decisions. This has
some useful traits, and helps to make the language more
consistent.

## Tokens

Input is divided into a series of whitespace delimited tokens.
Each of these is then processed individually. There are no
parsing words in RETRO.

Tokens may have a single character *prefix*, which RETRO will
use to decide how to process the token.

## Prefixes

Prefixes are single characters added to the start of a token
to guide the compiler. The use of these is a major way in
which RETRO differs from traditional Forth.

When a token is passed to `interpret`, RETRO first takes the
initial character and looks to see if there is a `prefix:` word
that matches it. If so, it will pass the rest of the token to
that word to handle.

In a traditional Forth, the interpret process is something
like:

    get token
    is token in the dictionary?
      yes:
        is it immediate?
          yes: call the word.
          no:  are we interpreting?
               yes: call the word
               no:  compile a call to the word
      no:
        is it a number?
          yes: are we interpreting?
               yes: push the number to the stack
               no:  compile the number as a literal
          no:  report an error ("not found")

In RETRO, the interpret process is basically:

    get token
    does the first character match a `prefix:` word?
      yes: pass the token to the prefix handler
      no:  is token a word in the dictionary?
           yes: push the XT to the stack and call the
                class handler
           no:  report an error ("not found")

All of the actual logic for how to deal with tokens is moved
to the individual prefix handlers, and the logic for handling
words is moved to word class handlers.

This means that prefixes are used for a lot of things. Numbers?
Handled by a `#` prefix. Strings? Use the `'` prefix. Comments?
Use `(`. Making a new word? Use the `:` prefix.

The major prefixes are:

| Prefix | Used For                      |
| ------ | ----------------------------- |
| @      | Fetch from variable           |
| !      | Store into variable           |
| &      | Pointer to named item         |
| #      | Numbers                       |
| $      | ASCII characters              |
| '      | Strings                       |
| (      | Comments                      |
| :      | Define a word                 |

The individual prefixes will be covered in more detail in the
later chapters on working with different data types.

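
As a quick illustration of the prefixes listed in the table, the
following lines can be entered at the listener. This is only a
sketch: `Counter` is a name made up for the example, and it
assumes that `var`, `n:put`, `c:put`, `s:put`, and `nl` are
present, as they are in a standard RETRO 12 image.

```
(Comments_are_single_tokens)
#100 #200 + n:put nl
$a c:put nl
'Hello,_world! s:put nl
'Counter var
#5 !Counter
@Counter n:put nl
&Counter fetch n:put nl
```

The last two lines should display the same value, since
`@Counter` is a shorter way of writing `&Counter fetch`.
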
## Word Classes

Word classes are words which take a pointer and do something
with it.

# A Quick Tutorial

Programming in RETRO is all about creating words to solve
the problem at hand. Words operate on data, which can be
kept in memory or on the stack.

Let's look at this by solving a small problem: writing a
word to determine if a string is a palindrome.

A palindrome is a phrase which reads the same backward
and forward.

We first need a string to look at. Starting with something
easy:

```
'anna
```

Looking in the Glossary, there is a `s:reverse` word for
reversing a string. We can find `dup` to copy a value, and
`s:eq?` to compare two strings. So testing:

```
'anna dup s:reverse s:eq?
```

This yields -1 (`TRUE`) as expected. So we can easily
name it:

```
:palindrome? dup s:reverse s:eq? ;
```

Naming uses the `:` prefix to add a new word to the dictionary.
The words that make up the definition follow, with a final
word (`;`) ending the definition. We can then use this:

```
'anna palindrome?
```

Once defined there is no difference between our new word and
any of the words already provided by the RETRO system.

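
To check the new word against more than one input, something
like the following can be tried at the listener. It is a small
sketch that repeats the definition so it can be pasted on its
own; the `(s-f)` stack comment and the second test string are
additions for illustration.

```
:palindrome? (s-f) dup s:reverse s:eq? ;

'anna  palindrome? n:put nl
'hello palindrome? n:put nl
```

With the standard flag values this should display -1 (`TRUE`)
for the first string and 0 (`FALSE`) for the second.
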
# Using The Glossary

The Glossary is a valuable resource. It provides information
on the RETRO words.

## Example Entry

    f:+

      Data:  -
      Addr:  -
      Float: FF-F

    Add two floating point numbers, returning the result.

    Class: class:word | Namespace: f | Interface Layer: rre

    Example #1:

        .3.1 .22 f:+

## Reading The Entry

An entry starts with the word name.

This is followed by the stack effect for each stack. (All RETRO
systems have Data and Address stacks; some also include a
floating point stack.)

The stack effect diagrams are followed by a short description
of the word.

After the description is a line providing some useful data. This
includes the class handler, namespace prefix, and the interface
layer that provides the word.

Words in all systems will be listed as `all`. Some words (like
the `pb:` words) are only on specific systems like iOS. These
can be identified by looking at the interface layer field.

At the end of the entry may be an example or two.

## Access Online

The latest Glossary can be browsed at http://forthworks.com:9999
or gopher://forthworks.com:9999

# Programming Techniques

The upcoming chapters provide helpful information on using RETRO
with different types of data and hints on how to solve problems
in a way consistent with the RETRO system.

# Unu: Simple, Literate Source Files

RETRO is written in a literate style. Most of the sources
are in a format called Unu. This allows easy mixing of
commentary and code blocks, making it simple to document
the code.

As an example,

    # Determine The Average Word Name Length

    To determine the average length of a word name two values
    are needed. First, the total length of all names in the
    Dictionary:

    ~~~
    #0 [ d:name s:length + ] d:for-each
    ~~~

    And then the number of words in the Dictionary:

    ~~~
    #0 [ drop n:inc ] d:for-each
    ~~~

    With these, a simple division is all that's left.

    ~~~
    /
    ~~~

    Finally, display the results:

    ~~~
    'Average_name_length:_%n\n s:format s:put
    ~~~

This illustrates the format. Only code in the fenced blocks
(between `~~~` pairs) gets extracted and run.

(Note: this only applies to *source files*; fences are not used
when entering code interactively.)

# Naming Conventions

Word names in RETRO generally follow these conventions.

## Case

Word names are lowercase, with a dash (-) for compound names.
Variables use TitleCase, with no dash between compound names.
Constants are UPPERCASE, with a dash (-) for compound names.

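
A short sketch of the three case conventions side by side. All
of the names here (`ItemCount`, `MAX-ITEMS`, `reset-item-count`,
`items-remaining`) are made up for the example; `var` and
`const` are assumed to be available as in a standard image.

```
'ItemCount var
#100 'MAX-ITEMS const

:reset-item-count (-)  #0 !ItemCount ;
:items-remaining  (-n) MAX-ITEMS @ItemCount - ;

reset-item-count
items-remaining n:put nl
```
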
## Namespaces

Words are grouped into broad namespaces by attaching a short
prefix string to the start of a name.

The common namespaces are:

| Prefix  | Contains                                               |
| ------- | ------------------------------------------------------ |
| array:  | Words operating on simple arrays                       |
| ASCII:  | ASCII character constants for control characters       |
| buffer: | Words for operating on a simple linear LIFO buffer     |
| c:      | Words for operating on ASCII character data            |
| class:  | Contains class handlers for words                      |
| d:      | Words operating on the Dictionary                      |
| err:    | Words for handling errors                              |
| io:     | General I/O words                                      |
| n:      | Words operating on numeric data                        |
| prefix: | Contains prefix handlers                               |
| s:      | Words operating on string data                         |
| v:      | Words operating on variables                           |
| file:   | File I/O words                                         |
| f:      | Floating Point words                                   |
| gopher: | Gopher protocol words                                  |
| unix:   | Unix system call words                                 |

# Stack Diagrams

Most words in RETRO have a stack comment. These look like:

    (-)
    (nn-n)

As with all comments, a stack comment begins with `(` and
should end with a `)`. There are two parts to the comment.
On the left side of the `-` is what the word *consumes*. On
the right is what it *leaves*.

RETRO uses a short notation, with one character per value
taken or left. In general, the following symbols represent
certain types of values.

    b, n, m, o, x, y, z are generic numeric values
    s represents a string
    v represents a variable
    p, a represent pointers
    q represents a quotation
    d represents a dictionary header
    f represents a `TRUE` or `FALSE` flag.

In the case of something like `(xyz-m)`, RETRO expects z to be
on the top of the stack, with y below it and x below the y
value. And after execution, a single value (m) will be left on
the stack.

Words with no stack effect have a comment of `(-)`.

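
The notation can be seen in practice with a couple of small
definitions. This is an illustrative sketch; `sum-of-squares`
and `greet` are names invented for the example.

```
:sum-of-squares (xy-n) dup * swap dup * + ;
:greet          (-)    'Hello s:put nl ;

#3 #4 sum-of-squares n:put nl
greet
```

Here `sum-of-squares` consumes two numbers and leaves one, so
`#3 #4 sum-of-squares` leaves 25 on the stack, while `greet`
neither takes nor leaves anything.
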
# Word Classes

Word classes are one of the two elements at the heart of
RETRO's interpreter.

There are different types of words in a Forth system. At a
minimum there are data words, regular words, and immediate
words. There are numerous approaches to dealing with this.

In RETRO I define special words which receive a pointer and
decide how to deal with it. These are grouped into a `class:`
namespace.

## How It Works

When a word is found in the dictionary, RETRO will push a
pointer to the definition (the `d:xt` field) to the stack
and then call the word specified by the `d:class` field.

The word called is responsible for processing the pointer
passed to it.

As a simple case, let's look at `immediate` words. These are
words which will always be called when encountered. A common
strategy is to have an immediacy bit which the interpreter
will look at, but RETRO uses a class for this. The class is
defined:

```
:class:immediate (a-) call ;
```

Or consider a normal word. These should be called at interpret
time or compiled into definitions. The handler for this can
look like:

```
:class:word (a-) compiling? [ compile:call ] [ call ] choose ;
```

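
Data words follow the same pattern. The sketch below shows how
a handler for them could be written: at interpret time the
pointer is simply left on the stack, and at compile time it is
compiled as a literal. The names `class:data-sketch` and
`Answer` are hypothetical, and `compile:lit` is assumed to be
available as listed in the Glossary; this is not presented as
the definition used by the actual `class:data`.

```
:class:data-sketch (a-) compiling? [ compile:lit ] if ;

'Answer d:create #42 ,
&class:data-sketch reclass

Answer fetch n:put nl
```
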
## Using Classes

The ability to add new classes is useful. If I wanted to add
a category of word that preserves an input value, I could do
it with a class:

```
:class:duplicating (a-)
  compiling? [ &dup compile:call ] [ &dup dip ] choose
  class:word ;

:duplicating &class:duplicating reclass ;

:. n:put nl ; duplicating
#100 . . .
```

# Working With a Buffer

RETRO provides words for operating on a linear memory area.
This can be useful in building strings or custom data
structures.

## Namespace

Words operating on the buffer are kept in the `buffer:`
namespace.

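
A minimal sketch of building a short string in a buffer. The
individual words used here (`buffer:set`, `buffer:add`,
`buffer:start`, `buffer:size`, `buffer:get`) are not described
in this chapter; they are taken from the Glossary, so check the
entries there for the exact behavior. `Scratch` is a made-up
name for a 100 cell region.

```
'Scratch d:create #100 allot

&Scratch buffer:set
$a buffer:add
$b buffer:add
$c buffer:add

buffer:start s:put nl
buffer:size n:put nl
buffer:get c:put nl
```

Since the buffer is LIFO, `buffer:get` removes and returns the
most recently added value.
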
# Working With Characters

RETRO provides words for working with ASCII characters.

## Prefix

Character constants are returned using the `$` prefix.

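
A few examples of the `$` prefix together with words from the
`c:` namespace. This is a sketch; `c:to-upper`, `c:digit?`, and
`c:vowel?` are assumed from the Glossary rather than described
in this chapter.

```
$a c:put nl
$a c:to-upper c:put nl
$7 c:digit? n:put nl
$e c:vowel? n:put nl
```

The two tests leave flags, so the last two lines should each
display -1 (`TRUE`).
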
# Working With The Dictionary

The Dictionary is a linked list containing the dictionary
headers.

## Namespace

Words operating on the dictionary are in the `d:` namespace.

## Variables

`Dictionary` is a variable holding a pointer to the most recent
header.

## Header Structure

Each entry has the following structure:

    Offset   Contains
    ------   ---------------------------
    0000     Link to Prior Header
    0001     Link to XT
    0002     Link to Class Handler
    0003+    Word name (null terminated)

RETRO provides words for accessing the fields in a portable
manner. It's recommended to use these to allow for future
revision of the header structure.

## Accessing Fields

Given a pointer to a header, you can use `d:xt`, `d:class`,
and `d:name` to access the address of each specific field.
There is no `d:link`, as the link will always be the first
field.

## Shortcuts For The Latest Header

RETRO provides several words for operating on the most recent
header.

`d:last` returns a pointer to the latest header. `d:last<xt>`
will give the contents of the `d:xt` field for the latest
header. There are also `d:last<class>` and `d:last<name>`.

## Adding Headers

Two words exist for making new headers. The easy one is
`d:create`. This takes a string for the name and makes a
new header with the class set to `class:data` and the XT
field pointing to `here`.

Example:

```
'Base d:create
```

The other is `d:add-header`. This takes a string, a pointer
to the class handler, and a pointer for the XT field and
builds a new header using these.

Example:

```
'Base &class:data #10000 d:add-header
```

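
Because `d:create` points the XT field at `here`, a value
stored right after creating the header becomes the data for the
new word. A small sketch, using the made-up name `Limit` and
assuming `,` (store a value at `here` and advance) from the
Glossary:

```
'Limit d:create
#16 ,

Limit fetch n:put nl
```
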
## Searching

RETRO provides two words for searching the dictionary.

`d:lookup` takes a string and tries to find it in the
dictionary. It will return a pointer to the dictionary header
or a value of zero if the word was not found.

`d:lookup-xt` takes a pointer and will return the dictionary
header that has this as the `d:xt` field, or zero if no match
is found.

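
The two searches can be combined with the field accessors. The
following is a sketch for the listener; `no-such-word` is a
placeholder for any name that is not defined.

```
's:reverse d:lookup d:name s:put nl
's:reverse d:lookup d:xt fetch d:lookup-xt d:name s:put nl
'no-such-word d:lookup n:put nl
```

The first line looks up a header by name and displays its name
field. The second fetches the XT from that header and uses
`d:lookup-xt` to get back to the header. The third should
display 0, since the lookup fails.
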
## Iteration

You can use the `d:for-each` combinator to iterate over all
entries in the dictionary. For instance, to display the names
of all words:

```
[ d:name s:put sp ] d:for-each
```

For each entry, this combinator will push a pointer to the
entry to the stack and call the quotation.

## Listing Words

Most Forth systems provide WORDS for listing the names of all
words in the dictionary. RETRO does as well, but this is named
`d:words`.

This isn't super useful as looking through several hundred
names is annoying. RETRO also provides `d:words-with` to help
in filtering the results.

Example:

```
'class: d:words-with
```

# Working With Floating Point

Some RETRO systems include support for floating point numbers.
When present, this is built over the system `libm` using the
C `double` type.

Floating point values are typically 64 bit IEEE 754 double
precision (1 bit for the sign, 11 bits for the exponent, and
the remaining 52 bits for the value), i.e. 15 decimal digits
of precision.

## Prefix

Floating point numbers start with a `.`

Examples:

    Token    Value
    .1       1.0
    .0.5     0.5
    .-.4     -0.4
    .1.3     1.3

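
On a system that includes floating point support, values
entered with the `.` prefix can be combined with words from the
`f:` namespace. A short sketch, assuming `f:*` and `f:put` are
provided alongside `f:+` (see the Glossary entries for your
interface layer):

```
.3.1 .22 f:+ f:put nl
.1.5 .2 f:* f:put nl
```

The operands and results live on the floating point stack, not
the data stack.
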
# Working With Numbers

Numbers in RETRO are signed, 32 bit integers with a range of
-2,147,483,648 to 2,147,483,647.

## Token Prefix

All numbers start with a `#` prefix.

## Namespace

Most words operating on numbers are in the `n:` namespace.

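
Some basic arithmetic at the listener, as a sketch. The
operators `+`, `-`, `*`, `/`, and `mod` are among the words
that sit outside the `n:` namespace; `n:abs` and `n:put` are
assumed from the Glossary.

```
#7 #3 + n:put nl
#7 #3 - n:put nl
#7 #3 * n:put nl
#7 #3 / n:put nl
#7 #3 mod n:put nl
#-5 n:abs n:put nl
```

Division is integer division, so `#7 #3 /` leaves 2 and
`#7 #3 mod` leaves the remainder, 1.
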
# Working With Pointers

## Prefix

Pointers are returned by the `&` prefix.

## Examples

```
'Base var
&Base fetch
#10 &Base store

#10 &n:inc call
```

## Notes

The use of `&` to get a pointer to a data structure (with a
word class of `class:data`) is not required. I like to use it
anyway as it makes my intent a little clearer.

Pointers are useful with combinators. Consider:

```
:abs dup n:negative? [ n:negate ] if ;
```

Since the target quote body is a single word, it is more
efficient to use a pointer instead:

```
:abs dup n:negative? &n:negate if ;
```

The advantages are speed (saves a level of call/return by
avoiding the quotation) and size (for the same reason).
This may be less readable though, so consider the balance
of performance to readability when using this approach.

# Working With Strings

Strings in RETRO are NULL terminated sequences of values
representing characters. Being NULL terminated, they can't
contain a NULL (ASCII 0).

The character words in RETRO are built around ASCII, but
strings can contain UTF-8 encoded data if the host platform
allows. Words like `s:length` will return the number of bytes,
not the number of logical characters in this case.

## Prefix

Strings begin with a single `'`.

```
'Hello
'This_is_a_string
'This_is_a_much_longer_string_12345_67890_!!!
```

RETRO will replace underscores with spaces. If you need both
spaces and underscores in a string, escape the underscores and
use `s:format`:

```
'This_has_spaces_and_under\_scored_words. s:format
```

## Lifetime

At the interpreter, strings get allocated in a rotating buffer.
This is used by the words operating on strings, so if you need
to keep them around, use `s:keep` or `s:copy` to move them to
more permanent storage.

In a definition, the string is compiled inline and so is in
permanent memory.

You can manually manage the string lifetime by using `s:keep`
to place it into permanent memory or `s:temp` to copy it to
the rotating buffer.

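
As a sketch of the lifetime rules, a string entered at the
interpreter can be moved to permanent memory with `s:keep`
before its address is stored in a variable. `Greeting` is a
name made up for the example.

```
'Greeting var
'Hello,_world! s:keep !Greeting

@Greeting s:put nl
```

Without the `s:keep`, the variable would point into the
rotating buffer and the text could be overwritten by later
string operations.
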
## Mutability

Strings are mutable. If you need to ensure that a string is
not altered, make a copy before operating on it or see the
individual glossary entries for notes on words that may do
this automatically.

## Searching

RETRO provides four words for searching within a string.

- `s:contains-char?`
- `s:contains-string?`
- `s:index-of`
- `s:index-of-string`

## Comparisons

- `s:eq?`
- `s:case`

## Extraction

- `s:left`
- `s:right`
- `s:substr`

## Joining

- `s:append`
- `s:prepend`

## Tokenization

- `s:tokenize`
- `s:tokenize-on-string`
- `s:split`
- `s:split-on-string`

## Conversions

- `s:to-lower`
- `s:to-upper`
- `s:to-number`

## Cleanup

- `s:chop`
- `s:trim`
- `s:trim-left`
- `s:trim-right`

## Combinators

- `s:for-each`
- `s:filter`
- `s:map`

## Other

- `s:evaluate`
- `s:copy`
- `s:reverse`
- `s:hash`
- `s:length`
- `s:replace`
- `s:format`
- `s:empty`

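
A handful of the words listed above in use. This is a sketch
meant for the listener; the stack effects are assumed from the
Glossary, so consult the individual entries before relying on
them.

```
'hello_world s:length n:put nl
'hello_world s:to-upper s:put nl
'hello_ 'world s:append s:put nl
'hello_world #5 s:left s:put nl
'hello_world $w s:index-of n:put nl
'hello_world 'world s:contains-string? n:put nl
'hello_world [ c:vowel? ] s:filter s:put nl
```

Each line starts from a fresh string, so the temporary results
in the rotating buffer never need to be kept.
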
# The Return Stack

RETRO has two stacks. The primary one is used to pass data
between words. The second one primarily holds return addresses.

Each time a word is called, the next address is pushed to
the return stack.

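
The second stack can also hold a value temporarily within a
definition. The sketch below assumes the `push` and `pop` words
from the Glossary, which move a value to and from the return
stack; `under-inc` is a made-up name. Anything pushed must be
popped again before the word returns, or the return address
will be wrong.

```
:under-inc (xy-xy) push n:inc pop ;

#1 #2 under-inc n:put sp n:put nl
```

Here the top value (2) is set aside while the value below it
is incremented, so both displayed values should be 2.
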
# Working With Assembly Language

RETRO runs on a virtual machine called Nga. It provides a
standard assembler for this called *Muri*.

Muri is a simple, multipass model that's not fancy, but
suffices for RETRO's needs.

## Assembling A Standalone File

A small example (*test.muri*):

~~~
i liju....
r main
: c:put
i liiire..
d 0
: main
i lilica..
d 97
r c:put
i liju....
r main
~~~

Assembling it:

    retro-muri test.muri

Breaking this down: Muri extracts the assembly code blocks to
assemble, then proceeds to do the assembly. Each source line
starts with a directive, followed by a space, and then a
value.

The directives are:

    :    value is a label
    i    value is an instruction bundle
    d    value is a numeric value
    r    value is a reference
    s    value is a string to inline

Instructions for Nga are provided as bundles. Each memory
location can store up to four instructions. And each instruction
gets a two character identifier.

From the list of instructions:

    0 nop   5 push  10 ret   15 fetch 20 div   25 zret
    1 lit   6 pop   11 eq    16 store 21 and   26 end
    2 dup   7 jump  12 neq   17 add   22 or    27 ienum
    3 drop  8 call  13 lt    18 sub   23 xor   28 iquery
    4 swap  9 ccall 14 gt    19 mul   24 shift 29 iinvoke

This reduces to:

    0 ..    5 pu    10 re    15 fe    20 di    25 zr
    1 li    6 po    11 eq    16 st    21 an    26 en
    2 du    7 ju    12 ne    17 ad    22 or    27 ie
    3 dr    8 ca    13 lt    18 su    23 xo    28 iq
    4 sw    9 cc    14 gt    19 mu    24 sh    29 ii

Most are just the first two letters of the instruction name. I
use `..` instead of `no` for `NOP`, and for the I/O instructions
I use `i` followed by the first letter of the operation (`ie`,
`iq`, `ii`). So a bundle may look like:

    dumure..

(This would correspond to `dup multiply return nop`).

## Runtime Assembler

RETRO also has a runtime variation of Muri that can be used
when you need to generate more optimal code. So one can write:

    :n:square dup * ;

Or:

    :n:square as{ 'dumure.. i }as ;

The second one will be faster, as the entire definition is one
bundle, which reduces memory reads and decoding by 2/3.

Doing this is less readable, so I only recommend doing so after
you have finalized working RETRO level code and determined the
best places to optimize.

The runtime assembler has the following directives:

    i    value is an instruction bundle
    d    value is a numeric value
    r    value is a reference

Additionally, in the runtime assembler the order is reversed,
with the value coming before the directive:

    'dudumu.. i

Instead of:

    i dudumu..

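
Following the same pattern as `n:square`, a word that doubles a
number could be sketched as a single bundle of `dup`, `add`,
and `return`. The name `n:double` is made up for the example,
and the code assumes the `as{` ... `}as` syntax behaves as
shown above.

```
:n:double (n-n) as{ 'duadre.. i }as ;

#21 n:double n:put nl
```

As with any hand-assembled word, it is worth comparing the
result against a plain definition such as `dup +` before
relying on it.
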
# Internals

The next few chapters dive into RETRO's architecture. If you
seek to implement a port to a new platform or to extend the
I/O functionality you'll find helpful information here.

# Internals: Nga Virtual Machine

## Overview

At the heart of RETRO is a simple MISC (minimal instruction
set computer) processor for a dual stack architecture.

This is a very simple and straightforward system. There are
30 instructions. The memory is a linear array of signed 32
bit values. And there are two stacks: one for data and one
for return addresses.

## Instruction Table

Column:

    0 - opcode value
    1 - Muri assembly name
    2 - Full name
    3 - Data Stack Usage
    4 - Address Stack Usage

    +--------------------------------------------------+
    |  0  ..  nop                   -         -        |
    |  1  li  lit                   -n        -        |
    |  2  du  dup                   n-nn      -        |
    |  3  dr  drop                  n-        -        |
    |  4  sw  swap                  xy-yx     -        |
    |  5  pu  push                  n-        -n       |
    |  6  po  pop                   -n        n-       |
    |  7  ju  jump                  a-        -        |
    |  8  ca  call                  a-        -A       |
    |  9  cc  conditional call      af-       -A       |
    | 10  re  return                -         A-       |
    | 11  eq  equality              xy-f      -        |
    | 12  ne  inequality            xy-f      -        |
    | 13  lt  less than             xy-f      -        |
    | 14  gt  greater than          xy-f      -        |
    | 15  fe  fetch                 a-n       -        |
    | 16  st  store                 na-       -        |
    | 17  ad  addition              xy-n      -        |
    | 18  su  subtraction           xy-n      -        |
    | 19  mu  multiplication        xy-n      -        |
    | 20  di  divide & remainder    xy-rq     -        |
    | 21  an  bitwise and           xy-n      -        |
    | 22  or  bitwise or            xy-n      -        |
    | 23  xo  bitwise xor           xy-n      -        |
    | 24  sh  shift                 xy-n      -        |
    | 25  zr  zero return           n-n | n-  -        |
    | 26  en  end                   -         -        |
    | 27  ie  i/o enumerate         -n        -        |
    | 28  iq  i/o query             n-xy      -        |
    | 29  ii  i/o invoke            ...n-     -        |
    |                                                  |
    | Each `li` will push the value in the following   |
    | cell to the data stack.                          |
    +--------------------------------------------------+
    |            li       du       mu       ..         |
    | i lidumu.. 00000001:00000010:00010011:00000000   |
    |            data for li                           |
    | d 2        00000000:00000000:00000000:00000010   |
    |                                                  |
    | Assembler Directives       Instruction Bundles   |
    | ========================   ====================  |
    | : label                    Combine instruction   |
    | i bundle                   names in groups of 4  |
    | d numeric-data                                   |
    | r ref-to-address-by-name   Use only .. after     |
    | s null-terminated string   ju, ca, cc, re, zr    |
    +--------------------------------------------------+

## Misc

With 30 instructions and four slots per bundle, there are
810,000 (30 * 30 * 30 * 30) possible combinations of
instructions. Only 73 are used in the implementation of RETRO.

# Internals: I/O

RETRO provides three words for interacting with I/O. These are:

    io:enumerate    returns the number of attached devices
    io:query        returns information about a device
    io:invoke       invokes an interaction with a device

As an example, with an implementation providing an output source,
a block storage system, and keyboard:

    io:enumerate    will return `3` since there are three
                    i/o devices
    #0 io:query     will return 0 0, since the first device
                    is a screen (type 0) with a version of 0
    #1 io:query     will return 1 3, since the second device is
                    block storage (type 3), with a version of 1
    #2 io:query     will return 0 1, since the last device is a
                    keyboard (type 1), with a version of 0

In this case, some interactions can be defined:

    :c:put #0 io:invoke ;
    :c:get #2 io:invoke ;

Set up the stack, push the device ID, and then use `io:invoke`
to invoke the interaction.

A RETRO system requires one I/O device (a generic output for a
single character). This must be the first device, and must have
a device ID of 0.

All other devices are optional and can be specified in any
order.

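
The enumerate and query words can be combined to list whatever
devices a given system provides. The following is a sketch:
`Device` and `io:list` are names made up for the example, and
`times` and `v:inc` are assumed from the Glossary. Following
the description above, `io:query` leaves the version with the
type on top, so each line shows the type first and then the
version.

```
'Device var

:io:list (-)
  #0 !Device
  io:enumerate
  [ @Device io:query n:put sp n:put nl &Device v:inc ] times ;

io:list
```

The output depends entirely on the interface layer in use, so
no particular listing should be expected.
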
# Internals: Interface Layers

Nga provides a virtual processor and an extensible way of adding
I/O devices, but does not provide any I/O itself. Adding I/O is
the responsibility of the *interface layer*.

An interface layer will wrap Nga, providing at least one I/O
device (a generic output target), and a means of interacting
with the *retro image*.

It's expected that this layer will be host specific, adding any
system interactions that are needed via the I/O instructions.
The image will typically be extended with words to use these.

# Internals: The Retro Image

The actual RETRO language is stored as a memory image for Nga.

## Layout

Assuming an Nga built with 524287 cells of memory:

| RANGE           | CONTAINS                     |
| --------------- | ---------------------------- |
| 0 - 1024        | rx kernel                    |
| 1025 - 1535     | token input buffer           |
| 1536 +          | start of heap space          |
| ............... | free memory for your use     |
| 506879          | buffer for string evaluate   |
| 507904          | temporary strings (32 * 512) |
| 524287          | end of memory                |

The buffers at the end of memory will resize when specific
variables related to them are altered.