check in some short articles on design and reflections on historic retro systems
FossilOrigin-Name: eb357b5905393a162430aaaa4f8d4c3803a060ee6715e899efa3d3973a6bdc1e
parent
8e45d7e6e0
commit
ab027a48a7
6 changed files with 463 additions and 0 deletions
77
doc/papers/Notes_on_Metacompilation_and_Assembly.txt
Normal file
@ -0,0 +1,77 @@
Retro 10 and 11 were written in themselves using a metacompiler.
I had been fascinated by this idea for a long time and was able
to explore it heavily. While I still find it to be a good idea,
the way I ended up doing it was problematic.

The biggest issue I faced was that I wanted to do this in one
step, where loading the Retro source would create a new image
in place of the old one, switch to the new one, and then load
the higher level parts of the language over this. In retrospect,
this was a really bad idea.

My earlier design for Retro was very flexible. I allowed almost
everything to be swapped out or extended at any time. This made
it extremely easy to customize the language and environment, but
made it crucial to keep track of what was in memory and what had
been patched so that the metacompiler wouldn't refer to anything
in the old image during the relocation and control change. It
was far too easy to make a mistake, discover that elements of
the new image were broken, and then have to go and revert many
changes to try to figure out what went wrong.

This was also complicated by the fact that I built new images
as I worked, and, while a new image could be built from the last
built one, it wasn't always possible to build a new image from
the prior release version. (Actually, it was often worse - I
failed to check in every change as I went, so often even the
prior commits couldn't rebuild the latest images.)

For Retro 12 I wanted to avoid this problem, so I decided to go
back to writing the kernel ("Rx") in assembly. I actually wrote
a Machine Forth dialect to generate the initial assembly, before
eventually hand tuning the final result to its current state.

I could (and likely will eventually) write the assembler in
Retro, but the current one is in C, and is built as part of the
standard toolchain.

My VM actually has two assemblers. The older one is Naje. This
was intended to be fairly friendly to work with, and handles
many of the details of packing instructions for the user. Here
is an example of a small program in it:

    :square
      dup
      mul
      ret
    :main
      lit 35
      lit &square
      call
      lit &square
      call
      end

The other assembler is Muri. This is a far more minimalistic
assembler, but I've actually grown to prefer it. The above
example in Muri would become:

    i liju....
    r main
    : square
    i dumure..
    : main
    i lilica..
    d 35
    r square
    i en......

In Muri, each instruction is reduced to two characters, and the
packing is written out explicitly as instruction bundles (lines
starting with `i`). This is less readable if you aren't very
familiar with Nga's assembly and packing rules, but allows a
very quick, efficient way of writing assembly for those who are.
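
As a rough illustration of what an `i` line stands for, here is a
minimal Python sketch that packs one bundle into a single 32-bit
cell. The opcode numbers and the "first instruction in the lowest
byte" ordering are assumptions made for this sketch, not something
stated in these notes; `pack_bundle` and `unpack_cell` are
hypothetical helper names.

```python
# Two-character instruction names mapped to opcode numbers.
# ASSUMPTION: this numbering is illustrative; check it against the
# Nga documentation before relying on it.
OPCODES = {
    "..": 0, "li": 1, "du": 2, "dr": 3, "sw": 4, "pu": 5, "po": 6,
    "ju": 7, "ca": 8, "cc": 9, "re": 10, "eq": 11, "ne": 12, "lt": 13,
    "gt": 14, "fe": 15, "st": 16, "ad": 17, "su": 18, "mu": 19, "di": 20,
    "an": 21, "or": 22, "xo": 23, "sh": 24, "zr": 25, "en": 26,
}


def pack_bundle(bundle):
    """Pack an 8-character bundle (four 2-character names) into one cell."""
    if len(bundle) != 8:
        raise ValueError("a bundle is four two-character instruction names")
    cell = 0
    for slot in range(4):
        name = bundle[slot * 2:slot * 2 + 2]
        # ASSUMPTION: slot 0 occupies the lowest byte of the cell.
        cell |= OPCODES[name] << (8 * slot)
    return cell


def unpack_cell(cell):
    """Inverse of pack_bundle: recover the four instruction names."""
    names = {number: name for name, number in OPCODES.items()}
    return [names[(cell >> (8 * slot)) & 0xFF] for slot in range(4)]


if __name__ == "__main__":
    for bundle in ("liju....", "dumure..", "lilica..", "en......"):
        print(bundle, pack_bundle(bundle))
```

Unused slots are filled with `..`, which is why short bundles such
as `en......` still occupy a full cell.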

I eventually rewrote the kernel in the Muri style as it's what
I prefer, and since there's not much need to make changes in it.
45
doc/papers/Notes_on_Prefixes_as_a_Language_Element.txt
Normal file
@ -0,0 +1,45 @@
A big change in Retro 12 was the elimination of the traditional
parser from the language. This was a sacrifice due to the lack
of an I/O model. Retro has no way to know *how* input is given
to the `interpret` word, or whether anything else will ever be
passed into it.

And so `interpret` operates only on the current token. The core
language does not track what came before or attempt to guess at
what might come in the future.

This leads into the prefixes. Retro 11 had a complicated system
for prefixes, with different types of prefixes for words that
parsed ahead (e.g., strings) and words that operated on the
current token (e.g., `@`). Retro 12 eliminates all of these in
favor of just having a single prefix model.

The first thing `interpret` does is look to see if the first
character in a token matches a `prefix:` word. If it does, it
passes the rest of the token as a string pointer to the prefix
specific handler to deal with. If there is no valid prefix
found, it tries to find the token in the dictionary. Assuming
that it finds the word, it passes the `d:xt` field to the handler
that `d:class` points to. Otherwise it calls `err:notfound`.
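
The dispatch order described above can be sketched in a few lines
of Python. This is only a model of the control flow, not the Rx
implementation: the dictionary is a plain mapping from a name to a
(class handler, xt) pair, and `make_interpreter`, `class_word`, and
the sample prefix handlers are hypothetical names chosen for the
example.

```python
def make_interpreter(prefixes, dictionary, not_found):
    """Build an interpret(token) function that follows the order above.

    prefixes   -- maps a single character to a handler taking the rest
                  of the token
    dictionary -- maps a word name to a (class_handler, xt) pair
    not_found  -- called with the token when nothing matches
    """
    def interpret(token):
        if not token:
            return
        handler = prefixes.get(token[0])
        if handler is not None:
            # A prefix always wins, which is why word names cannot
            # reliably begin with a prefix character.
            handler(token[1:])
            return
        entry = dictionary.get(token)
        if entry is not None:
            class_handler, xt = entry
            class_handler(xt)
            return
        not_found(token)

    return interpret


# A tiny demonstration with a data stack held in a Python list.
stack = []
missing = []


def class_word(xt):
    """Stand-in for a class handler: simply run the word."""
    xt()


prefixes = {
    "#": lambda rest: stack.append(int(rest)),
    "'": lambda rest: stack.append(rest.replace("_", " ")),
}
dictionary = {
    "dup": (class_word, lambda: stack.append(stack[-1])),
    "*": (class_word, lambda: stack.append(stack.pop() * stack.pop())),
}

interpret = make_interpreter(prefixes, dictionary, missing.append)
for token in "#35 dup * 'hello_world! frob".split():
    interpret(token)
```

Note that each call sees exactly one token, with no lookahead,
which matches the single-token model described earlier.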

This has an important implication: *words can not reliably
have names that start with a prefix character.*

It also simplifies things. Anything that would normally parse
becomes a prefix handler. So creating a new word? Use the `:`
prefix. Strings? Use `'`. Pointers? Try `&`. And so on. E.g.,

    In ANS                  | In Retro
    : foo ... ;             | :foo ... ;
    ' foo                   | &foo
    : bar ... ['] foo ;     | :bar ... &foo ;
    s" hello world!"        | 'hello_world!

If you are familiar with ColorForth, prefixes are a similar
idea to colors, but can be defined by the user as normal words.

After doing this for quite a while I rather like it. I can see
why Chuck Moore eventually went towards ColorForth as using
color (or prefixes in my case) does simplify the implementation
in many ways.
72
doc/papers/Notes_on_the_Evolution_of_Ngaro_to_Nga.txt
Normal file
@ -0,0 +1,72 @@
When I decided to begin work on what became Retro 12, I knew
the process would involve updating Ngaro, the virtual machine
that Retro 10 and 11 ran on.

Ngaro rose out of an earlier experimental virtual machine I had
written back in 2005-2006. This earlier VM, called Maunga, was
very close to what Ngaro ended up being, though it had a very
different approach to I/O. (All I/O in Maunga was intended to be
memory mapped; Ngaro adopted a port based I/O system).

Ngaro itself evolved along with Retro, gaining features like
automated skipping of NOPs and a LOOP opcode to help improve
performance. But the I/O model proved to be a problem. When I
created Ngaro, I had the idea that I would always be able to
assume a console/terminal style environment. The assumption was
that all code would be entered via the keyboard (or maybe a
block editor), and that proved to be the fundamental flaw as
time went on.

As Retro grew it was evident that the model had some serious
problems. Need to load code from a file? The VM and language had
functionality to pretend it was being typed in. Want to run on
something like a browser, Android, or iOS? The VM would need to
be implemented in a way that simulates input being typed into
the VM via a simulated keyboard. And Retro was built around this.
I couldn't change it because of a promise to maintain, as much
as possible, source compatibility for a period of at least five
years.

When the time came to fix this, I decided at the start to keep
the I/O model separate from the core VM. I also decided that the
core Retro language would provide some means of interpreting
code without requiring an assumption that a traditional terminal
was being used.

So Nga began. I took the opportunity to simplify the instruction
set to just 26 essential instructions, add support for packing
multiple instructions per memory location (allowing a long
overdue reduction in memory footprint), and to generally just
make a far simpler design.

I've been pleased with Nga. On its own it really isn't useful
though. So with Retro I embed it into a larger framework that
adds some basic I/O functionality. The *interfaces* handle the
details of passing tokens into the language and capturing any
output. They are free to do this in whatever model makes most
sense on a given platform.

So far I've implemented:

- a scripting interface, reading input from a file and
  offering file I/O, gopher, and reading from stdin, and
  sending output to stdout.
- an interactive interface, built around ncurses, reading
  input from stdin, and displaying output to a scrolling
  buffer.
- an iOS interface, built around a text editor, directing
  output to a separate interface pane.
- an interactive block editor, using a gopher-based block
  data store. Output is displayed to stdout, and input is
  done via the blocks being evaluated or by reading from
  stdin.

In all cases, the only common I/O word that has to map to an
exposed instruction is `putc`, to display a single character to
some output device. There is no requirement for a traditional
keyboard input model.

By doing this I was able to solve the biggest portability issue
with the Retro 10/11 model, and make a much simpler, cleaner
language in the end.
64
doc/papers/Notes_on_the_Kernel_Wordset.txt
Normal file
@ -0,0 +1,64 @@
In implementing the Retro 12 kernel (called Rx) I had to decide
on what functionality would be needed. It was important to me
that this be kept clean and minimalistic, as I didn't want to
spend a lot of time changing it as time progressed. It's far
nicer to code at the higher level, where the Retro language is
functional, as opposed to writing more assembly code.

So what made it in?

Primitives

These are words that map directly to Nga instructions.

    dup drop swap call eq? -eq? lt? gt?
    fetch store + - * /mod and or
    xor shift push pop 0;

Memory

    fetch-next store-next , s,

Strings

    s:to-number s:eq? s:length

Flow Control

    choose if -if repeat again

Compiler & Interpreter

    Compiler Heap ; [ ] Dictionary
    d:link d:class d:xt d:name d:add-header
    class:word class:primitive class:data class:macro
    prefix:: prefix:# prefix:& prefix:$
    interpret d:lookup err:notfound

I *could* slightly reduce this. The $ prefix could be defined in
higher level code, and I don't strictly *need* to expose the
`fetch-next` and `store-next` here. But since they are already
implemented as dependencies of the words in the kernel, it would
be a bit wasteful to redefine them later in higher level code.

With these words the rest of the language can be built up. Note
that the Rx kernel does not provide any I/O words. It's assumed
that the Retro interfaces will add these as best suited for the
systems they run on.

There is another small bit. All images start with a few key
pointers in fixed offsets of memory. These are:

| Offset | Contains                    |
| ------ | --------------------------- |
| 0      | lit call nop nop            |
| 1      | Pointer to main entry point |
| 2      | Dictionary                  |
| 3      | Heap                        |
| 4      | Retro version identifier    |

An interface can use the dictionary pointer and knowledge of the
dictionary format for a specific Retro version to identify the
location of essential words like `interpret` and `err:notfound`
when implementing the user facing interface.
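
To make that last step concrete, here is a hedged Python sketch of
how an interface might read the fixed pointers and then walk the
dictionary to find a word. Several things here are assumptions for
illustration only: that an image file is a flat array of 32-bit
little-endian signed cells, that a dictionary header is laid out as
link, xt, class, then a zero-terminated name with one character per
cell, and that a link of 0 ends the chain. The helper names and the
synthetic demo image are hypothetical.

```python
import struct

# ASSUMPTION: field offsets inside a dictionary header. The real
# layout depends on the Retro version in use.
D_LINK, D_XT, D_CLASS, D_NAME = 0, 1, 2, 3


def load_cells(image_bytes):
    """Decode an image as 32-bit little-endian signed cells (assumed)."""
    count = len(image_bytes) // 4
    return list(struct.unpack("<%di" % count, image_bytes[:count * 4]))


def read_header(cells):
    """Return the fixed pointers stored at offsets 1 through 4."""
    return {"entry": cells[1], "dictionary": cells[2],
            "heap": cells[3], "version": cells[4]}


def find_word(cells, name):
    """Walk the linked dictionary; return the header address, or 0."""
    header = cells[2]
    while header != 0:
        chars = []
        at = header + D_NAME
        while cells[at] != 0:
            chars.append(chr(cells[at]))
            at += 1
        if "".join(chars) == name:
            return header
        header = cells[header + D_LINK]
    return 0


def demo_image():
    """Build a tiny synthetic image with two dictionary entries."""
    cells = [0] * 64

    def add(at, link, xt, name):
        cells[at:at + 3] = [link, xt, 0]
        for i, ch in enumerate(name):
            cells[at + D_NAME + i] = ord(ch)
        return at

    first = add(10, 0, 5, "interpret")
    second = add(30, first, 7, "err:notfound")
    # Offsets 1..4: entry point, dictionary, heap, made-up version id.
    cells[1:5] = [5, second, 50, 201901]
    return struct.pack("<%di" % len(cells), *cells)


if __name__ == "__main__":
    cells = load_cells(demo_image())
    print(read_header(cells))
    print(cells[find_word(cells, "interpret") + D_XT])
```

An interface built this way only needs the version identifier at
offset 4 to decide which header layout to apply.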
79
doc/papers/Retro_11_(2011-2019):_A_Look_Back.md
Normal file
@ -0,0 +1,79 @@
# Retro 11 (2011 - 2019): A Look Back

So it's now been about five years since the last release of
Retro 11. While I still see some people obtaining and using it,
I've moved on to the twelfth generation of Retro. It's time for
me to finally retire Retro 11.

As I prepare to do so, I thought I'd take a brief look back.

Retro 11 began life in 2011. It grew out of Retro 10, which
was the first version of Retro to not be written in x86 assembly
language. For R10 and R11, I wrote a portable virtual machine
(with numerous implementations) and the Forth dialect was kept
in an image file which ran on the VM.

Retro 10 worked, but was always a bit too sloppy and changed
drastically between releases. The major goal of Retro 11 was
to provide a stable base for a five year period. In retrospect,
this was mostly achieved. Code from earlier releases normally
needed only minor adjustments to run on later releases, though
newer releases added significantly to the language.

There were seven releases.

- Release 11.0: 2011, July
- Release 11.1: 2011, November
- Release 11.2: 2012, January
- Release 11.3: 2012, March
- Release 11.4: 2012, July
- Release 11.5: 2013, March
- Release 11.6: 2014, August

Development was fast until 11.4. This was the point at which
I had to slow down due to RSI problems. It was also the point
at which I started experiencing some problems with the
metacompiler (as discussed previously).

Retro 11 was flexible. All colon definitions were set up as
hooks, allowing new functionality to be layered in easily. This
allowed the later releases to add things like vocabularies,
search order, tab completion, and keyboard remapping. This all
came at a cost though: later things could use the hooks to alter
behavior of existing words, so it was necessary to use a lot of
caution to ensure that the layers didn't break the earlier code.

The biggest issue was the I/O model. Retro 11 and the Ngaro VM
assumed the existence of a console environment. All input was
required to be entered at the keyboard, and all output was to be
shown on screen. This caused some problems. Including code from
a file required some tricks, temporarily rewriting the keyboard
input function to read from the file. It also became a major
issue when I wrote the iOS version. The need to simulate the
keyboard and console complicated everything and I had to spend
a considerable amount of effort to deal with battery performance
resulting from the I/O polling and wait states.

But on the whole it worked well. I used Retro 11.6 until I
started work on Retro 12 in late 2016, and continued running
some tools written in R11 until the first quarter of last year.

The final image file was 23,137 cells (92,548 bytes). This was
bloated by keeping some documentation (stack comments and short
descriptions) in the image, which started in 11.4. The image
contained 269 words.

I used Retro 11 for a wide variety of tasks. A small selection
of things that were written includes:

- a pastebin
- front end to ii (irc client)
- small explorations of interactive fiction
- irc log viewer
- tool to create html from templates
- tool to automate creation of an SVCD from a set of photos
- tools to generate reports from data sets for my employer

In the end, I'm happy with how Retro 11 turned out. I made some
mistakes in embracing too much complexity, but despite this it
was a successful system for many years.
126
doc/papers/The_Path_to_Self_Hosting.md
Normal file
@ -0,0 +1,126 @@
# The Path to Self Hosting

Retro is an image based Forth system running on a lightweight
virtual machine. This is the story of how that image is made.

The first Retro to use an image based approach was Retro 10.
The earliest images were built using a compiler written in
Toka, an earlier experimental stack language I had written.
It didn't take long to want to drop the dependency on Toka,
so I rewrote the image compiler in Retro and then began
development at a faster pace.

Retro 11 was built using the last Retro 10 image and an
evolved version of the metacompiler. This worked well, but
I eventually found it to be problematic.

One of the issues I faced was the inability to make a new
image from the prior stable release. Since I develop and
test changes incrementally, I reached a point where the
current metacompiler and image required each other. This
wasn't a fatal flaw, but it was annoying.

Perhaps more critical was the fragility of the system. In
R11 small mistakes could result in a corrupt image. The test
suite helped identify some of these, but there were a few
times I was forced to dig back through the version control
history to recover a working image.

The fragile nature was amplified by some design decisions.
In R11, after the initial kernel was built, it would be
moved to memory address 0, then control would jump into the
new kernel to finish building the higher level parts.

Handling this was a tricky task. In R11 almost everything
could be revectored, so the metacompiler had to ensure that
it didn't rely on anything in the old image during the move.
This caused a large number of issues over R11's life.

So on to Retro 12. I decided that this would be different.
First, the kernel would be assembly, with an external tool
to generate the core image. The kernel is in `Rx.md` and the
assembler is `Muri`. To load the standard library, I wrote a
second tool, `retro-extend`. This separation has given me
far fewer headaches as I can make changes more easily and
rebuild from scratch when necessary.

But I miss self-hosting. So last fall I decided to resolve
this. And today I'm pleased to say that it is now done.

There are a few parts to this.

**Unu**. I use a Markdown variation with fenced code blocks.
The tool I wrote in C to extract these is called `unu`. For
a self hosting Retro, I rewrote this as a combinator that
reads in a file and runs another word against each line in the
file. So I could display the code block contents by doing:

    'filename [ s:put nl ] unu

This made it easier to implement the other tools.
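
For readers who don't know Retro, the same combinator shape can be
sketched in Python: a function that walks the lines of a document
and calls a supplied action on each line inside a fenced block. The
`~~~` fence marker is an assumption for this sketch, and `unu` and
`unu_file` here are stand-in names, not the actual tool.

```python
def unu(lines, action, fence="~~~"):
    """Call action(line) for every line inside a fenced code block.

    ASSUMPTION: a block starts and ends with a line consisting of the
    fence marker; the fence lines themselves are not passed on.
    """
    in_block = False
    for line in lines:
        text = line.rstrip("\n")
        if text.strip() == fence:
            in_block = not in_block
        elif in_block:
            action(text)


def unu_file(filename, action, fence="~~~"):
    """Convenience wrapper: run unu over the lines of a file."""
    with open(filename, encoding="utf-8") as handle:
        unu(handle, action, fence)


SAMPLE = [
    "# A heading\n",
    "Some commentary that should be skipped.\n",
    "~~~\n",
    ":square dup * ;\n",
    "~~~\n",
    "More commentary.\n",
    "~~~\n",
    "#35 square\n",
    "~~~\n",
]

if __name__ == "__main__":
    unu(SAMPLE, print)
```

Passing `print` as the action mirrors the `[ s:put nl ]` quotation
above; an assembler or loader would pass its own line handler.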

**Muri**. This is my assembler. It's minimalistic, fast, and
works really well for my purposes. Retro includes a runtime
version of this (using `as{`, `}as`, `i`, `d`, and `r`), so
all I needed for this was to write a few words to parse the
lines and run the corresponding runtime words. As with the C
version, this is a two pass assembler.
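
The reason for two passes is that a reference (`r`) may name a
label defined later in the source. A minimal Python sketch of the
idea follows: the first pass only records label addresses, and the
second pass emits cells. The directive letters come from the Muri
examples in these notes; the opcode numbers and byte ordering are
assumptions for illustration, and `assemble` is a hypothetical
function, not the real tool.

```python
# ASSUMPTION: illustrative opcode numbers for the handful of
# instructions used in the example program.
OPCODES = {"..": 0, "li": 1, "du": 2, "ju": 7, "ca": 8,
           "re": 10, "mu": 19, "en": 26}


def pack(bundle):
    """Pack four two-character names into one cell, first in low byte."""
    return sum(OPCODES[bundle[i:i + 2]] << (4 * i) for i in range(0, 8, 2))


def assemble(source):
    """Assemble Muri-style source text into a list of cell values."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]

    # Pass 1: every line except a label occupies one cell, so the
    # address of each label can be computed without emitting anything.
    labels, here = {}, 0
    for ln in lines:
        kind, _, arg = ln.partition(" ")
        if kind == ":":
            labels[arg.strip()] = here
        else:
            here += 1

    # Pass 2: emit cells, resolving references via the label table.
    cells = []
    for ln in lines:
        kind, _, arg = ln.partition(" ")
        arg = arg.strip()
        if kind == "i":
            cells.append(pack(arg))
        elif kind == "d":
            cells.append(int(arg))
        elif kind == "r":
            cells.append(labels[arg])
        elif kind != ":":
            raise ValueError("unknown directive: " + ln)
    return cells


SOURCE = """
i liju....
r main
: square
i dumure..
: main
i lilica..
d 35
r square
i en......
"""

if __name__ == "__main__":
    print(assemble(SOURCE))
```

Here `r main` on the second line is a forward reference: it can
only be resolved because the first pass has already seen `: main`.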

Muri generates a new `ngaImage` with the kernel. To create a
full image I needed a way to load in the standard library and
I/O extensions.

This is handled by **retro-extend**. This is where it gets
more complex. I implemented the Nga virtual machine in Retro
to allow this to run the new image in isolation from the
host image. The new ngaImage is loaded, the interpreter is
located, and each token is passed to the interpreter. Once
done, the new image is written to disk.
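
The isolation comes from the guest image living in its own block
of memory, with its own stacks, inside the host. The following
Python sketch shows that shape with a toy subset of instructions.
It is not Nga: the opcode numbers, the byte ordering within a cell,
and the call/return conventions are assumptions chosen so the small
example program runs, and `run` is a hypothetical function.

```python
# ASSUMPTION: illustrative opcode numbers for a toy subset.
NOP, LIT, DUP, JUMP, CALL, RET, MUL, END = 0, 1, 2, 7, 8, 10, 19, 26


def run(memory, max_steps=10000):
    """Execute packed cells from address 0; return the data stack.

    The guest's memory and both stacks are ordinary Python lists, so
    nothing the guest does can touch the host's own state.
    """
    data, address, ip = [], [], 0
    for _ in range(max_steps):
        cell = memory[ip]
        for slot in range(4):
            op = (cell >> (8 * slot)) & 0xFF
            if op == NOP:
                pass
            elif op == LIT:
                ip += 1                    # the value is in the next cell
                data.append(memory[ip])
            elif op == DUP:
                data.append(data[-1])
            elif op == JUMP:
                ip = data.pop() - 1        # -1: ip is advanced after the cell
            elif op == CALL:
                address.append(ip)
                ip = data.pop() - 1
            elif op == RET:
                ip = address.pop()
            elif op == MUL:
                b = data.pop()
                a = data.pop()
                data.append(a * b)
            elif op == END:
                return data
            else:
                raise ValueError("unsupported opcode %d" % op)
        ip += 1
    raise RuntimeError("step limit reached")


# Cells for a program that jumps to main, pushes 35, calls square,
# and stops. They were packed by hand under the assumptions above.
PROGRAM = [1793, 3, 660226, 524545, 35, 2, 26]

if __name__ == "__main__":
    print(run(list(PROGRAM)))
```

Interpreting every guest instruction in the host language is also
where the cost comes from, since each guest step becomes many host
steps.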

So at this point I'm pleased to say that I can now develop
Retro using only an existing copy of Retro (VM+image) and
tools (unu, muri, retro-extend, and a line oriented text
editor) written in Retro.

This project has delivered some additional side benefits.
During the testing I was able to use it to identify a few
bugs in the I/O extensions, and the Nga-in-Retro will replace
the older attempt at this in the debugger, allowing a safer
testing environment.

What issues remain?

The extend process is *slow*. On my main development server
(Linode 1024, OpenBSD 6.4, 64-bit) it takes a bit over five
minutes to complete loading the standard library, and a few
additional minutes depending on the I/O drivers selected.

Most of the performance issues come from running Nga-in-Retro
to isolate the new image from the host one. It'd be possible
to do something a bit more clever (e.g., running a Retro
instance using the new image via a subprocess and piping in
the source, or doing relocations of the data), but this is
less error prone and will work on all systems that I plan to
support (including, with a few minor adjustments, the native
hardware versions [assuming the existence of mass storage]).

Sources:

**Unu**

http://forth.works/c8820f85e0c52d32c7f9f64c28f435c0

gopher://forth.works/0/c8820f85e0c52d32c7f9f64c28f435c0

**Muri**

http://forth.works/09d6c4f3f8ab484a31107dca780058e3

gopher://forth.works/0/09d6c4f3f8ab484a31107dca780058e3

**retro-extend**

http://forth.works/c812416f397af11db58e97388a3238f2

gopher://forth.works/0/c812416f397af11db58e97388a3238f2