retroforth/doc/book/tech-notes/self-hosting

123 lines
4.9 KiB
Text
Raw Normal View History

## The Path to Self Hosting
Retro is an image based Forth system running on a lightweight
virtual machine. This is the story of how that image is made.
The first Retro to use an image based approach was Retro 10.
The earliest images were built using a compiler written in
Toka, an earlier experimental stack language I had written.
It didn't take long to want to drop the dependency on Toka,
so I rewrote the image compiler in Retro and then began
development at a faster pace.
Retro 11 was built using the last Retro 10 image and an
evolved version of the metacompiler. This worked well, but
I eventually found it to be problematic.
One of the issues I faced was the inability to make a new
image from the prior stable release. Since I develop and
test changes incrementally, I reached a point where the
current metacompiler and image required each other. This
wasn't a fatal flaw, but it was annoying.
Perhaps more critical was the fragility of the system. In
R11 small mistakes could result in a corrupt image. The test
suite helped identify some of these, but there were a few
times I was forced to dig back through the version control
history to recover a working image.
The fragile nature was amplified by some design decisions.
In R11, after the initial kernel was built, it would be
moved to memory address 0, then control would jump into the
new kernel to finish building the higher level parts.
Handling this was a tricky task. In R11 almost everything
could be revectored, so the metacompiler had to ensure that
it didn't rely on anything in the old image during the move.
This caused a large number of issues over R11's life.
So on to Retro 12. I decided that this would be different.
First, the kernel would be assembly, with an external tool
to generate the core image. The kernel is in `Rx.md` and the
assembler is `Muri`. To load the standard library, I wrote a
second tool, `Retro-extend`. This separation has allowed me
many fewer headaches as I can make changes more easily and
rebuild from scratch when necessary.
But I miss self-hosting. So last fall I decided to resolve
this. And today I'm pleased to say that it is now done.
There are a few parts to this.
**Unu**. I use a Markdown variation with fenced code blocks.
The tool I wrote in C to extract these is called `unu`. For
a self hosting Retro, I rewrote this as a combinator that
reads in a file and runs another word against each line in the
file. So I could display the code block contents by doing:
'filename [ s:put nl ] unu
This made it easier to implement the other tools.
**Muri**. This is my assembler. It's minimalistic, fast, and
works really well for my purposes. Retro includes a runtime
version of this (using `as{`, `}as`, `i`, `d`, and `r`), so
all I needed for this was to write a few words to parse the
lines and run the corresponding runtime words. As with the C
version, this is a two pass assembler.
Muri generates a new `ngaImage` with the kernel. To create a
full image I needed a way to load in the standard library and
I/O extensions.
This is handled by **retro-extend**. This is where it gets
more complex. I implemented the Nga virtual machine in Retro
to allow this to run the new image in isolation from the
host image. The new ngaImage is loaded, the interpreter is
located, and each token is passed to the interpreter. Once
done, the new image is written to disk.
So at this point I'm pleased to say that I can now develop
Retro using only an existing copy of Retro (VM+image) and
tools (unu, muri, retro-extend, and a line oriented text
editor) written in Retro.
This project has delivered some additional side benefits.
During the testing I was able to use it to identify a few
bugs in the I/O extensions, and the Nga-in-Retro will replace
the older attempt at this in the debugger, allowing a safer
testing environment.
What issues remain?
The extend process is *slow*. On my main development server
(Linode 1024, OpenBSD 6.4, 64-bit) it takes a bit over five
minutes to complete loading the standard library, and a few
additional depending on the I/O drivers selected.
Most of the performance issues come from running Nga-in-Retro
to isolate the new image from the host one. It'd be possible
to do something a bit more clever (e.g., running a Retro
instance using the new image via a subprocess and piping in
the source, or doing relocations of the data), but this is
less error prone and will work on all systems that I plan to
support (including, with a few minor adjustments, the native
hardware versions [assuming the existance of mass storage]).
Sources:
**Unu**
- http://forth.works/c8820f85e0c52d32c7f9f64c28f435c0
- gopher://forth.works/0/c8820f85e0c52d32c7f9f64c28f435c0
**Muri**
- http://forth.works/09d6c4f3f8ab484a31107dca780058e3
- gopher://forth.works/0/09d6c4f3f8ab484a31107dca780058e3
**retro-extend**
- http://forth.works/c812416f397af11db58e97388a3238f2
- gopher://forth.works/0/c812416f397af11db58e97388a3238f2