## The Path to Self Hosting

Retro is an image based Forth system running on a lightweight
virtual machine. This is the story of how that image is made.

The first Retro to use an image based approach was Retro 10.
The earliest images were built using a compiler written in
Toka, an earlier experimental stack language I had written.
It didn't take long before I wanted to drop the dependency on
Toka, so I rewrote the image compiler in Retro and then began
development at a faster pace.

Retro 11 was built using the last Retro 10 image and an
evolved version of the metacompiler. This worked well at
first, but I eventually found it to be problematic.

One of the issues I faced was the inability to make a new
image from the prior stable release. Since I develop and
test changes incrementally, I reached a point where the
current metacompiler and image required each other. This
wasn't a fatal flaw, but it was annoying.

Perhaps more critical was the fragility of the system. In
R11 small mistakes could result in a corrupt image. The test
suite helped identify some of these, but there were a few
times I was forced to dig back through the version control
history to recover a working image.

The fragile nature was amplified by some design decisions.
In R11, after the initial kernel was built, it would be
moved to memory address 0, then control would jump into the
new kernel to finish building the higher level parts.

Handling this was a tricky task. In R11 almost everything
could be revectored, so the metacompiler had to ensure that
it didn't rely on anything in the old image during the move.
This caused a large number of issues over R11's life.

So on to Retro 12. I decided that this would be different.
First, the kernel would be assembly, with an external tool
to generate the core image. The kernel is in `Rx.md` and the
assembler is `Muri`. To load the standard library, I wrote a
second tool, `retro-extend`. This separation has spared me
many headaches, as I can make changes more easily and
rebuild from scratch when necessary.

But I miss self-hosting. So last fall I decided to resolve
this. And today I'm pleased to say that it is now done.

There are a few parts to this.

**Unu**. I use a Markdown variation with fenced code blocks.
The tool I wrote in C to extract these is called `unu`. For
a self-hosting Retro, I rewrote this as a combinator that
reads in a file and runs another word against each line
inside a code block. So I could display the code block
contents by doing:

    'filename [ s:put nl ] unu

This made it easier to implement the other tools.
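
The shape of this combinator can be sketched outside of Retro. What follows is a minimal Python illustration, not the Retro source. It assumes that a code block is delimited by lines containing only `~~~`, and the names `FENCE` and `unu` are just labels chosen for the sketch.

```python
from typing import Callable, Iterable

# Assumption for this sketch: a code block starts and ends with a line
# that contains only this marker.
FENCE = "~~~"


def unu(lines: Iterable[str], action: Callable[[str], None]) -> None:
    """Call `action` on each line that sits inside a fenced code block."""
    in_block = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == FENCE:
            # A fence line toggles the state and is not passed to `action`.
            in_block = not in_block
            continue
        if in_block:
            action(line)


# Usage: print only the code from a small literate document.
document = [
    "Some commentary about the word below.",
    "~~~",
    ":square dup * ;",
    "~~~",
    "More commentary.",
]
unu(document, print)
```

Passing a different callable for `action` (an assembler step, an interpreter feed) is what lets the same extraction loop serve the other tools.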

**Muri**. This is my assembler. It's minimalistic, fast, and
works really well for my purposes. Retro includes a runtime
version of this (using `as{`, `}as`, `i`, `d`, and `r`), so
all I needed for this was to write a few words to parse the
lines and run the corresponding runtime words. As with the C
version, this is a two pass assembler.
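
To show what the two passes buy, here is a small Python sketch of a two-pass assembler. It is an illustration under assumptions, not Muri itself: the line format (a one-character directive followed by its argument), the handful of two-letter instruction names, their opcode numbers, and the packing of four opcodes per cell with the first in the low byte are all assumed for the example.

```python
from typing import Dict, Iterator, List, Tuple

# Assumed subset of instruction names and opcode numbers, for illustration.
OPCODES: Dict[str, int] = {
    "..": 0, "li": 1, "du": 2, "dr": 3, "ju": 7, "ca": 8, "re": 10, "ad": 17,
}


def parse(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (directive, argument) pairs, skipping blank lines."""
    for line in source.splitlines():
        if line.strip():
            yield line[0], line[1:].strip()


def pack(bundle: str) -> int:
    """Pack up to four two-letter instruction names into one cell."""
    bundle = bundle.ljust(8, ".")  # pad short bundles with no-ops
    cell = 0
    for slot in range(4):
        name = bundle[slot * 2:slot * 2 + 2]
        cell |= OPCODES[name] << (8 * slot)
    return cell


def assemble(source: str) -> List[int]:
    items = list(parse(source))

    # Pass 1: each i/d/r line takes one cell, so label addresses can be
    # recorded before any reference needs them.
    labels: Dict[str, int] = {}
    here = 0
    for kind, arg in items:
        if kind == ":":
            labels[arg] = here
        elif kind in ("i", "d", "r"):
            here += 1

    # Pass 2: emit cells, resolving references through the label table.
    image: List[int] = []
    for kind, arg in items:
        if kind == "i":
            image.append(pack(arg))
        elif kind == "d":
            image.append(int(arg))
        elif kind == "r":
            image.append(labels[arg])
    return image


# Usage: `main` is referenced before it is defined, which is exactly the
# case a single pass could not resolve.
program = "i liju....\nr main\n:main\ni liliadre\nd 2\nd 3\n"
image = assemble(program)
```

The forward reference to `main` resolves because the first pass has already fixed every label's address by the time the second pass emits cells.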

Muri generates a new `ngaImage` with the kernel. To create a
full image I needed a way to load in the standard library and
I/O extensions.

This is handled by **retro-extend**. This is where it gets
more complex. I implemented the Nga virtual machine in Retro
so that retro-extend can run the new image in isolation from
the host image. The new ngaImage is loaded, the interpreter is
located, and each token is passed to the interpreter. Once
done, the new image is written to disk.
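
The load, feed, and save steps can be outlined in Python. This is a rough sketch, not how retro-extend is written: it assumes an image is a flat array of 32-bit signed little-endian cells, and it replaces the guest virtual machine with a caller-supplied `interpret` function (a hypothetical stand-in, as is `toy_interpret` below).

```python
import struct
from typing import Callable, List


def bytes_to_cells(data: bytes) -> List[int]:
    """Decode an image, assuming 32-bit signed little-endian cells."""
    count = len(data) // 4
    return list(struct.unpack(f"<{count}i", data[:count * 4]))


def cells_to_bytes(image: List[int]) -> bytes:
    """Encode an image using the same assumed cell layout."""
    return struct.pack(f"<{len(image)}i", *image)


def extend(image_bytes: bytes, source: str,
           interpret: Callable[[List[int], str], None]) -> bytes:
    """Feed each whitespace-delimited token of `source` to the guest."""
    # The guest memory is its own list, so nothing the guest does can
    # touch the host's state.
    guest = bytes_to_cells(image_bytes)
    for token in source.split():
        interpret(guest, token)
    return cells_to_bytes(guest)


def toy_interpret(memory: List[int], token: str) -> None:
    """Hypothetical stand-in for the guest interpreter: append numbers."""
    memory.append(int(token))


# Usage: extend a two-cell image with three more cells.
original = cells_to_bytes([1, 2])
extended = extend(original, "3 4\n5", toy_interpret)
```

In the real tool the `interpret` step is the expensive part, since every token is processed by a virtual machine that is itself running on a virtual machine.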

So at this point I'm pleased to say that I can now develop
Retro using only an existing copy of Retro (VM+image) and
tools (unu, muri, retro-extend, and a line oriented text
editor) written in Retro.

This project has delivered some additional side benefits.
During the testing I was able to use it to identify a few
bugs in the I/O extensions, and the Nga-in-Retro will replace
the older attempt at this in the debugger, allowing a safer
testing environment.

What issues remain?

The extend process is *slow*. On my main development server
(Linode 1024, OpenBSD 6.4, 64-bit) it takes a bit over five
minutes to complete loading the standard library, and a few
additional minutes depending on the I/O drivers selected.

Most of the performance issues come from running Nga-in-Retro
to isolate the new image from the host one. It'd be possible
to do something a bit more clever (e.g., running a Retro
instance using the new image via a subprocess and piping in
the source, or doing relocations of the data), but the current
approach is less error prone and will work on all systems that
I plan to support (including, with a few minor adjustments, the
native hardware versions [assuming the existence of mass storage]).

Sources:

**Unu**

- http://forth.works/c8820f85e0c52d32c7f9f64c28f435c0
- gopher://forth.works/0/c8820f85e0c52d32c7f9f64c28f435c0

**Muri**

- http://forth.works/09d6c4f3f8ab484a31107dca780058e3
- gopher://forth.works/0/09d6c4f3f8ab484a31107dca780058e3

**retro-extend**

- http://forth.works/c812416f397af11db58e97388a3238f2
- gopher://forth.works/0/c812416f397af11db58e97388a3238f2