retroforth/future/utf8.retro
crc bcd5c92b9c small cleanups, remove redundant file
FossilOrigin-Name: e3bff1685f168692bc4a1239ebd44c66c9e35442bcf41efe0399b8e806824a8b
2021-09-10 17:55:07 +00:00

# UTF-8 Characters
UTF-8 allows for characters to be one to four bytes long. Since Retro
is 32-bit internally, all characters can fit into a single entry on
the stack. These words will be used to pack and unpack the character
values.
~~~
:uc:pack (????n-c) ;
:uc:unpack (c-????n) ;
~~~
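As an illustration of how several bytes can share one cell, here is a
minimal sketch. The word `example:pack` is hypothetical, and the layout
(first byte in the low 8 bits, later bytes above it) is an assumption,
not necessarily what `uc:pack` will end up doing. It is fenced as a test
block so it stays out of the regular source.

```
(hypothetical_sketch:_fold_n_bytes_into_one_cell,_first_byte_lowest)
:example:pack (????n-c) n:dec [ #-8 shift or ] times ;

(the_three_bytes_of_U+2190:_226_134_144)
#226 #134 #144 #3 example:pack n:put nl
```

With this layout the lead byte ends up in the low bits, so an unpacking
word could inspect it first to learn how many bytes follow.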
# UTF-8 Strings
Strings in Retro have been C-style null-terminated sequences of ASCII
characters. I'm seeking to change this as I'd like to support Unicode
(UTF-8) and to merge much of the string and array handling code.
This will be an ongoing process.
Temporary sigil: `"` converts the string that follows into an array.
~~~
:sigil:" (-a) a:from-string class:data ; immediate
~~~
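Return the length of a string. `us:length` counts UTF-8 characters by
counting every byte that is not a continuation byte. Each such byte adds
a true flag (-1) to the total, so `n:abs` turns the sum into a positive
count. `us:length/bytes` returns the length in bytes.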
~~~
:us:length (a-n) #0 swap [ #192 and #128 -eq? + ] a:for-each n:abs ;
:us:length/bytes (a-n) a:length ;
~~~
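The check inside `us:length` relies on UTF-8 continuation bytes having
the bit pattern 10xxxxxx: masking a byte with 192 yields 128 only for
those bytes. The sketch below isolates that check. The name
`example:continuation?` is hypothetical and used only here, and the
block is fenced as a test so it stays out of the regular source.

```
(hypothetical_sketch:_true_if_the_byte_is_a_UTF-8_continuation_byte)
:example:continuation? (b-f) #192 and #128 eq? ;

(226_is_the_lead_byte_of_U+2190,_134_is_its_second_byte)
#226 example:continuation? n:put nl
#134 example:continuation? n:put nl
```

The first line should display 0 (false) and the second -1 (true).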
Fetch a character from a string.
~~~
:us:fetch (an-c) ;
~~~
Store a character into a string.
~~~
:us:store (can-) ;
~~~
Tests.
```
"((VV)=V)/V←,V us:length n:put nl
"((VV)=V)/V←,V us:length/bytes n:put nl
```