# UTF-8 Characters

UTF-8 allows for characters to be one to four bytes long. Since Retro
is 32-bit internally, every character can fit into a single entry on
the stack. These words will be used to pack and unpack the character
values.

~~~
:uc:pack (????n-c) ;
:uc:unpack (c-????n) ;
~~~
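To see why a single stack entry is enough, here is a minimal sketch (not the author's `uc:pack`) that decodes one three-byte sequence into its code point. The helper name `demo:append` is hypothetical, only the three-byte case is handled, and the sketch assumes `shift` moves bits to the left when given a negative count.

```
(hypothetical_helper:_add_six_bits_from_a_continuation_byte)
:demo:append (cb-c) #63 and swap #-6 shift or ;

(the_bytes_226_141_179_encode_⍳)
#226 #15 and #141 demo:append #179 demo:append n:put nl
```

This should display 9075, which is U+2373. The largest code point, U+10FFFF, needs only 21 bits, so any decoded character fits in a 32-bit cell.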

# UTF-8 Strings

Strings in Retro have been C-style null-terminated sequences of ASCII
characters. I'm seeking to change this, as I'd like to support Unicode
(UTF-8) and to merge much of the string and array handling code.

This will be an ongoing process.

Temporary sigil: `"` turns the token that follows into an array.

~~~
:sigil:" (-a) a:from-string class:data ; immediate
~~~
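A quick check of the sigil at the interpreter, assuming the definition above has been loaded. The sample text `hello` is only an illustration; because the token is plain ASCII, each character occupies one array entry.

```
"hello a:length n:put nl
```

This should display 5, the number of entries in the array created from the token.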

Return the length of a string, in UTF-8 characters (`us:length`) or in
bytes (`us:length/bytes`).

~~~
:us:length (a-n) #0 swap [ #192 and #128 -eq? + ] a:for-each n:abs ;
:us:length/bytes (a-n) a:length ;
~~~
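The quote in `us:length` relies on the fact that UTF-8 continuation bytes always have `10` as their top two bits. Masking a byte with 192 and comparing the result against 128 therefore separates continuation bytes from lead and ASCII bytes. A small check of that test, using two of the bytes of ⍳ (226 141 179) as sample values:

```
(lead_byte:_counted)
#226 #192 and #128 -eq? n:put nl
(continuation_byte:_skipped)
#141 #192 and #128 -eq? n:put nl
```

These should display -1 and 0. Each counted byte adds -1 (Retro's true flag) to the running total, which is why `us:length` finishes with `n:abs`.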

~~~
~~~

Fetch a character from a string.

~~~
:us:fetch (an-c) ;
~~~

Store a character into a string.

~~~
:us:store (can-) ;
~~~

Tests. The APL expression below has 16 characters stored in 24 bytes
(four of them take three bytes each), so these lines should display 16
and 24.

```
"((V⍳V)=⍳⍴V)/V←,V us:length n:put nl
"((V⍳V)=⍳⍴V)/V←,V us:length/bytes n:put nl
```