retroforth/doc/book/techniques/strings
crc b2ac237c35 update documentation
FossilOrigin-Name: 61bec5c3ab74da3be2fa090a346b667bfa03cea534d8b378b9ac3245412bdc72
2020-01-07 14:09:08 +00:00

176 lines
3.9 KiB
Text

# Working With Strings
Strings in RETRO are NULL terminated sequences of values
representing characters. Being NULL terminated, they can't
contain a NULL (ASCII 0).
The character words in RETRO are built around ASCII, but
strings can contain UTF8 encoded data if the host platform
allows. Words like `s:length` will return the number of bytes,
not the number of logical characters in this case.
## Prefix
Strings begin with a single `'`.
```
'Hello
'This_is_a_string
'This_is_a_much_longer_string_12345_67890_!!!
```
RETRO will replace spaces with underscores. If you need both
spaces and underscores in a string, escape the underscores and
use `s:format`:
```
'This_has_spaces_and_under\_scored_words. s:format
```
## Namespace
Words operating on strings are in the `s:` namespace.
## Lifetime
At the interpreter, strings get allocated in a rotating buffer.
This is used by the words operating on strings, so if you need
to keep them around, use `s:keep` or `s:copy` to move them to
more permanent storage.
In a definition, the string is compiled inline and so is in
permanent memory.
You can manually manage the string lifetime by using `s:keep`
to place it into permanent memory or `s:temp` to copy it to
the rotating buffer.
## Mutability
Strings are mutable. If you need to ensure that a string is
not altered, make a copy before operating on it or see the
individual glossary entries for notes on words that may do
this automatically.
## Searching
RETRO provides four words for searching within a string.
`s:contains-char?`
`s:contains-string?`
`s:index-of`
`s:index-of-string`
## Comparisons
`s:eq?`
`s:case`
## Extraction
To obtain a new string containing the first `n` characters from
a source string, use `s:left`:
```
'Hello_World #5 s:left
```
To obtain a new string containing the last `n` characters from
a source string, use `s:right`:
```
'Hello_World #5 s:right
```
If you need to extract data from the middle of the string, use
`s:substr`. This takes a string, the offset of the first
character, and the number of characters to extract.
```
'Hello_World #3 #5 s:substr
```
## Joining
You can use `s:append` or `s:prepend` to merge two strings.
```
'First 'Second s:append
'Second 'First s:prepend
```
## Tokenization
`s:tokenize`
`s:tokenize-on-string`
`s:split`
`s:split-on-string`
## Conversions
To convert the case of a string, RETRO provides `s:to-lower`
and `s:to-upper`.
`s:to-number` is provided to convert a string to an integer
value. This has a few limitations:
- only supports decimal
- non-numeric characters will result in incorrect values
## Cleanup
RETRO provides a handful of words for cleaning up strings.
`s:chop` will remove the last character from a string. This
is done by replacing it with an ASCII:NULL.
`s:trim` removes leading and trailing whitespace from a string.
For more control, there is also `s:trim-left` and `s:trim-right`
which let you trim just the leading or trailing end as desired.
## Combinators
`s:for-each`
`s:filter`
`s:map`
## Other
`s:evaluate`
`s:copy`
`s:reverse`
`s:hash`
`s:length`
`s:replace`
`s:format`
`s:empty`
## Controlling The Temporary Buffers
As dicussed in the Lifetime subsection, temporary strings are
allocated in a rotating buffer. The details of this can be
altered by updating two variables.
| Variable | Holds |
| ------------- | ---------------------------------------- |
| TempStrings | The number of temporary strings |
| TempStringMax | The maximum length of a temporary string |
For example, to increase the number of temporary strings to
48:
```
#48 !TempStrings
```
The defaults are:
| Variable | Default |
| ------------- | ------- |
| TempStrings | 32 |
| TempStringMax | 512 |
It's also important to note that altering these will affect
the memory map for all temporary buffers. Do not use anything
already in the buffers after updating these or you will risk
data corruption and possible crashes.