retroforth/book/Programming-Techniques-Working-With-Strings
crc 119d6746fd book: add details on a few more string words
FossilOrigin-Name: ed31f6858899516b149f6f35d5cab1abda3b78acc3e571c8f2163d05bcc64848
2019-03-19 12:47:51 +00:00

128 lines
2.6 KiB
Text

# Working With Strings
Strings in RETRO are NULL terminated sequences of values
representing characters. Being NULL terminated, they can't
contain a NULL (ASCII 0).
The character words in RETRO are built around ASCII, but
strings can contain UTF8 encoded data if the host platform
allows. Words like `s:length` will return the number of bytes,
not the number of logical characters in this case.
## Prefix
Strings begin with a single `'`.
```
'Hello
'This_is_a_string
'This_is_a_much_longer_string_12345_67890_!!!
```
RETRO will replace spaces with underscores. If you need both
spaces and underscores in a string, escape the underscores and
use `s:format`:
```
'This_has_spaces_and_under\_scored_words. s:format
```
## Namespace
Words operating on strings are in the `s:` namespace.
## Lifetime
At the interpreter, strings get allocated in a rotating buffer.
This is used by the words operating on strings, so if you need
to keep them around, use `s:keep` or `s:copy` to move them to
more permanent storage.
In a definition, the string is compiled inline and so is in
permanent memory.
You can manually manage the string lifetime by using `s:keep`
to place it into permanent memory or `s:temp` to copy it to
the rotating buffer.
## Mutability
Strings are mutable. If you need to ensure that a string is
not altered, make a copy before operating on it or see the
individual glossary entries for notes on words that may do
this automatically.
## Searching
RETRO provides four words for searching within a string.
`s:contains-char?`
`s:contains-string?`
`s:index-of`
`s:index-of-string`
## Comparisons
`s:eq?`
`s:case`
## Extraction
`s:left`
`s:right`
`s:substr`
## Joining
You can use `s:append` or `s:prepend` to merge two strings.
```
'First 'Second s:append
'Second 'First s:prepend
```
## Tokenization
`s:tokenize`
`s:tokenize-on-string`
`s:split`
`s:split-on-string`
## Conversions
To convert the case of a string, RETRO provides `s:to-lower`
and `s:to-upper`.
`s:to-number` is provided to convert a string to an integer
value. This has a few limitations:
- only supports decimal
- non-numeric characters will result in incorrect values
## Cleanup
RETRO provides a handful of words for cleaning up strings.
`s:chop` will remove the last character from a string. This
is done by replacing it with an ASCII:NULL.
`s:trim` removes leading and trailing whitespace from a string.
For more control, there is also `s:trim-left` and `s:trim-right`
which let you trim just the leading or trailing end as desired.
## Combinators
`s:for-each`
`s:filter`
`s:map`
## Other
`s:evaluate`
`s:copy`
`s:reverse`
`s:hash`
`s:length`
`s:replace`
`s:format`
`s:empty`