<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<title>.</title>
<style type="text/css">

* { color: #000; background: #fff; max-width: 700px; }
tt, pre { background: #dedede; color: #111; font-family: monospace;
white-space: pre; display: block; width: 100%; }
.indentedcode { margin-left: 2em; margin-right: 2em; }
.codeblock {
background: #dedede; color: #111; font-family: monospace;
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
padding: 7px;
display: block;
}

.indentedlist { margin-left: 2em; color: #000; }

span { white-space: pre; }
.text { color: #000; white-space: pre; background: #dedede; }
.colon { color: #000; background: #dedede; }
.note { color: #000; background: #dedede; }
.str { color: #000; text-decoration: underline; background: #dedede; }
.num { color: #000; background: #dedede; font-weight: bold; font-style: italic; }
.fnum { color: #000; font-weight: bold; background: #dedede; }
.ptr { color: #000; font-weight: bold; background: #dedede; }
.fetch { color: #000; font-style: italic; background: #dedede; }
.store { color: #000; font-style: italic; background: #dedede; }
.char { color: #000; background: #dedede; }
.inst { color: #000; background: #dedede; }
.defer { color: #000; background: #dedede; }
.imm { color: #000; font-weight: bold; background: #dedede; }
.prim { color: #000; font-weight: bolder; background: #dedede; }

.tt { white-space: pre; font-family: monospace; background: #dedede; }

.h1, .h2, .h3, .h4 { white-space: normal; }
.h1 { font-size: 125%; }
.h2 { font-size: 120%; }
.h3 { font-size: 115%; }
.h4 { font-size: 110%; }
.hr { display: block; height: 2px; background: #000000; }
</style>
</head><body>
<p><span class="h1">Working With Strings</span>
<br/><br/>
Strings in RETRO are NULL terminated sequences of values
representing characters. Being NULL terminated, they can't
contain a NULL (ASCII 0).
<br/><br/>
The character words in RETRO are built around ASCII, but
strings can contain UTF-8 encoded data if the host platform
allows. In this case, words like <span class="tt">s:length</span> will return the
number of bytes, not the number of logical characters.
<br/><br/>
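As a small illustration of the byte count, assuming plain ASCII
input (the stack comment is an annotation added here, not part of
the original text):
<br/><br/>
<tt class='indentedcode'>'Hello s:length  (leaves_5,_one_per_byte)</tt>
<br/><br/>
With UTF-8 data, a character encoded as several bytes would be
counted once per byte, so the result can exceed the number of
visible characters.
<br/><br/>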
<span class="h2">Prefix</span>
<br/><br/>
Strings begin with a single <span class="tt">'</span>.
<br/><br/>
<tt class='indentedcode'>'Hello</tt>
<tt class='indentedcode'>'This_is_a_string</tt>
<tt class='indentedcode'>'This_is_a_much_longer_string_12345_67890_!!!</tt>
<br/><br/>
RETRO will replace underscores with spaces. If you need both
spaces and underscores in a string, escape the underscores and
use <span class="tt">s:format</span>:
<br/><br/>
<tt class='indentedcode'>'This_has_spaces_and_under\_scored_words. s:format</tt>
<br/><br/>
<span class="h2">Namespace</span>
<br/><br/>
Words operating on strings are in the <span class="tt">s:</span> namespace.
<br/><br/>
<span class="h2">Lifetime</span>
<br/><br/>
At the interpreter, strings get allocated in a rotating buffer.
This buffer is also used by the words operating on strings, so
a string left there will eventually be overwritten. If you need
to keep one around, use <span class="tt">s:keep</span> or <span class="tt">s:copy</span> to move it to
more permanent storage.
<br/><br/>
In a definition, the string is compiled inline and so is in
permanent memory.
<br/><br/>
You can manually manage the string lifetime by using <span class="tt">s:keep</span>
to place it into permanent memory or <span class="tt">s:temp</span> to copy it to
the rotating buffer.
<br/><br/>
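A minimal sketch of these rules follows. The display words <span class="tt">s:put</span>
and <span class="tt">nl</span> and the name <span class="tt">greet</span> are assumptions made for
illustration; they are not mentioned in the text above.
<br/><br/>
<tt class='indentedcode'>(interpreted:_held_in_the_rotating_buffer)</tt>
<tt class='indentedcode'>'Hello_World s:put nl</tt>
<tt class='indentedcode'>(s:keep_leaves_a_pointer_to_a_permanent_copy)</tt>
<tt class='indentedcode'>'Hello_World s:keep</tt>
<tt class='indentedcode'>(compiled_inline,_so_already_permanent)</tt>
<tt class='indentedcode'>:greet 'Hello_World s:put nl ;</tt>
<tt class='indentedcode'>greet</tt>
<br/><br/>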
<span class="h2">Mutability</span>
<br/><br/>
Strings are mutable. If you need to ensure that a string is
not altered, make a copy before operating on it, or check the
individual glossary entries for notes on words that make a
copy automatically.
<br/><br/>
<span class="h2">Searching</span>
<br/><br/>
RETRO provides four words for searching within a string.
<br/><br/>
• <span class="tt">s:contains-char?</span><br/>
• <span class="tt">s:contains-string?</span><br/>
• <span class="tt">s:index-of</span><br/>
• <span class="tt">s:index-of-string</span><br/>
<br/><br/>
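The sketch below shows how these words might be called. The
stack effects in the comments are assumptions, as is the use of
the <span class="tt">$</span> prefix for a character; check the glossary entries
before relying on them.
<br/><br/>
<tt class='indentedcode'>'Hello_World $W s:contains-char?        (assumed:_sc-f)</tt>
<tt class='indentedcode'>'Hello_World 'World s:contains-string?  (assumed:_ss-f)</tt>
<tt class='indentedcode'>'Hello_World $W s:index-of              (assumed:_sc-n)</tt>
<tt class='indentedcode'>'Hello_World 'World s:index-of-string   (assumed:_ss-n)</tt>
<br/><br/>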
<span class="h2">Comparisons</span>
<br/><br/>
• <span class="tt">s:eq?</span><br/>
• <span class="tt">s:case</span><br/>
<br/><br/>
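As a hedged sketch of how these might be used: <span class="tt">s:eq?</span> is assumed
to take two strings and leave a flag, and <span class="tt">s:case</span> is assumed to
run its quote and exit the caller when the strings match, leaving
the original string on the stack otherwise. The word <span class="tt">respond</span>
is hypothetical, and <span class="tt">s:put</span> and <span class="tt">nl</span> are assumed display words.
<br/><br/>
<tt class='indentedcode'>'hello 'hello s:eq?  (assumed:_ss-f)</tt>
<tt class='indentedcode'>:respond (s-)</tt>
<tt class='indentedcode'>  'yes [ 'Confirmed s:put nl ] s:case</tt>
<tt class='indentedcode'>  'no  [ 'Declined  s:put nl ] s:case</tt>
<tt class='indentedcode'>  drop ;</tt>
<tt class='indentedcode'>'yes respond</tt>
<br/><br/>
The trailing <span class="tt">drop</span> discards the string when no case matches.
<br/><br/>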
<span class="h2">Extraction</span>
<br/><br/>
To obtain a new string containing the first <span class="tt">n</span> characters from
a source string, use <span class="tt">s:left</span>:
<br/><br/>
<tt class='indentedcode'>'Hello_World #5 s:left</tt>
<br/><br/>
To obtain a new string containing the last <span class="tt">n</span> characters from
a source string, use <span class="tt">s:right</span>:
<br/><br/>
<tt class='indentedcode'>'Hello_World #5 s:right</tt>
<br/><br/>
If you need to extract data from the middle of the string, use
<span class="tt">s:substr</span>. This takes a string, the offset of the first
character, and the number of characters to extract.
<br/><br/>
<tt class='indentedcode'>'Hello_World #3 #5 s:substr</tt>
<br/><br/>
<span class="h2">Joining</span>
<br/><br/>
You can use <span class="tt">s:append</span> or <span class="tt">s:prepend</span> to merge two strings.
<br/><br/>
<tt class='indentedcode'>'First 'Second s:append</tt>
<tt class='indentedcode'>'Second 'First s:prepend</tt>
<br/><br/>
<span class="h2">Tokenization</span>
<br/><br/>
• <span class="tt">s:tokenize</span><br/>
• <span class="tt">s:tokenize-on-string</span><br/>
• <span class="tt">s:split</span><br/>
• <span class="tt">s:split-on-string</span><br/>
<br/><br/>
<span class="h2">Conversions</span>
<br/><br/>
To convert the case of a string, RETRO provides <span class="tt">s:to-lower</span>
and <span class="tt">s:to-upper</span>.
<br/><br/>
<span class="tt">s:to-number</span> is provided to convert a string to an integer
value. This has a few limitations:
<br/><br/>
• only decimal is supported<br/>
• non-numeric characters will result in incorrect values<br/>
<br/><br/>
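A short sketch of these conversions. The comments describe the
expected results for ASCII input and are annotations added here:
<br/><br/>
<tt class='indentedcode'>'Hello_World s:to-upper  (leaves_HELLO_WORLD)</tt>
<tt class='indentedcode'>'Hello_World s:to-lower  (leaves_hello_world)</tt>
<tt class='indentedcode'>'12345 s:to-number #1 +  (leaves_12346)</tt>
<br/><br/>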
<span class="h2">Cleanup</span>
<br/><br/>
RETRO provides a handful of words for cleaning up strings.
<br/><br/>
<span class="tt">s:chop</span> will remove the last character from a string. This
is done by replacing it with an ASCII:NULL.
<br/><br/>
<span class="tt">s:trim</span> removes leading and trailing whitespace from a string.
For more control, there are also <span class="tt">s:trim-left</span> and <span class="tt">s:trim-right</span>,
which let you trim just the leading or trailing end as desired.
<br/><br/>
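For illustration, the following sketch applies each word. Recall
that underscores in a string literal become spaces, so <span class="tt">'__Hello__</span>
is the text Hello with two spaces on each side. The comments are
annotations describing the expected results.
<br/><br/>
<tt class='indentedcode'>'Hello! s:chop            (leaves_Hello)</tt>
<tt class='indentedcode'>'__Hello__ s:trim         (leaves_Hello)</tt>
<tt class='indentedcode'>'__Hello__ s:trim-left    (leaves_Hello_plus_two_trailing_spaces)</tt>
<tt class='indentedcode'>'__Hello__ s:trim-right   (leaves_two_leading_spaces_plus_Hello)</tt>
<br/><br/>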
<span class="h2">Combinators</span>
<br/><br/>
• <span class="tt">s:for-each</span><br/>
• <span class="tt">s:filter</span><br/>
• <span class="tt">s:map</span><br/>
<br/><br/>
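A hedged sketch of the combinators: each is assumed to take a
string and a quote that is run once per character. The character
words <span class="tt">c:put</span>, <span class="tt">c:vowel?</span>, and <span class="tt">c:to-upper</span> are assumptions used for
illustration and are not described in this section.
<br/><br/>
<tt class='indentedcode'>'Hello [ c:put ] s:for-each         (displays_each_character)</tt>
<tt class='indentedcode'>'Hello_World [ c:vowel? ] s:filter  (keeps_only_the_vowels)</tt>
<tt class='indentedcode'>'Hello [ c:to-upper ] s:map         (transforms_each_character)</tt>
<br/><br/>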
<span class="h2">Other</span>
<br/><br/>
• <span class="tt">s:evaluate</span><br/>
• <span class="tt">s:copy</span><br/>
• <span class="tt">s:reverse</span><br/>
• <span class="tt">s:hash</span><br/>
• <span class="tt">s:length</span><br/>
• <span class="tt">s:replace</span><br/>
• <span class="tt">s:format</span><br/>
• <span class="tt">s:empty</span><br/>
<br/><br/>
<span class="h2">Controlling The Temporary Buffers</span>
<br/><br/>
As discussed in the Lifetime subsection, temporary strings are
allocated in a rotating buffer. The details of this can be
altered by updating two variables.
<br/><br/>
<tt class='indentedcode'>| Variable      | Holds                                    |</tt>
<tt class='indentedcode'>| ------------- | ---------------------------------------- |</tt>
<tt class='indentedcode'>| TempStrings   | The number of temporary strings          |</tt>
<tt class='indentedcode'>| TempStringMax | The maximum length of a temporary string |</tt>
<br/><br/>
For example, to increase the number of temporary strings to
48:
<br/><br/>
<tt class='indentedcode'>#48 !TempStrings</tt>
<br/><br/>
The defaults are:
<br/><br/>
<tt class='indentedcode'>| Variable      | Default |</tt>
<tt class='indentedcode'>| ------------- | ------- |</tt>
<tt class='indentedcode'>| TempStrings   | 32      |</tt>
<tt class='indentedcode'>| TempStringMax | 512     |</tt>
<br/><br/>
Note that altering these changes the memory map for all
temporary buffers. Do not use anything already in the buffers
after updating them, or you will risk data corruption and
possible crashes.
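<br/><br/>
One cautious ordering, sketched under the assumption that copies
made with <span class="tt">s:keep</span> live outside the temporary buffers and so are
not disturbed: preserve anything still needed, then resize. The
value 1024 is a hypothetical choice for illustration.
<br/><br/>
<tt class='indentedcode'>'Text_needed_later s:keep</tt>
<tt class='indentedcode'>#48 !TempStrings</tt>
<tt class='indentedcode'>#1024 !TempStringMax</tt>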
</p>
</body></html>