add s:tokenize-on-string to stdlib

FossilOrigin-Name: fbbc60112e0011614b639c24a061eca41d3924b93f9a8e02c0685243e888c3a9
crc 2017-11-13 13:05:46 +00:00
parent 990066ac08
commit 9e7c27fcc7
5 changed files with 34 additions and 2 deletions
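
For readers who do not have a RETRO image to hand, the contract that the new glossary entry describes for `s:tokenize-on-string` (stack effect `ss-a`: a string and a separator string in, a set of substrings out) can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not the RETRO implementation: the function name `tokenize_on_string` is hypothetical, and the model assumes the whole separator is skipped after each match, which is what the glossary text implies. Whether the RETRO definition in this commit does the same for separators longer than one character is not verified here.

```python
from typing import List


def tokenize_on_string(text: str, separator: str) -> List[str]:
    """Split `text` on every occurrence of `separator` (hypothetical model)."""
    if not separator:
        # An empty separator would never advance through the text.
        raise ValueError("separator must be non-empty")

    tokens: List[str] = []
    rest = text
    # Loop while the separator still occurs (the role of `-match?`).
    while separator in rest:
        # Take the part before the first match as a token and continue
        # with the part after the full separator (assumed behaviour).
        token, rest = rest.split(separator, 1)
        tokens.append(token)
    # Whatever remains after the last separator is the final token.
    tokens.append(rest)
    return tokens


print(tokenize_on_string("one--two--three", "--"))  # ['one', 'two', 'three']
```

A string with no separator in it comes back as a single token, which matches the loop-then-store-remainder shape of the RETRO definition.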


@ -3433,6 +3433,18 @@ Class Handler: class:word | Namespace: {n/a} | Interface Layer: {n/a}
----------------------------------------------------------------
s:tokenize-on-string
Data: ss-a
Addr: -
Float: -
Takes a string (s1) and a substring (s2) to use as a separator. It splits the string into a set of substrings and returns a set containing pointers to each of them.
Class Handler: class:word | Namespace: {n/a} | Interface Layer: {n/a}
----------------------------------------------------------------
s:trim
Data: s-s

File diff suppressed because one or more lines are too long


@ -925,6 +925,25 @@ pointers to each of them.
}}
~~~
`s:tokenize-on-string` is like `s:tokenize`, but the separator is a string rather than a single character.
~~~
{{
'Tokens var
'Needle var
:-match? (s-sf) dup @Needle s:contains-string? ;
:save-token (s-s) @Needle s:split-on-string s:keep buffer:add n:inc ;
:tokens-to-set (-a) here @Tokens buffer:size dup , [ fetch-next , ] times drop ;
---reveal---
:s:tokenize-on-string (ss-a)
[ s:keep !Needle here #8192 + !Tokens
@Tokens buffer:set
[ repeat -match? 0; drop save-token again ] call s:keep buffer:add
tokens-to-set ] buffer:preserve ;
}}
~~~
Ok, this is a bit of a hack, but very useful at times.
Assume you have a bunch of values:

BIN
ngaImage

Binary file not shown.


@ -277,6 +277,7 @@ s:to-lower s-s - - Convert uppercase ASCII characters in a string to lowercase.
s:to-number s-n - - Convert a string to a number. class:word {n/a} {n/a} s all
s:to-upper s-s - - Convert lowercase ASCII characters in a string to uppercase. class:word {n/a} {n/a} s all
s:tokenize sc-a - - Takes a string and a character to use as a separator. It splits the string into a set of substrings and returns a set containing pointers to each of them. class:word {n/a} {n/a} {n/a} {n/a}
s:tokenize-on-string ss-a - - Takes a string (s1) and a substring (s2) to use as a separator. It splits the string into a set of substrings and returns a set containing pointers to each of them. class:word {n/a} {n/a} {n/a} {n/a}
s:trim s-s - - Trim leading and trailing whitespace from a string. class:word {n/a} {n/a} s all
s:trim-left s-s - - Trim leading whitespace from a string. class:word {n/a} {n/a} s all
s:trim-right s-s - - Trim trailing whitespace from a string. class:word {n/a} {n/a} s all
