Resistance is futile. Rebol is fun.: Collapse and Capture a Repeating Pattern

This particular Stack Overflow question shows how much easier parse can be than regex. (It was so simple with parse that this post took me about 5 minutes to write :-)

This is what the OP says:

I keep bumping into situations where I need to capture a number of tokens from a string and after countless tries I couldn't find a way to simplify the process.

This is what the input looks like:

start:test-test-lorem-ipsum-sir-doloret-etc-etc-something:end

How does parse deal with it?

Simple. 'parse <string> none' splits a string at the spaces. Similarly, 'parse/all <string> "x"' splits a string at the character "x", whatever it might be. You can even give several characters, like "xyz", and the string will be split at any one of them. Without the /all refinement, whitespace still counts as a delimiter in addition to the characters you give. See:

	>> parse "Off-the-charts On-the-asphalt" "-" ;; No parse/all here
	== ["Off" "the" "charts" "On" "the" "asphalt"]
	>> parse/all "Off-the-charts On-the-asphalt" "-"
	== ["Off" "the" "charts On" "the" "asphalt"]
	>> parse/all "Off-the-charts On-the-asphalt" "- " ;; Single dash and space
	== ["Off" "the" "charts" "On" "the" "asphalt"]
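
The 'parse <string> none' form mentioned above is not shown in the examples, so here is a minimal sketch of it, assuming Rebol 2 (the version whose parse does this kind of splitting). With none as the rule, parse splits on whitespace and, going by the Rebol/Core guide, on commas and semicolons as well. The word `words` and the sample strings are made up for illustration.

```rebol
;; Rebol 2: a none rule splits on the default delimiters
words: parse "Off the charts" none
probe words   ;; ["Off" "the" "charts"]

;; commas and semicolons count as delimiters too
probe parse "here there,everywhere; ok" none   ;; ["here" "there" "everywhere" "ok"]
```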

This is how we can get strings from the given situation using parse:

	divide-string-by-dash: func [str [string!]] [
	    return (parse/all str "-")
	]

Usage:

	>> divide-string-by-dash "Long-live-regexps"
	== ["Long" "live" "regexps"]

OP's string

Now, the OP's string starts with "start:" and ends with ":end", and we need to account for that. There are many ways of doing this. Different people will do it differently, but this is what I'll do:

	divide-string-by-dash: func [str [string!] /local str-block] [
	    str-block: parse/all str ":" ;; result: ["start" "<string as needed>" "end"]
	    return (parse/all (second str-block) "-")
	]

Lines 2 and 3 can be combined into a single line, like this:

	divide-string-by-dash: func [str [string!]] [
	    return (parse/all (second (parse/all str ":")) "-")
	]

Here is another version if you don't like too many parentheses:

	divide-string-by-dash: func [str [string!]] [
	    return parse/all second parse/all str ":" "-"
	]

Usage (same for all 3 versions of the function):

	>> divide-string-by-dash "start:test-test-lorem-ipsum-sir-doloret-etc-etc-something:end"
	== ["test" "test" "lorem" "ipsum" "sir" "doloret" "etc" "etc" "something"]
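
As one more of the "many ways" mentioned earlier, the same result can be had with a parse rule block instead of two splits. This is only a sketch, assuming Rebol 2. The function name `capture-tokens` and the local words `tokens` and `token` are made up for this example and do not come from the OP's question. Unlike the versions above, it also checks that the string really starts with "start:" and ends with ":end", and returns none otherwise. Empty tokens (two dashes in a row) are not handled.

```rebol
;; hypothetical helper, not from the original question
capture-tokens: func [str [string!] /local tokens token] [
    tokens: copy []
    either parse/all str [
        "start:"
        ;; every token that is followed by a dash
        any [copy token to "-" skip (append tokens token)]
        ;; the last token runs up to the closing marker
        copy token to ":end" (append tokens token)
        ":end"
    ] [tokens] [none]
]

probe capture-tokens "start:test-test-lorem:end"   ;; ["test" "test" "lorem"]
probe capture-tokens "no-markers-here"             ;; none
```

The one-line split versions are shorter. The rule block is mostly worth it when the "start:" and ":end" wrapper should be validated too.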

Feel free to ask or say anything.

Friday, 13 December 2013
