Friday 13 December 2013

Replacing punctuation with nothing

Time and again we need to remove certain characters from a string, as in this Stack Overflow question. Needless to say, Rebol makes this easy and intuitive.

Why only punctuation?

For this discussion, punctuation is something like a tilde or a needless comma. We can also remove other things, such as the word ji that we Indians, and the word san that the Japanese, like to add after people's names, or a spelling mistake repeated throughout a string. Read on.

replace/all

If there is only one word or character that we have to replace, we can use replace/all. We can replace either with an empty string or with some other word or character. Like:
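A minimal sketch of such calls, assuming Rebol 2. The sample text is made up for illustration. Note that replace changes the string in place, so the literal is copied first.

```rebol
text: copy "Hello, wor~ld"

replace/all text "~" ""     ; remove every tilde by replacing it with an empty string
replace/all text "," ";"    ; replace every comma with some other character

print text                  ; Hello; world
```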

Removing stuff

We will create a function that will take two inputs: a string! and a block!. The block will contain punctuation that we want to remove as strings. For each string in the block, we will apply replace/all on the string. See:
See forall if you are confused about lines 5 and 6. (You can have a look at forskip too.)
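The following is a rough sketch of a function along these lines, assuming Rebol 2. It is an illustration only: the name remove-all and its argument names are hypothetical, and its line numbers do not correspond to the ones mentioned above.

```rebol
remove-all: func [
    "Return a copy of text with every string listed in items removed."
    text  [string!]
    items [block!]
][
    text: copy text                       ; leave the caller's string untouched
    forall items [
        ; first items is the element at the current position of the loop
        replace/all text first items ""
    ]
    text
]

print remove-all "Hello, ~world~!" ["," "~"]    ; Hello world!
```

Copying the input first is a design choice: replace/all modifies its target, and callers usually do not expect their original string to change.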

Usage:

If you want, you can extend the function to replace with some other string instead of removing. You can also let it accept either a single string or a block of strings as the argument.

If there is anything you wish to say or ask, feel free to use the comments. Also, feel free to tell a better way of doing it. :-)

Collapse and Capture a Repeating Pattern

This particular Stack Overflow question shows the ease of use of parse over regex. It was so simple with parse that this post took me about 5 minutes to write. :-)

This is what the OP says:
I keep bumping into situations where I need to capture a number of tokens from a string and after countless tries I couldn't find a way to simplify the process.
This is what the input looks like:

How does parse deal with it? 

Simple. 'parse <string> none' splits a string at whitespace. Similarly, 'parse/all <string> "x"' splits a string at the character "x", whatever it might be. You can also give several delimiter characters, like "xyz", and the string will be split at any of them. See:
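A short sketch of these three forms, assuming Rebol 2. The sample strings are made up for illustration.

```rebol
; With none as the rule, parse splits at whitespace
a: parse "alpha beta   gamma" none
probe a        ; ["alpha" "beta" "gamma"]

; With /all, only the characters in the delimiter string split the input
b: parse/all "alpha-beta-gamma" "-"
probe b        ; ["alpha" "beta" "gamma"]

; Several delimiter characters can be listed together
c: parse/all "alpha-beta_gamma" "-_"
probe c        ; ["alpha" "beta" "gamma"]
```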

This is how we can get strings from the given situation using parse:

Usage:

OP's string

Now, the OP's string starts with "start:" and ends with ":end", and we need to account for that. There are many ways of doing this. Different people will do it differently, but this is what I'll do:

You can use a single line for lines 2 and 3. Like:

Here is another version if you don't like too many parentheses:

Usage (same for all 3 versions of the function):
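As a self-contained illustration of the idea in this section, here is one possible sketch in Rebol 2. It is not one of the three versions referred to above. The function name get-tokens, the "-" delimiter and the sample input are assumptions made for the example.

```rebol
get-tokens: func [
    "Return the tokens between start: and :end, or none if the markers are missing."
    text [string!]
    /local body
][
    ; First check the markers and copy whatever sits between them
    either parse/all text ["start:" copy body to ":end" ":end"] [
        ; body is none when nothing sits between the markers
        parse/all any [body ""] "-"
    ][
        none
    ]
]

probe get-tokens "start:a-b-c:end"    ; ["a" "b" "c"]
probe get-tokens "a-b-c"              ; none
```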

Feel free to ask or say anything.

Thursday 12 December 2013

Count string occurrence in string

This is from a JavaScript question on Stack Overflow.

You just have to see rebolek's answer here to know what we are doing. The function below is the same with minor modifications.

Code:

Usage:

Keep in mind, though, that this code will not work when case sensitivity is a concern, because the matching ignores case. Also, overlapping matches are an issue (e.g. in "aaa", you will get 1 occurrence of "aa" instead of 2).
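A rough sketch of a parse-based counter in this spirit, assuming Rebol 2. It is not a copy of rebolek's answer, and both function names are hypothetical. The second function shows one possible way to count overlapping matches with find instead of parse.

```rebol
count-occurrences: func [
    "Count non-overlapping, case-insensitive occurrences of pattern in text."
    text    [string!]
    pattern [string!]
    /local n
][
    if empty? pattern [return 0]          ; an empty pattern has nothing to count
    n: 0
    ; each successful thru jumps past one match, so matches cannot overlap
    parse/all text [any [thru pattern (n: n + 1)] to end]
    n
]

count-overlapping: func [
    "Count occurrences of pattern in text, including overlapping ones."
    text    [string!]
    pattern [string!]
    /local n pos
][
    if empty? pattern [return 0]
    n: 0
    pos: text
    ; after each match, move on by one character only
    while [pos: find pos pattern] [
        n: n + 1
        pos: next pos
    ]
    n
]

print count-occurrences "Foo foo FOO" "foo"    ; 3
print count-occurrences "aaa" "aa"             ; 1
print count-overlapping "aaa" "aa"             ; 2
```

For case-sensitive counting, parse/case and find/case could be used in place of parse and find.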

Wednesday 11 December 2013

Replacing multiple spaces with single space

I am going through the highest-voted regular expression and string questions on Stack Overflow to show parse usage in Rebol 2. It goes without saying that these questions are about languages other than Rebol.

First question: Replacing multiple spaces with single space.

How to do this?

  1. Split the string with respect to spaces, and get the result in a block.
  2. Append the strings in the block one by one, with a single space in between.

1. Splitting a string

Splitting a string using parse is simple. Just provide 'none' as the rule. parse then returns the pieces in a block.

Code:
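A minimal sketch of this step, assuming Rebol 2 and a made-up sample string. One caveat worth knowing: with none as the rule, parse also treats commas and semicolons as delimiters, so this approach suits plain words best.

```rebol
words: parse "this   has    extra spaces" none
probe words              ; ["this" "has" "extra" "spaces"]
print length? words      ; 4

; Commas and semicolons are delimiters too
probe parse "here there,everywhere; ok" none    ; ["here" "there" "everywhere" "ok"]
```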

2. Appending strings in block with spaces

Rebol has forall and append for us. In this example, we use "-" as a divider between words.

Code:
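A small sketch of joining with forall and append, assuming Rebol 2. The word names and sample block are made up for illustration.

```rebol
words: ["one" "two" "three"]
result: copy ""

forall words [
    append result first words                     ; the word at the current position
    if not tail? next words [append result "-"]   ; no divider after the last word
]

print result    ; one-two-three
```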

Function:

Now we connect the two parts. 

Code:

Line 2 divides the string by spaces and puts it in a block (no-space-block). Line 3 initializes an empty string. Line 5 makes sure that return-string is initialized with the first word. Line 6 ensures that the first word is not processed again in the loop later.
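The two parts could be combined roughly as follows, assuming Rebol 2. This is a sketch only: the function name single-space is hypothetical, the line numbers above do not refer to it, and the words no-space-block and return-string are borrowed from the description.

```rebol
single-space: func [
    "Return text with every run of whitespace collapsed to one space."
    text [string!]
    /local no-space-block return-string
][
    no-space-block: parse text none       ; split at whitespace
    return-string: copy ""                ; start with an empty string
    if empty? no-space-block [return return-string]

    append return-string first no-space-block    ; begin with the first word
    no-space-block: next no-space-block          ; skip it in the loop below

    forall no-space-block [
        append return-string " "
        append return-string first no-space-block
    ]
    return-string
]

print single-space "too    many     spaces   here"    ; too many spaces here
```

Because parse with none also splits at commas and semicolons, this sketch is only suitable for text that does not contain them.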

Usage:

Feel free to ask anything in comments.