PARSE defaults to breaking off tokens at blank spaces.

If you wish the value to be tokenized at characters other than a blank space, you can specify a literal string (we'll call it the search string) after the token variable name. For example, assume that My_Variable's value is the string "This is some data; separated by a semi-colon, and then a comma". You wish to break this up into 3 tokens: the first token ends at the semi-colon (i.e., "This is some data"), the second token ends at the comma (i.e., "separated by a semi-colon"), and the last token is "and then a comma".

Here's how we do that:

/* Parse My_Variable into 3 tokens, broken at ; and , */
My_Variable = "This is some data;   separated by a semi-colon, and then a comma"
PARSE VAR My_Variable token.1 ';' token.2 ',' token.3
SAY "token.1 = '" || token.1 || "'"
SAY "token.2 = '" || token.2 || "'"
SAY "token.3 = '" || token.3 || "'"
Note that since we aren't tokenizing by blank space, REXX doesn't trim the leading spaces from the second and third tokens. You'd have to use the STRIP() function afterward to do that.
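
As a minimal sketch of that cleanup step, the listing below repeats the parse and then passes each token through STRIP(). It assumes STRIP() is called with its default option, which removes both leading and trailing blanks.

```rexx
/* Sketch: trim the blanks that pattern parsing leaves on each token */
My_Variable = "This is some data;   separated by a semi-colon, and then a comma"
PARSE VAR My_Variable token.1 ';' token.2 ',' token.3
DO i = 1 TO 3
   /* With no option given, STRIP removes leading and trailing blanks */
   token.i = STRIP(token.i)
   SAY "token."i "= '"token.i"'"
END
```

After the loop, token.2 is "separated by a semi-colon" and token.3 is "and then a comma", with no leading spaces.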

You can tokenize by entire strings (i.e., patterns). For example, here we break off the first token at the string "some", and the second token at the string "more".

/* Parse My_Variable at the string "some" and "more" */
My_Variable = "This is some data; and this is more"
PARSE VAR My_Variable token.1 "some" token.2 "more" token.3
DO i=1 TO 3
   SAY "token."i "= '"token.i"'"
END
Note that the search strings some and more are completely removed from the tokens. If some did not appear in My_Variable at all, then the first token would contain the entire string, and the second and third tokens would be empty strings. If some appeared, but more did not, then the first token would be "This is ", the second token would be the remaining data, and the third token would be an empty string.
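
To see the missing-pattern behavior for yourself, here is a small sketch. The source string is made up for illustration: it contains "some" but not "more".

```rexx
/* Sketch: the second search string "more" is absent from the source */
My_Variable = "This is some data; and nothing else"
PARSE VAR My_Variable token.1 "some" token.2 "more" token.3
DO i = 1 TO 3
   SAY "token."i "= '"token.i"'"
END
```

Here token.1 is "This is ", token.2 receives the remaining text " data; and nothing else", and token.3 is an empty string.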

It's possible to tokenize the source data appearing in between search strings. For example:

/* Parse My_Variable between "some" and "more" */
My_Variable = "This is some data; and this is more"
PARSE VAR My_Variable token.1 "some" token.1 token.2 token.3 "more"
DO i=1 TO 3
   SAY "token."i "= '"token.i"'"
END
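The above tokenizes everything between "some" and "more" by blank space, and places the tokens in token.1 (i.e., "data;"), token.2 (i.e., "and"), and token.3 (i.e., "this is ", including the trailing space). Note that token.1 appears twice in the template. It is first assigned the text before "some" (i.e., "This is "), but because REXX processes the template from left to right, that value is immediately overwritten by the first word after "some".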
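
If reusing token.1 as a throwaway feels awkward, one alternative is the period placeholder, which standard REXX templates accept in place of a variable name to discard a piece of the data. The sketch below is a variation on the example above, not a replacement for it.

```rexx
/* Sketch: discard the text before "some" with a period placeholder */
My_Variable = "This is some data; and this is more"
PARSE VAR My_Variable . "some" token.1 token.2 token.3 "more"
DO i = 1 TO 3
   SAY "token."i "= '"token.i"'"
END
```

The three tokens come out the same as before: "data;", "and", and "this is ".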


A variable's value as a search string

You can also supply the search string through a variable. You put the name of the variable in parentheses where you would otherwise put the literal search string:

/* Parse My_Variable between "some" and "more", using variables as search strings */
My_Variable = "This is some data; and this is more"
breakoff1 = "some"
breakoff2 = "more"
PARSE VAR My_Variable token.1 (breakoff1) token.1 token.2 token.3 (breakoff2)
DO i=1 TO 3
   SAY "token."i "= '"token.i"'"
END
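
A variable search string is handy when the delimiter isn't known until run time. The following sketch splits a delimited list one piece at a time. The names list, delim, piece, part. and count are all invented for this illustration.

```rexx
/* Sketch: split a list at a delimiter held in a variable */
list = "red;green;blue"
delim = ";"
count = 0
DO WHILE LENGTH(list) > 0
   /* Break off the next piece and keep the remainder in list */
   PARSE VAR list piece (delim) list
   count = count + 1
   part.count = piece
   SAY "part."count "= '"piece"'"
END
```

When the delimiter is no longer found, piece receives whatever is left and list becomes an empty string, which ends the loop after three pieces.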

Case-insensitive search strings

Consider the following:

/* Break apart My_Variable at the search string "and" */
My_Variable = "Jack And Jill"
PARSE VAR My_Variable firstName 'and' secondName
SAY "firstName = '" || firstName || "'"
SAY "secondName = '" || secondName || "'"
You'll notice that firstName is "Jack And Jill", and secondName is an empty string. Why? Because normally, REXX considers search strings to be case-sensitive. Above, I specified the search string "and", but in My_Variable I used the word "And" (i.e., it begins with an upper-case A). To REXX, "And" is not the same as the search string "and". Therefore, REXX finds no match, and all of the parsed text ends up in the variable firstName.

There are two approaches to solving this dilemma. You can use either the UPPER keyword or the CASELESS keyword.

UPPER causes all of the text to be upper-cased before it is parsed. So, if you make all of your search strings upper-case, then you'll get a match, like so:

/* Break apart My_Variable at the search string "AND". Use UPPER keyword */
My_Variable = "Jack And Jill"
PARSE UPPER VAR My_Variable firstName 'AND' secondName
SAY "firstName = '" || firstName || "'"
SAY "secondName = '" || secondName || "'"
Now, firstName will be "JACK " (with a trailing space) and secondName will be " JILL" (with a leading space). One caveat is that you lose the original case (i.e., all the tokens end up upper-cased). This may or may not be desirable.

If you wish to retain the original case, an alternative is to use the CASELESS keyword. It leaves the parsed text untouched, but treats your search strings as case-insensitive. In other words, a search string of "and" will match "And", "AND", or any other mix of upper and lower case.

/* Break apart My_Variable at the search string "AND". Use CASELESS keyword */
My_Variable = "Jack And Jill"
PARSE CASELESS VAR My_Variable firstName 'AND' secondName
SAY "firstName = '" || firstName || "'"
SAY "secondName = '" || secondName || "'"
Now, firstName will be "Jack " (with a trailing space) and secondName will be " Jill" (with a leading space).