advertisement

The transforms in this category manipulate text values.

Transform Description
append Returns the input value with append's parameter concatenated as a suffix/td>
extract Extracts a substring from the value using a Regular Expression pattern to identify the substring
extractIndex Extracts a substring from the value using a Regular Expression pattern to identify the substring. The pattern may match the value multiple times, and if so, you may specify an index number to indicate which matching value you want to extract
numberToWords Converts the field value from digits ("17") to words ("seventeen")
padLeft Pads the current string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied from the left (start) of the current string
padRight Pads the current string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied to the right (end) of the current string
plural Returns one of three text values based on the numeric value of the field value
prepend Returns the input value with prepend's parameter concatenated as a prefix/td>
replace Replaces text that matches either a Regular Expression pattern or a literal value with new text
split Splits the field value into parts using the separator to divide the parts and returns the part indicated by the index number
splitCount Splits the field value into parts using the separator to divide the parts and returns the number of parts
substring Extracts a subset of characters from a value
urlParameter Returns the input value encoded for use in a URL

append:text

Returns the input value with text concatenated as a suffix.

For example, if a field named Place has the value "Boston", the result of [Place:append:, Massachusetts] is "Boston, Massachusetts".

extract:pattern:flags

Extracts a substring from the value using a Regular Expression to identify the substring. The pattern is specified as parameter one, and the flags for the Regular Expression are specified as an optional parameter two. If the flags are not provided, they default to "gi".

You must include one capturing group in the Regular Expression to indicate which text to extract.

For example, if the "Name" field has the value "John 'Jack' Smith", the result of [Name:extract:'([^']+)'] is "Jack".

Regular Expression details:

  • The initial ' (single quote) matches a single quote.
  • The ( starts the capture group. The characters that match the expression within the parentheses will be extracted.
  • The [^']+ expression matches one or more characters that are not single-quotes.
  • The ) ends the capture group.
  • The final ' (single quote) matches a single quote.

There are many resources online for learning about Regular Expressions including RegexOne. There are also many online Regular Expression testers which will help you understand Regular Expressions. I often use regular expressions 101 to test Regular Expressions. On that site, make sure you set the "Flavor" to "ECMAScript (JavaScript)".

extractIndex:pattern:index:flags

Specifying an index number parameter is the main reason to use extractIndex rather than extract.

Extracts a substring from the value using a Regular Expression to identify the substring. The pattern may match the value multiple times, and if so, you may specify an index number to indicate which matching value you want to extract.

The pattern is specified as parameter one, the index is an optional parameter two, and the flags for the Regular Expression are specified as an optional parameter three. If the flags are not provided, they default to "gi". The "g" (global) flag is required and if you do not specify it, ORA will add it.

You must include one capturing group in the Regular Expression to indicate which text to extract.

If index is a positive number, it indicates the index number of the desired match where the first match is 1, the second match is 2, etc.

if index is a negative number, it indicates the index number of the desired match where the last match is -1, the second-to-last is -2, etc.

Examples

  • If a field named Place has the value "Hessen (Hesse), Deutschland (Germany)", the result of [Place:extractIndex:\(([^\)]+)\):1] is "Hesse".
  • If a field named Place has the value "Hessen (Hesse), Deutschland (Germany)", the result of [Place:extractIndex:\(([^\)]+)\):2] is "Germany".
  • If a field named Place has the value "Hessen (Hesse), Deutschland (Germany)", the result of [Place:extractIndex:\(([^\)]+)\):-1] is "Germany".
  • If a field named Place has the value "Hessen (Hesse), Deutschland (Germany)", the result of [Place:extractIndex:\(([^\)]+)\):-2] is "Hesse".

The examples above all use the same Regular Expression pattern, \(([^)]+)\), explained as follows:

  • \( matches a left-parenthesis. The left-parenthesis must be escaped with a back-slash because a left-parenthesis is a reserved character in Regular Expressions.
  • ( starts the capture group. The characters that match the expression within the parentheses will be extracted.
  • [^\)]+ is a character class expression that matches one or more characters that are not a right-parenthesis. The back-slash escapes the special meaning of the right-parenthesis within the character class expression.
  • ) ends the capture group.
  • \) matches a right-parenthesis. The right-parenthesis must be escaped with a back-slash because a right-parenthesis is a reserved character in Regular Expressions.

The Regular Expression above is useful for extracting text that is enclosed in parentheses.

There are many resources online for learning about Regular Expressions including RegexOne. There are also many online Regular Expression testers which will help you understand Regular Expressions. I often use regular expressions 101 to test Regular Expressions. On that site, make sure you set the "Flavor" to "ECMAScript (JavaScript)".

numberToWords

For numeric values less than 999,999, transforms the value from digits to words. For numeric values greater than 999,999, leaves the value as-is. For non-numeric values, returns an empty string.

For example, if a field named Age has the value "7", the result of [Age:numberToWords] is "seven".

padLeft:length:padString

Pads the current string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied from the left (start) of the current string.

The first parameter is the desired length. It is required. If the value is already equal to or longer than this length, the value is not changed.

The second parameter is the string to add to the left if needed. It is optional and defaults to a single space.

For example, if a field named Enumeration District has the value "12A", the result of [Enumeration District:padLeft:4:0] is "012A".

padRight:length:padString

Pads the current string with a given string (repeated, if needed) so that the resulting string reaches a given length. The padding is applied to the right (end) of the current string.

The first parameter is the desired length. It is required. If the value is already equal to or longer than this length, the value is not changed.

The second parameter is the string to add to the right if needed. It is optional and defaults to a single space.

For example, if a field named Sheet has the value "B", the result of [Sheet:padRight:4:*] is "B***".

plural:one:many:zero

Returns one of three text values based on the numeric value of the given variable. This transform expects up to three options to specify the text values. The first is for a value of 1 or -1, the second is for a value that is not -1, zero, or 1, and the last is for zero. The general form is:

[field-name:plural:one:many:zero]

For example, if a field named Children has the value "2", the result of [Children:plural:one:more than one:zero] is "more than one".

To add an "s" to a word, you could use [Children:plural::s:s].

Plural returns a true result if the input value is a number and there is at least one parameter. If one or more of the three text values are empty, the transform will not append any text to the result, but the result will still be true. This makes it more convenient to use the transform inside conditional expressions.

prepend:text

Returns the input value with text concatenated as a prefix.

For example, if a field named Place has the value "Boston", the result of [Place:prepend:City=] is "City=Boston".

replace:pattern:newText:flags
replace:oldText:newText:options

Replace has two modes of operation, Regular Expression mode and Literal mode. The default is Regular Expression mode. If you include the option character l (ell), replace operates in Literal mode.

  • In Regular Expression mode, replace uses a Regular Expression to identify a substring in the current value and replaces the substring with new text. The pattern is specified as parameter one, newText is optional parameter two, and the flags for the Regular Expression are specified as an optional parameter three. If newText is not provided, it defaults to the empty string. If the flags are not provided, they default to "gi".
  • In Literal mode, replace changes oldText to newText in the current value. The oldText is specified as parameter one, newText is parameter two, and the replacement options are specified as parameter three. You may remove oldText by specifying an empty string for newText. See the Literal Mode Options section below for details.

Regular Expression Mode Examples

  1. Change underscore to comma and space:
    [Occupation:replace:_:, ]

    If a field named Occupation has the value "farmer_blacksmith", the result is "farmer, blacksmith"".

  2. Remove leading zero(es) from a number:
    [Children:replace:^0+]

    If a field named Children has the value "01", the result is "1".

You may use the following special text sequences in the new text:

Pattern Inserts
$$ Inserts a "$".
$n Where n is a positive integer less than 100, inserts the nth captured group.

There are many resources online for learning about Regular Expressions including RegexOne. There are also many online Regular Expression testers which will help you understand Regular Expressions. I often use regular expressions 101 to test Regular Expressions. On that site, make sure you set the "Flavor" to "ECMAScript (JavaScript)".

Literal Mode Examples

  1. Change the "quoted exactly" notation from parentheses to brackets:
    [Occupation:replace:(sic):[sic]:l]

    If a field named Occupation has the value "farmr(sic)", the result is "farmr[sic]".

  2. Change ", USA" to the empty string:
    [Occupation:replace:, USA::l]

    If a field named Location has the value "Augusta, Maine, USA", the result is "Augusta, Maine"".

Literal Mode Options

These are the Literal mode options.

  • You must always specify the l option to enable Literal mode.
  • If you specify "l" only, ORA treats that the same as "lgi" and performs a global, case-insensitive replace operation.
  • If you specify "lg", ORA will replace all occurrences of oldText that match the uppercase and lowercase characters in oldText exactly. The case-insensitive option is disabled.
  • If you specify "li", ORA will replace only the first occurrence of the oldText using a case-insensitive replace operation. The global option is disabled.
  • If you specify "l " (an ell and a space), ORA performs a non-global, non-case-sensitive replace operation. Specifying "l " is how you disable both the global and case-sensitive options.

split:separator:index

Splits the field value into parts using the separator to divide the parts and returns the part indicated by the index number.

If index is a positive number, it indicates the index number of the desired part where the first part is 1, the second part is 2, etc.

if index is a negative number, it indicates the index number of the desired part where the last part is -1, the second-to-last is -2, etc.

Examples

  1. Return the first part from a field value where the parts are separated by a comma:
    [Place:split:,:1]

    If a field named Place has the value "Boston, Suffolk Country, Massachusetts", the result is "Boston".

  2. Return the last part from a field value where the parts are separated by a comma:
    [Place:split:,:-1]

    If a field named Place has the value "Boston, Suffolk Country, Massachusetts", the result is "Massachusetts".

splitCount:separator

Splits the field value into parts using the separator to divide the parts and returns the number of parts.

Example

Return the number of parts in a field value where the parts are separated by a comma:

[Place:splitCount:,]

If a field named Place has the value "Boston, Suffolk Country, Massachusetts", the result is "3".

substring:indexStart:indexEnd

Extracts a subset of characters from a value.

There are several acceptable forms of the parameters. The first parameter, indexStart, is required. The second parameter, indexEnd, is optional. If indexEnd is not specified, it defaults to the end of the value.

Parameter Variations

  • If indexStart is a positive number, it indicates the one-origin index of the first character of the desired subset.

    For example, if a field named Test has the value "123456789", the result of [Test:substring:1:1] is "1".

  • If indexStart is a negative number, it indicates the one-origin index of the first character of the desired subset relative to the end of the value.

    For example, if a field named Test has the value "123456789", the result of [Test:substring:-1] is "9".

  • If indexEnd is a positive number, it indicates the one-origin index of the last character of the desired subset.

    For example, if a field named Test has the value "123456789", the result of [Test:substring:2:3] is "23".

  • If indexEnd is a negative number, it indicates the one-origin index of the last character of the desired subset relative to the end of the value.

    For example, if a field named Test has the value "123456789", the result of [Test:substring:-3, -2] is "78".

  • If indexEnd starts with a plus sign (+), the number that follows indicates the length of the desired subset starting at indexStart.

    For example, if a field named Test has the value "123456789", the result of [Test:substring:3:+2] is "34".

urlParameter

Returns the input value encoded for use in a URL. Use urlParameter to ensure that characters in a field value are encoded properly for use in a URL. Do not do encoding with the replace transform; that is tedious and error-prone.

For example, if a field named Place has the value "Boston, Massachusetts", the result of [Place:urlParameter] is "Boston%2C%20Massachusetts". The comma and space characters in the original value are coverted to %2C and %20, respectively, because the comma and space characters are reserved in URLs and must be encoded.

Web servers may have additional rules for the values passed as parameters and so it may be necessary to use other transforms to massage the value before including it in a URL. It's best to massage the value to meet the rules of the web server and then pass the resulting value to the urlParameter transform to ensure no reserved characters are being used.

Regular Expression Flags

The transforms that accept a Regular Expression as a parameter include an optional parameter to specify flags for the Regular Expression processor.

Flag Description
g Global search; if the pattern matches the input value more than once, all matches will be processed by transforms that support multiple matches.
i Case-insensitive search; a lower-case letter in the value matches an upper-case letter in the pattern, and vice-versa.

The default is "gi". If you omit the parameter, ORA will pass "gi" to the Regular Expression processor.

If you want to disable the default "gi" flags, specify one or more spaces as the parameter value.

There are many resources online for learning about Regular Expressions including RegexOne. There are also many online Regular Expression testers which will help you understand Regular Expressions. I often use regular expressions 101 to test Regular Expressions. On that site, make sure you set the "Flavor" to "ECMAScript (JavaScript)".

On This Page