Skip to content

Click in-app to access the full platform documentation for your version of DataRobot.

Text computed column functions

This section provides syntax and examples for the text computed column functions you can use with the Data Prep Compute tool.

CHAR

Returns the character for the specified ASCII value.

Syntax

CHAR(INT)

INT is the ASCII value for the character to return.

Example

CHAR(ASCII)

CONCATENATE

To concatenate a series of text strings into a single text string, you can use the + operator. This example combines two columns with a hyphen between them:

@Last@ +"-" + @First@

Alternatively, you can use the CONCATENATE function:

CONCATENATE(@Last@ , "-" , @First@ )

Syntax

CONCATENATE(STRING_1, [STRING_2, ...])

  • STRING_1 is the first value.
  • STRING_2, ... [optional] are the additional strings.

Example

CONCATENATE(@Applicant Last@ ,", ",@Applicant First@ , " of ",@City@)

Notes on use

The STRING you provide can be a text string or numeric value, a column that contains a text string or numeric value, or a function that returns a text string or numeric value.

FIND

Determines if one word (or string of text) can be found in a second piece of text. If found, the FIND function returns the numeric position of the text string within the second text string. The characters in the second string are counted and the number indicates the character where the first piece of text begins its overlap in the second string of text.

This function allows for an optional third argument, a number. This number indicates the position (by number of characters) where you want the search to begin in the second string. If the third argument is omitted, then the second string is searched beginning at the first character.

If the first string is not found in the second string, then the function returns 0.

Syntax

FIND(STRING_1, STRING_2, [VALUE])

  • STRING_1 is the string you want to find.
  • STRING_2 is the string you want to search in.
  • VALUE, optional, is the numeric position in STRING_2 you want to start your search.

Examples

FIND("Tech",@School@)

FIND("the", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.")returns a value of 31.

FIND("dog", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.") returns a value of 46, which corresponds to the first occurrence of “dog” in the second string.

FIND("dog", "The quick sly fox jumped over the lazy brown dog laying next to the other dog.", 47) returns a value of 75 because the third argument value of 47 pushes the start of the search past character 46 (where the first “dog” occurs) and forces the function to find the second “dog” in the string.

Notes on use

The STRING you provide must be a text string, a column that contains a text string, or a function that returns a text string. Likewise, the VALUE must be a numeric value, a column that contains a numeric value, or a function that returns a numeric value.

If STRING_1 occurs multiple times in STRING_2, FIND only indicates the position of the first match—not successive matches in the pair.

The FIND function is case sensitive, so it treats True, TRUE, and true separately.

This function matches pieces of text—not just words. Therefore, the text “jump” will be determined to be in the string “jumped” at position 1.

Text characters, not just words, can be used as search strings and discovered in the second string.

HASHVALUE

Transforms a text string to make fuzzy matching easier.

Syntax

HASHVALUE(STRING, OPTION, [VALUE])

  • STRING is the string you want to transform.
  • OPTION is the algorithm to use for the transformation. Available options are: METAPHONE
  • NGRAM FINGERPRINT
  • VALUE, used with NGRAM, specifies the number of ngrams to use.

Example

HASHVALUE(@Current Employer@, "metaphone")

Notes on use

The STRING you provide must be a text string, a column that contains text strings, or a function that returns a text string. Both OPTION and VALUE are treated as strings and must be surrounded by quotation marks, i.e. "metaphone."

HASHVALUE uses algorithms to generate hashes based on provided string values. The algorithms used are also used by the Cluster + Edit column operation to find close matches between values within a column. See Cluster + Edit for more information on METAPHONE, NGRAM, and FINGERPRINT.

LEFT

Returns a given number of characters starting from the left-most (beginning) position of a text string.

Syntax

LEFT(STRING, VALUE)

  • STRING is the string you want to search.
  • VALUE is how many characters to return. The default is 1.

Example

LEFT(@School@,4)

Notes on use

The STRING you provide must be a text string, a column that contains text strings, or a function that returns a text string.

LEN

Counts the number of characters in a text string.

Syntax

LEN(STRING)

STRING is the text string you want to evaluate.

Example

LEN(@School@)

Notes on use

The STRING you provide must be a text string, a column that contains text strings, or a function that returns a text string.

LOWER

Converts text in column to all lowercase.

Syntax

LOWER(STRING, LOCALE)

  • STRING is the string or column that you want to convert to lowercase.
  • LOCALE (optional) is the locale, which may need to be specified in order to output required characters for the lowercase.

Refer to https://www.oracle.com/java/technologies/javase/jdk8-jre8-suported-locales.html for the supported locale values.

Example

LOWER(@Values@, "tr")

MID

Returns a given number of characters from the middle of a text sting.

Syntax

MID(STRING, VALUE_1, VALUE_2)

  • STRING is the text string you want to evaluate.
  • VALUE_1 is the start position.
  • VALUE_2 is the number of characters to return.

Example:

MID(@School@,4, 5)

Notes on use

The STRING you provide must be a text string, a column that contains text strings, or a function that returns a text string. The Values provided must be a numeric value, a column that contains numeric values, or a function that returns a numeric value.

PADLEFT

Pads a string with a specified character, for the specified number of times. This provides same output as MySQL LPAD.

Syntax

PADLEFT(STRING, NUMBER, VALUE)

  • STRING or column is the value to pad.
  • NUMBER is the number of times to replace with the VALUE.
  • VALUE is the literal replacement value.

Example

PADLEFT(@set@, 10, "-")

PADRIGHT

Pads a string with a specified character, for the specified number of times. This provides same output as MySQL LPAD and RPAD.

Syntax

PADRIGHT(STRING, NUMBER, VALUE)

  • STRING or column is the value to pad.
  • NUMBER is the number of times to replace with the VALUE.
  • VALUE is the literal replacement value.

Example

PADRIGHT(@set@, 10, "-")

REGEXP

Executes a search and replace on a text string using regular expressions. This function is based on Java Regex.

Tip

To determine if a string of text appears in another piece of text without conducting any replacements, see FIND. The FIND function has the advantage of a slightly more usable syntax—however, the trade-off is that FIND is slightly less powerful in pattern matching.

Syntax

REGEXP(STRING_1, STRING_2, STRING_3)

  • STRING_1 is the text string you want to search.
  • STRING_2 is the text you are searching for.
  • STRING_3 is the text you want to replace STRING_2.

The three arguments are required. STRING_1 must be a text string, a column that contains text strings, or a function that returns a text string. STRING_2 and STRING_3 are composed of character combinations that define the search and replace activity.

Note

In regular expressions, there are 12 characters with special meanings:
\ ^ $ . | ? * + ( ) [ and open curly brace.
If you want to search for these actual characters and not their special meanings, add a double backslash (not a single backslash) before it. For example, to search for asterisk characters with a regular expression, type "\" not "". To search for a backslash character with a regular expression, type four backslash characters.

Examples

Convert a space character to an underscore

REGEXP(@School@," ", "_")

Replace a text string with another textstring

REGEXP(@ProductID@ , "ABC", "DEF")

Convert a slash to a hyphen

REGEXP("The/quick/sly/fox." , "/", "-") returns The-quick-sly-fox.

Convert a backslash (special character) to a hyphen

REGEXP(@ProductID@ , "\\\\", "-")

To convert an asterisk (special character) to a hyphen

REGEXP(@ProductID@ , "\\*", "-")

Delete characters from Column 1 that are not numbers

REGEXP(_column1_ ,"[^0-9]", "")

Examples of Extract and Replace patterns

Command Returns
RegexpExtract("replace me", "e m") "e m"
RegexpExtract("replace me", "e.?m") "e m"
RegexpExtract("replace me", "r.*c") "replac"
RegexpExtract("123123456789", "(123)+456(.*)") "123123456789"
RegexpExtract("123123456789", "(123)+456(.*)", 0) "123123456789"
RegexpExtract("123123456789", "(123)+456(.*)", 1) "123"
RegexpExtract("123123456789", "(123)+456(.*)", 2) "789"
RegexpExtract("456789", "(123)*456(.*)", 2) "789"
RegexpReplace("replace me", "e m", "---") "replac---e"
RegexpReplace("replace me", "e.?m", "---") "replac---e"
RegexpReplace("replace me", "r.*c", "--") "--e me"
RegexpReplace("123123456789", "(123)+456(.*)", "---") "---"
RegexpReplace("123123456789", "abc", "---") "123123456789"

Notes on use

For more guidance on Regex pattern matching, refer to

https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

REPEAT

Repeats a specified string N number of times.

Syntax

REPEAT(VALUE,REPEAT)

  • VALUE is the string or column to locate and repeat.
  • REPEAT is the number of times to repeat the VALUE.

Example

REPEAT(@set4@, 3)

REPLACE

Replaces part of a text string, based on the number of characters you specify, with a different text string.

Syntax

REPLACE(VALUE, START NUM,NUM CHARS, NEW VALUE)

  • VALUE is the text or the column in which you want to replace characters.
  • START NUM is the start position of the character in the VALUE that you want to replace.
  • NUM CHARS is the number of characters in the text that you want to replace with the new string.
  • NEW VALUE is the replacement value. Note this is case sensitive.

Example

REPLACE(@timestamp@,10,5," ")

Notes on use

Use REPLACE when you want to replace any text that occurs in a specific location in a text string; use SUBSTITUTE when you want to replace specific text in a text string. For example: REPLACE(@Hospital Name@, Search(@Hospital Name@,"ADVOCATE"), 8, "ALPHA")

REVERSE

Reverses the specified string.

Syntax

REVERSE(STRING)

STRING is the column's value or string to reverse.

Example

REVERSE(@set4@)

Returns a give number of characters starting from the right-most (end) position of a text string.

Syntax

RIGHT(STRING, VALUE)

  • STRING is the string to search.
  • VALUE is how many characters to return. The default is 1.

Example

RIGHT(@School@,4)

Notes on use

The STRING you provide must be a text string, a column that contains text strings, or a function that returns a text string.

Searches for a specified string and returns the index of the string. If not found, returns value of -1.

Syntax

SEARCH(VALUE, STRING)

  • VALUE is the text or the column in which you want to substitute characters
  • STRING is the string to search.

Example

SEARCH(@Hospital Name@, "ADVENTIST")

Notes on use

SEARCH can be combined with REPLACE.

Example

REPLACE(@Hospital Name@, Search(@Hospital Name@,"ADVOCATE"), 8, "ALPHA")

STR

Converts the data in the argument into a text string.

Syntax

STR(VALUE)

VALUE is the value you want to convert to a text string.

Example

STR(@Date@)

Notes on use

The Value you provide must be a numeric value, a column that contains a numeric value, or a function that returns a numeric value.

The STR function is useful for converting a numeric value into text or for ensuring that a column of mixed text and number values is treated entirely as a column of text so that other text functions can successfully be executed against it.

SUBSTITUTE

Substitutes new text for old text in a text string.

Syntax

SUBSTITUTE(VALUE, OLD TEXT, NEW TEXT)

  • VALUE is the text or the column in which you want to substitute characters.
  • OLD TEXT is the text you want to replace. Note this is case sensitive.
  • NEW TEXT is the text you want to use to replace OLD TEXT. Note this is case- sensitive.

Example

SUBSTITUTE(@Hospital Name@ ,"CREIGHTON","Merton")

Notes on use

Use SUBSTITUTE when you want to replace specific text in a text string; use REPLACE when you want to replace any text that occurs in a specific location in a text string.

TRIM

Removes all leading and trailing spaces for the specified string.

Note

The TRIM function was designed to trim the 7-bit ASCII space character (value 32) from text. In the Unicode character set, there is an additional space character called the nonbreaking space character that has a decimal value of 160. This character is commonly used in Web pages as the HTML entity,  . By itself, the TRIM function does not remove this nonbreaking space character.

Syntax

TRIM(STRING)

STRING is the value you want to you want trimmed.

The column can be specified as the STRING value as in the following example.

Example

TRIM(@Company@)

TRIMLEFT

Returns the string stripped of whitespace from the left end of the string.

Syntax

TRIMLEFT(STRING)

STRING is the column's values you want to you want trimmed.

Example

TRIMLEFT(@Company@)

TRIMRIGHT

Returns the string stripped of whitespace from the right end of the string.

Syntax

TRIMRIGHT(STRING)

STRING is the column's values you want to you want trimmed.

Example

TRIMRIGHT(@Company@)

UPPER

Converts text in column to all uppercase.

Syntax

UPPER(STRING,LOCALE)

  • STRING is the string or column that you want to convert to uppercase.
  • LOCALE (optional) is the locale, which may need to be specified in order to output required characters for the uppercase.

Refer to https://www.oracle.com/java/technologies/javase/jdk8-jre8-suported-locales.html for the supported locale values.

Example

UPPER(@Values@, "tr")

VALUE

Converts numbers stored as a string value into a numeric value.

Syntax

VALUE(STRING)

STRING is the numbers, stored as a text string, you want to convert to a numeric value.

Example

VALUE(@COLUMN@)

Notes on use

The STRING you provide must be a number stored as text string, a column that contains a number stored as text string, or a function that returns a number stored as text string.

If STRING contains characters other than numbers, the functions returns an error. A single period (decimal point) is allowed within the argument in order to create a real number.

The VALUE function is useful for converting a text value into numbers to ensure that a column of number values is treated as a column of numbers so that number based functions can successfully be executed against it.


Updated October 28, 2021
Back to top