Expression: String Functions

These functions allow string manipulation.

SubStr

Returns a portion of a string:

 

SubStr( String, Start, Size)

 

Start is a zero based offset.

 

if variable Surname = "Pants" in current record

SubStr([Surname],1,3) returns "ant"

SubStr([Surname],0,3) returns "Pan"

 

AddStr

Concatenates strings together:

 

AddStr([Text],[Text],[Text],…)

AddStr("Dear ",[Title]," ",[Initial]," ",[Surname])

 

AddStr([Title],” “,[Initial],” “,[Surname]) might return “Mr J Smith”.

 

FormatNumber

Formats a number to the given precision:

 

FormatNumber(n,ndp, [use separator])

 

Formats n, where ndp is the number of decimal places and 'use separator' denotes whether to display thousand separators with "T" or "F" as valid values.

Example - if [Cost] is 56.78

AddStr("your bill is £",FormatNumber([Cost],2)) is "Your bill is £56.78"

AddStr(“Your rounded bill is £”,FormatNumber(Round([Cost]),0) is “Your rounded bill is £56”

 

Example - FormatNumber([Cost]*1000,0,"T") to calculate and insert thousand separators:

 

StrCompare

Compares two strings:

 

StrCompare( string, string , case_sensitive)

 

Case_sensitive is an optional numeric parameter. If case_sensitive evaluates to 1 then the comparison is case sensitive.  If case_sensitive evaluates to 0 or missing_value or is not provided then the comparison is not case sensitive.   

 

StrCompare( string1, string2) compares two strings alphabetically and returns:

1 if [string1] >  [string2]

-1 if [string1] <  [string2]

0 if [string1] =  [string2]

 

StrLength

Gets length of a string:

 

StrLength( string)

 

StrLength ("Apteco") = 6

 

StrLower

Converts a string to lower case:

 

StrLower( string)

 

StrLower("Apteco") = "apteco"

 

StrUpper

Converts a string to upper case:

 

StrUpper( string)

 

StrUpper("Apteco") = "APTECO"

 

StrProper

 

Converts a string to proper case:

 

StrProper( string)

 

StrProper case capitalises the first letter after a space or punctuation character.  

 

 StrProper (“redfern house”) = “Redfern House”

 

StrContains

Finds substrings within a string:

 

StrContains( x, a, b, c, ...)

 

StrContains searches string x for substrings a, b, c, etc., and returns the index of the first match found, or zero if none are found.

 

StrContains(“Huddersfield Town”,”City”,”Town”) = 2

StrContains(“Smith”,”Smith”,”Jones”,”Walker”) = 1

StrContains(“Huddersfield Town”,”udders”,”cow”,”field”) = 1

StrContains(“Huddersfield Town”,”Udders”,”Cow”,”Field”) = 0

StrContains(“[email]”,”hotmail”,”yahoo”,”aol”,”ntlworld”) detects if the email field has any of the listed ISPs.

 

StrContains|

Case insensitive - finds substrings within a string:

 

StrContainsI (s, substring1, substring2, ...)

 

Searches string s for substrings substring1, substring2, etc. and returns the index of the first match found, or returns zero if no match is found. Each parameter is a string or a textlist.

 

StrFind

Locates a substring within a string:

 

StrFind( search_string, target_string)

 

Returns the zero based start offset of the substring or -1 if not found.

 

StrFind (“Huddersfield Town”,”Town”) = 13

StrFind (“Huddersfield Town”,”Hudd”) = 0

StrFind (“Huddersfield Town”,”Udders”) = -1

 

Note StrFind is case sensitive.  

 

StrFind|

Case insensitive. Locates a substring within a string:

 

StrFind|(Target, Search)

 

Returns the zero based start offset of the substring or -1 if not found.

 

StrReverse

Reverses a string:

 

StrReverse( string)

 

StrReverse("Apteco") = "ocetpA"

 

Left

Left part of a string:

 

Left( string, n)

 

Left(“Red Nose Day”,6) = “Red No”

 

Right

Right part of a string:

 

Right( string, n )

 

Right(“Red Nose Day”,3) = “Day”

 

Mid

Middle part of a string:

 

Mid( string, start, length n)

 

Extracts n characters from the string. The start position is 1-based.

 

Mid(“Red Nose Day”,5,4) = “Nose”

 

LeftTrim

Removes leading spaces from string:

 

LeftTrim(" text ")

 

LeftTrim(" Apteco ") = "Apteco "

 

Trim

Removes all spaces except for leaving a single space between words:

 

Trim(String)

 

Trim(" James Alty   Apteco") = "James Alty Apteco"

 

StrClean

Removes all characters from string 1 that are listed in string 2 (case sensitive):

 

StrClean(String 1, String 2)

 

StrClean("Apteco","e") = "Aptoc". StrClean([Postcode]," ") = "CV344AP"

 

StrReplace

 

Replaces all occurrences of A with B in string X (case sensitive):

 

StrReplace(string X, string A, string B)

 

StrReplace("Apteco","A","a") = "apteco"

 

StrBegins

Return index of the first match:

 

StrBegins (X, A, B, C, ...) If string X begins A or B or C or ...

 

Each of the search strings is a string or a textlist.

 

StrEnds

Returns index of the first match:

 

StrEnds (X, A, B, C, ...) If string X ends A or B or C or ...

 

Each of the search strings is a string or a textlist.

 

StrBegins|

Case insensitive. Return index of the first match:

 

StrBegins (X, A, B, C, ...) If string X begins A or B or C or ...

 

Each of the search strings is a string or a textlist.

 

StrEnds|

Case insensitive. Returns index of the first match:

 

StrEnds (X, A, B, C, ...) If string X ends A or B or C or ...

 

Each of the search strings is a string or a textlist.

 

StrListItem

Selects the Nth item from a delimited list as a string:

 

StrListItem (text,delimiters,N)

 

e.g. StrListItem("Cheese|Cake|Biscuit|Banana","|",3) -> "Biscuit"

 

StrDeleteRepeats

Removes duplicate characters from a string:

 

StrDeleteRepeats(text)

 

e.g. StrDeleteRepeats("AAAABBCCCABB") -> "ABCAB"

 

StrScore

Scores a string by a function with values provided in the file;

 

StrScore(text, text, text)

 

Valid function strings are 'sum', 'mean', 'min', 'max'. 'Sum|All' will count all occurrences of the substring.

 

Telephone

Converts a text entry into a number which is compatible with TPS 1.2 "Without Spaces" format.

See: https://corporate.tpsonline.org.uk/index.php/tps/technical

This UK specific option will also remove any 44 prefix from the number.

 

The expression format is as follows:

Telephone([text],"options")

where text = a text variable and options = TPS, which is the only option at present.

 

Example: Telephone([Telephone],"TPS")

 

StrShred

Converts a text entry into a delimited, alphabetical list of the constituent words.

The expression format is as follows:

StrShred(Text,MinSize,MaxSize,Case,Delimiter)

The min and max size will determine the number of letters in the words that are returned.

 

Example: StrShed([Address],4,20,"Proper",",")

 

StrShredContains

Converts a text entry into a delimited list, as per StrShred expression, and determine which words to search for. This will return a 1 for a match on the first word searched for, 2 on the second etc. and a 0 if nothing matches.

The expression format is as follows:

StrShredContains(Text,MinSize,MaxSize,Case,String1,String2...)

where String1, String2 etc. are the words you are searching for.

 

Example: StrShredContains([Address],4,20,"Proper","Close","Road")

 

StrStreak

Allows you to return the first characters of a string until there is a change in characters.

 

e.g. StrStreak(“abbccdde”) will return a

e.g StrStreak(“bbbaccdd”) will return bbb

 

StrHash

Allows you to take a string and apply one of a number of algorithms that will encrypt the string so that it is secure when passed over a network.

 

e.g. StrHash([Email Address],”SHA256”)

 

StrCount

Counts the number of times a target string appears in a search string. This action is case sensitive and works only for the non-overlapping occurrences.

 

e.g. StrCount("banana","an") will return a value of 2

 

FromAscii

Provides the string representation of the Ascii code specified.

 

e.g. FromAscii(65)) will return the string A

 

ToAscii

Provides the code of the Ascii value specified.

 

e.g. ToAscii("A") will return the code 65

 

Note: If the string value used is longer than 1 character, this function will return "Missing Value"

 

StrNumber

Converts a string value to its numeric (double) representation. If the input text does not represent a number, StrNumber exaluates to a missing value.

 

StrNumber(String)

 

StrNumber("123")= 123

 

VarValueText

Returns the text of the text variable at the given record number:

 

VarValueText("variablename","recordNumber")

 

e.g. VarValueText("peEmail",2)

 

Combining String functions with Missing Values functions and Logical functions can create powerful expressions. The following example creates a salutation, like 'Dear Mr J Smith'. If the surname is missing for the record, the salutation would be 'Dear Customer'.

AddStr("Dear ",If(IsMissing([Surname]),"Customer",AddStr(DescOf([Title])," ",[Surname])))

 

Added in Q2 2022

Regular Expression (Regex) support in String expressions

With support for Regular Expressions and five associated string functions, you can use the FastStats Expression builder to carry out advanced string matching.

With multiple possibilities for supported syntax, Regex functions will be slow compared to other FastStats string functions and should ONLY be used to achieve an effect that you cannot achieve using existing FastStats string selection/expression functions.

RegexMatch

Returns the index of the first regular expression to be matched in the search string:

 

RegexMatch(text, regex1, regex2, ...)

 

e.g. RegexMatch([Email Address],"@.*\.com$") returns 1 for .com domains, 0 for others.

 

Invalid regular expression strings will return missing values.

 

RegexSearch

Returns the first match string of the regular expression in the search string:

 

RegexSearch(text,regex)

 

e.g. RegexSearch([Email Address],"@.*$") returns the email domain in the email address.

 

Invalid regular expression strings will return missing values.

 

RegexReplace

Returns the string where all matches of the regular expression in the search string are replaced by the string in the replace string:

 

RegexReplace(text,regex,replacestring)

 

e.g. RegexReplace([Email Address],"@.*$","@REDACTED") finds the email address domain and replaces it with the word "REDACTED"

 

Invalid regular expression strings will return missing values.

 

RegexCountMatch

Looks in the target string for matches of the regular expression and returns the number of matches of the regular expression in the target string:

 

RegexCountMatch(Target,regex) - e.g. RegexCountMatch([Telephone],"\d") returns the number of digits in the telephone number

 

RegexSearchNth

Looks in the target string for matches of the regular expression and returns the Nth match of the regular expression in the target string:

 

RegexSearchNth(Target,nth,regex)

 

e.g. RegexSearchNth([Telephone],4,"\d") returns the 4th digit of the telephone number.