StringUtils Methods
[for version 1.3 as of 20.Aug.05]

A synopsis of the methods in Joe Strout’s StringUtils module for REALbasic. See also the StringUtils entry on Joe Strout’s REALbasic page.

Chop (s As String, charsToCut As Integer) as string
  Return s with the rightmost charsToCut characters removed.

ChopB (s As String, bytesToCut As Integer) as String
  Return s with the rightmost bytesToCut bytes removed.
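
The difference between Chop and ChopB only matters for text with multi-byte characters. The following Python sketch illustrates the idea; it is not the module's REALbasic code, and the function names and the clamping of oversized counts are assumptions made for the example.

```python
def chop(s: str, chars_to_cut: int) -> str:
    """Drop the rightmost characters (assumption: counts larger than
    the string length simply yield an empty string)."""
    return s[:max(0, len(s) - chars_to_cut)]


def chop_b(data: bytes, bytes_to_cut: int) -> bytes:
    """Drop the rightmost bytes, ignoring character boundaries."""
    return data[:max(0, len(data) - bytes_to_cut)]


text = "naïve"                 # 5 characters, 6 bytes in UTF-8
encoded = text.encode("utf-8")

by_chars = chop(text, 2)       # cuts "ve"
by_bytes = chop_b(encoded, 3)  # cuts "ve" plus half of the two-byte "ï"
```

Cutting by bytes can leave a partial character at the end, which is why the byte-oriented variant is best kept for binary data.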

Contains (extends s As String, what As String) as Boolean
  Return true if s contains the substring what. By "contains" we mean case-insensitive, encoding-savvy containment as with InStr.

ContainsB (extends s As String, what As String) as Boolean
  Return true if s contains the substring what. By "contains" we mean binary containment as with InStrB.

ControlCharacters () as String
  Return the control character region of the ASCII set, i.e., ASCII 0 through 31.

Count (source As String, substr As String) as Integer
  Return how many non-overlapping occurrences of substr there are in source.
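
"Non-overlapping" means each match is consumed before the search resumes. A minimal Python sketch of that rule follows; the function name is hypothetical, treating an empty substring as zero matches is an assumption, and Python's str.find is case-sensitive, which may differ from the module's comparison rules.

```python
def count_non_overlapping(source: str, substr: str) -> int:
    """Count occurrences of substr, scanning left to right."""
    if not substr:
        return 0  # assumption: an empty substring never matches
    count = 0
    pos = source.find(substr)
    while pos != -1:
        count += 1
        # Resume after the whole match so matches never share characters.
        pos = source.find(substr, pos + len(substr))
    return count
```

With this rule, "aa" occurs twice in "aaaa" (positions 1 and 3), not three times.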

CountB (source As String, substr As String) as Integer
  Return how many non-overlapping occurrences of substr there are in source, doing binary comparison.

CountFieldsQuoted (src as string, sep as string) as Integer
  Equivalent to RB's CountFields() function, but respects quoted values.
  Usage:
    s = """Hello, Kitty"", ""One"", ""Two, Three"""
    x = CountFieldsQuoted(s, ",")
  Result: x = 3
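
One way to respect quoted values is to track whether the scan is currently inside a pair of double quotes and to ignore separators while it is. The Python sketch below shows that approach under stated assumptions: the function name is hypothetical, only the double quote is treated as a quoting character, and an empty separator yields a single field. The module's actual rules may differ.

```python
def count_fields_quoted(src: str, sep: str) -> int:
    """Count separator-delimited fields, skipping separators in quotes."""
    if not sep:
        return 1  # assumption: no separator means one field
    fields = 1
    in_quotes = False
    i = 0
    while i < len(src):
        if src[i] == '"':
            in_quotes = not in_quotes  # entering or leaving a quoted run
            i += 1
        elif not in_quotes and src.startswith(sep, i):
            fields += 1
            i += len(sep)
        else:
            i += 1
    return fields
```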

CountRegEx (s As String, pattern As String) as Integer
  Count the number of occurrences of a RegEx pattern within a string.

DecimalSeparator () as String
  Return the decimal separator the user uses (either "." or ",").

EditDistance (s1 As String, s2 As String) as Integer
  Return the Levenshtein distance, aka the edit distance, between the two strings. That's the number of insertions, deletions, or changes required to make one string match the other.
A result of 0 means the strings are identical; higher values mean more different.
Note that this function is case-sensitive; if you want a case-insensitive measure, simply Uppercase or Lowercase both strings before calling.
Implementation adapted from http://www.merriampark.com/ld.htm, though we're using only a 1D array since the 2D array is wasteful.
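
The single-array idea can be sketched as follows. This is a Python illustration of the general technique, not a transcription of the module's source; the function and variable names are made up for the example. Only one row of the usual distance table is kept, plus one saved value for the diagonal neighbour.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Levenshtein distance using a single row of the distance table."""
    # row[j] = distance between the part of s1 seen so far and s2[:j]
    row = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        diagonal = row[0]  # table value for (i - 1, j - 1)
        row[0] = i         # deleting i characters reaches the empty prefix
        for j, c2 in enumerate(s2, start=1):
            above = row[j]  # table value for (i - 1, j), about to be replaced
            cost = 0 if c1 == c2 else 1
            row[j] = min(
                above + 1,        # deletion
                row[j - 1] + 1,   # insertion
                diagonal + cost,  # change (or match)
            )
            diagonal = above
    return row[-1]
```

As in the description, the comparison here is case-sensitive, so "A" and "a" are one edit apart.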

EndsWith (extends s As String, withWhat As String) as Boolean
  Return true if s ends with the string withWhat, doing a standard string comparison.

EndsWithB (extends s As String, withWhat As String) as Boolean
  Return true if s ends with the string withWhat, doing a binary comparison.

Hash (s As String) as Integer
  Return the hash value of the given string, as used by RB's Variant and Dictionary classes.

HexB (s As String) as String
  Return a hex representation of each byte of s, i.e., each byte becomes a pair of hexadecimal digits, separated by spaces from the next byte.
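
For illustration, the same output shape can be produced in Python as shown below. The function name is hypothetical, and the use of uppercase digits is an assumption, since the description does not say which case the module emits.

```python
def hex_bytes(data: bytes) -> str:
    """Render each byte as two hex digits, separated by spaces."""
    return " ".join(f"{b:02X}" for b in data)


ascii_dump = hex_bytes(b"Hi!")                # "48 69 21"
utf8_dump = hex_bytes("é".encode("utf-8"))    # "C3 A9"
```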

InStrReverse (startPos As Integer=-1, source As String, substr As String) as Integer
  Similar to InStr, but searches backwards from the given position (or if startPos = -1, then from the end of the string). If substr can't be found, returns 0.
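
A rough Python sketch of a backwards search with 1-based positions is given below. Several details are assumptions rather than documented behavior: the start position is read as the last position at which a match may begin, an empty substring returns 0, and the parameter order is rearranged to suit Python's default arguments.

```python
def instr_reverse(source: str, substr: str, start_pos: int = -1) -> int:
    """Return the 1-based position of the last match, or 0 if none."""
    if not substr:
        return 0  # assumption: an empty substring is never found
    if start_pos == -1:
        start_pos = len(source)
    # A match beginning at 1-based position p ends at p - 1 + len(substr),
    # so limiting the slice end limits where a match may begin.
    end = min(len(source), start_pos - 1 + len(substr))
    return source.rfind(substr, 0, end) + 1  # rfind gives -1 when absent
```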

InStrReverseB (startPosB As Integer=-1, source As String, substr As String) as Integer
  Similar to InStrB, but searches backwards from the given position (or if startPosB = -1, then from the end of the string). If substr can't be found, returns 0.

IsEmpty (extends s As String) as Boolean
  Return true if the string is empty.

LTrim (source As String, charsToTrim As String) as String
  This is an extended version of RB's LTrim function that lets you specify a set of characters to trim.

Metaphone (source As String, ByRef outPrimary As String, ByRef outAlternate As String)
  Compute the Double Metaphone of the source string. This is an algorithm that finds one or two approximate phonetic representations of a string, useful in searching for almost-matches -- e.g., looking for names whose spelling may have varied, or correcting typos made by the user, and so on. The output is roughly human-readable, with the following conventions:
    Vowels are omitted from the output, except for a vowel at the beginning of a word, which is represented by an A (e.g., "ox" becomes "AKS").
    X is used to represent a "ch" sound (e.g., "church" becomes "XRX").
    0 (zero) is used to represent a "th" sound (e.g., "think" becomes "0NK").
  For more information about Double Metaphone, see:
http://aspell.sourceforge.net/metaphone/
http://www.cuj.com/articles/2000/0006/0006d/0006d.htm?topic=articles [This link now redirects to another unrelated site. Thanks to Ann Sykes for reporting it.]
This implementation is based on the one at:
http://aspell.sourceforge.net/metaphone/dmetaph.cpp.

  MIsVowel (source As String, atPos As Integer) as Boolean
  This is a private helper function for the Metaphone method.

  MStringAt (source As String, start As Integer, length As Integer, paramArray args As String) as Boolean
  This is a private helper function for the Metaphone method.

NthFieldQuoted (src as string, sep as string, index as integer) as string
  Equivalent to RB's NthField() function, but respects quoted values.
  Usage:
    s = """Hello, Kitty"", ""One"", ""Two, Three"""
    s1 = NthFieldQuoted(s, ",", 3)
  Result: s1 = "Two, Three" (including the quotes!)

PadBoth (s as String, width as Integer, padding as String = " ") as string
  Pad a string to at least width characters, by adding padding characters to the left and right sides of the string. If the string cannot be centered exactly, it sits one character to the right of center.
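
The centering rule can be sketched in Python as below. This reads the description as meaning that an odd leftover padding character goes on the left, which is an interpretation and not confirmed against the module's source. The function name is hypothetical, and the sketch assumes padding is a single character.

```python
def pad_both(s: str, width: int, padding: str = " ") -> str:
    """Center s within width, padding both sides."""
    missing = width - len(s)
    if missing <= 0:
        return s  # already at least width characters long
    right = missing // 2
    left = missing - right  # assumption: the odd character goes left
    return padding * left + s + padding * right
```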

PadLeft (s as String, width as Integer, padding as String = " ") as string
  Pad a string to at least width characters, by adding padding characters to the left side of the string.

PadRight (s as String, width as Integer, padding as String = " ") as string
  Pad a string to at least width characters, by adding padding characters to the right side of the string.

Remove (s As String, charSet As String=" ") as string
  Delete all characters which are members of charSet. Example: Remove("wooow maaan", "aeiou") = "ww mn".
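
For comparison, the same filtering can be written in Python as a short sketch. The function name is hypothetical, and membership here is case-sensitive, which may not match the module.

```python
def remove_chars(s: str, char_set: str = " ") -> str:
    """Keep only the characters of s that are not in char_set."""
    return "".join(ch for ch in s if ch not in char_set)


stripped = remove_chars("wooow maaan", "aeiou")  # "ww mn"
```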

Repeat (s as String, repeatCount as Integer) as string
  Concatenate a string to itself repeatCount times. Example: Repeat("spam ", 5) = "spam spam spam spam spam ".

ReplaceRange (s As String, start As Integer, length As Integer, newText As String) as string
  Replace a part of the given string with a new string.
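
A minimal Python sketch of range replacement follows. It assumes that start is 1-based, as with RB's Mid function; the description does not state this, so treat it as an assumption. The function name is hypothetical.

```python
def replace_range(s: str, start: int, length: int, new_text: str) -> str:
    """Replace length characters of s, beginning at 1-based start."""
    begin = start - 1  # convert to Python's 0-based indexing
    return s[:begin] + new_text + s[begin + length:]


greeting = replace_range("Hello world", 7, 5, "there")  # "Hello there"
```

The replacement text need not have the same length as the range it replaces.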

ReplaceRangeB (s As String, startB As Integer, lengthB As Integer, newText As String) as string
  Replace a part of the given string with a new string (with offset and length in bytes rather than characters).

Reverse (s As String) as string
  Return s with the characters in reverse order.

ReverseB (s As String) as string
  Return s with the bytes in reverse order. Note that if s is text in any encoding that may have multi-byte characters, you should probably be using Reverse instead of ReverseB.

RTrim (source As String, charsToTrim As String) as string
  This is an extended version of RB's RTrim function that lets you specify a set of characters to trim.

Soundex (s As String, stripPrefix As Boolean = true) as string
  Return the Soundex code for the given string. That's the first character, followed by numeric codes for the first several consonants. For more detail, see: http://www.searchforancestors.com/soundex.html
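
For readers unfamiliar with the coding, here is a Python sketch of the common American Soundex rules. It is not the module's implementation: the stripPrefix option is not modeled, and details such as the handling of H and W between consonants are assumptions taken from the usual description of the algorithm.

```python
# Digit assigned to each consonant group; vowels, H, W and Y have no code.
CODES = {}
for digit, letters in enumerate(
    ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1
):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(name: str) -> str:
    """Return a four-character code: first letter plus three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":
            # H and W do not separate two consonants with the same code;
            # vowels do, because they reset prev to the empty string.
            prev = code
    return (result + "000")[:4]  # pad with zeros or truncate
```

Names that sound alike tend to share a code: both "Robert" and "Rupert" come out as "R163".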

SplitByLength (s As String, fieldWidth As Integer) as String()
  Split a string into fields, each containing fieldWidth characters (except for the last one, which may have fewer).
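
Fixed-width splitting amounts to stepping through the string in strides of fieldWidth. A small Python sketch, with a hypothetical function name and an assumed error for non-positive widths:

```python
def split_by_length(s: str, field_width: int) -> list[str]:
    """Cut s into consecutive pieces of field_width characters."""
    if field_width <= 0:
        raise ValueError("field_width must be positive")  # assumption
    return [s[i:i + field_width] for i in range(0, len(s), field_width)]


pieces = split_by_length("abcdefgh", 3)  # ["abc", "def", "gh"]
```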

SplitByLengthB (s As String, fieldWidth As Integer) as String()
  Split a string into fields, each containing fieldWidth bytes (except for the last one, which may have fewer).

SplitByRegEx (source As String, delimPattern As String) as String()
  Split a string into fields delimited by a regular expression.
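
As an illustration of splitting on a pattern instead of a fixed delimiter, the Python equivalent uses re.split. Pattern syntax and edge cases in REALbasic's RegEx class may differ from Python's re module, so this only sketches the idea; the function name is hypothetical.

```python
import re


def split_by_regex(source: str, delim_pattern: str) -> list[str]:
    """Split source wherever delim_pattern matches.

    Note: in Python, capturing groups in the pattern are added to the
    result, so the sketch expects a pattern without capturing groups.
    """
    return re.split(delim_pattern, source)


fields = split_by_regex("a1b22c", r"\d+")      # ["a", "b", "c"]
words = split_by_regex("one, two;three", r"[,;]\s*")
```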

SplitToCDbl (source As String, delimiter As String=" ") as Double()
  Split a string into fields, then convert each field into a Double using the CDbl function. This is appropriate for a set of numbers entered or readable by the end-user.

SplitToInt (source As String, delimiter As String=" ") as Integer()
  Split a string into fields, then convert each field into an Integer using the Val function.

SplitToVal (source As String, delimiter As String=" ") as Double()
  Split a string into fields, then convert each field into a Double using the Val function. This is appropriate for a set of numbers used only by the computer; for human-readable numbers, consider using SplitToCDbl instead.

Sprintf (src as string, ParamArray data as Variant) as string
  Returns a string produced according to the formatting string src. The format string is composed of zero or more directives: ordinary characters (excluding %) that are copied directly to the result, and conversion specifications, each of which results in fetching its own parameter. For details, see http://de.php.net/manual/en/function.sprintf.php
Attention: This function differs from the PHP sprintf() function in that it formats floating numbers according to the locale settings. For example, in Germany, sprintf("%04.2f", 123.45) will return "0123,45".
Written by Frank Bitterlich, bitterlich@gsco.de. Additional work by Florent Pillet, florent@florentpillet.com.

SQLify (s As String) as string
  Return a version of s ready for use in an SQL statement.

Squeeze (s As String, charSet As String=" ") as string
  Find any repeating characters, where the character is a member of charSet, and replace the run with a single character. Example: Squeeze("wooow maaan", "aeiou") = "wow man".
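
The run-collapsing behavior can be sketched in Python as follows. The function name is hypothetical, and membership in charSet is case-sensitive here, which may differ from the module.

```python
def squeeze(s: str, char_set: str = " ") -> str:
    """Collapse runs of the same character when it is in char_set."""
    out = []
    for ch in s:
        if ch in char_set and out and out[-1] == ch:
            continue  # same squeezable character as the previous one
        out.append(ch)
    return "".join(out)


tidy = squeeze("wooow maaan", "aeiou")  # "wow man"
```

Only runs of one repeated character are collapsed; two different members of the set next to each other, as in "ae", are both kept.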

StartsWith (extends s As String, withWhat As String) as Boolean
  Return true if s starts with the string withWhat, doing a standard string comparison.

StartsWithB (extends s As String, withWhat As String) as Boolean
  Return true if s starts with the string withWhat, doing a binary comparison.

ThousandsSeparator () as String
  Return the thousands separator the user uses (either "." or ",").

Trim (source As String, charsToTrim As String) as String
  This is an extended version of RB's Trim function that lets you specify a set of characters to trim.

 


The StringUtils module is maintained by Joe Strout (joe at strout.net). You should be able to find the latest version via this URL: http://www.strout.net/info/coding/rb/

This summary was created by Dave Charlesworth from the inline comments in each method of the module. Please report any errors or omissions to me at maclists at additional.com. Thanks.