TheDeveloperBlog.com


Ruby Regexp Match Method

Regexp. String processing is hard. We must account for sequences and ranges of characters. Data is often imperfect. It has inconsistencies.


With Regexp, regular expressions, we use a text language to better handle this data. Ruby provides a matching operator, "=~", to make regular expressions easier to use.


First example. The match method applies a regular expression to a string parameter. If the regular expression does not fit, match returns nil.

Success: If the expression does match, match returns a MatchData object. From it we can find out more about the matched data.

Here: We iterate with "each" over three strings in a string array. We see whether the string matches the pattern.

And: If the match method returns a value, the operation succeeded. Otherwise, "m" is nil.

Based on:

Ruby 2

Ruby program that uses match

values = ["123", "abc", "456"]

# Iterate over each string.
values.each do |v|

    # Try to match this pattern.
    m = /\d\d\d/.match(v)

    # If match is not nil, display it.
    if m
        puts m
    end
end

Output

123
456

Pattern details

\d      A digit character 0-9.
\d\d\d  Three digit characters.
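MatchData. As a small sketch of what "finding out more" can look like, this example reads several details from the object that match returns. The sample string and the two capture groups are assumptions made for illustration; they are not part of the program above.

```ruby
# Apply a pattern with two capture groups to a sample string.
m = /(\d\d\d)-(\d\d)/.match("id: 123-45 end")

if m
    puts m[0]          # Entire matched text: 123-45
    puts m[1]          # First capture group: 123
    puts m[2]          # Second capture group: 45
    puts m.begin(0)    # Index where the match starts: 4
    puts m.pre_match   # Text before the match.
    puts m.post_match  # Text after the match.
end
```

Index 0 always refers to the entire match. Capture groups are numbered from 1, in the order of their opening parentheses.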

Operator. This performs a task similar to the match method. We place the regular expression on the left side, then use the matching operator. A string goes on the right side.

Return: If the operator returns nil, the match failed. But if the matching was successful, an integer is returned.

Tip: This integer is the index at which the match occurred. We often can use this value in an if-statement to indicate success.

Ruby program that uses match operator

# The string input.
input = "plutarch"

# If string matches this pattern, display something.
if /p.*/ =~ input
    puts "lives"
end

Output

lives

Pattern details

p     Matches lowercase letter p.
.*    Matches zero or more characters of any type.

Ignore case. A Regexp is case-sensitive by default. We can change this by adding the "i" flag after the closing slash. It stands for "ignore case," meaning the match is case-insensitive.

Here: Two strings both have a common pattern: a space is followed by a letter "A."

And: With the "i" flag specified on the regular expression in the match expression, both strings are matched, despite the case difference.

Ruby program that ignores case, Regexp

# A string array.
names = ["Marcus Aurelius", "sam allen"]

# Test each name.
names.each do |name|
    # Use case-insensitive regular expression.
    if /\ a/i =~ name
        puts name
    end
end

Output

Marcus Aurelius
sam allen

Pattern description

"\ "   Matches a space.
"a"    Matches the letter "a".
i      Specifies the expression is case-insensitive.

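Replace. With gsub we replace every part of a string that matches a pattern. It is a replace-all method that uses regular expressions. The related sub method replaces only the first match. Neither method changes the original string.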


Tip: Patterns are used throughout Ruby string methods. With gsub we use a string method with a regular expression.

Ruby program that uses gsub

value = "caaat"

# Replace multiple "a" letters with one.
result = value.gsub(/a+/, "a")

puts value
puts result

Output

caaat
cat
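Sub versus gsub. As a brief comparison, this sketch runs both methods with the same pattern. The two-word input string is an assumption chosen so that the difference is visible.

```ruby
value = "caaat baaat"

# sub replaces only the first match.
first_only = value.sub(/a+/, "a")

# gsub replaces every match.
all = value.gsub(/a+/, "a")

puts first_only   # cat baaat
puts all          # cat bat
```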

Split. The split method also accepts regular expressions. We use a Regexp to specify the delimiter pattern. The method returns an array of the parts that lie between the matches.
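Here is a minimal sketch of split with a Regexp delimiter. The input line and the delimiter pattern (a comma or semicolon with optional surrounding whitespace) are assumptions made for illustration.

```ruby
line = "one, two;three ,  four"

# Split on a comma or semicolon, ignoring whitespace around it.
parts = line.split(/\s*[,;]\s*/)

parts.each do |part|
    puts part
end
```

This yields four parts: one, two, three and four. A plain string delimiter could not handle the inconsistent spacing and mixed separators in a single call.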


Word count. With split, we can count words in a string. We simply split on non-word characters and return the length of the array.
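One possible sketch of this word count follows. The method name count_words is hypothetical, and the "\W+" pattern is an assumption about what counts as a separator. It treats an apostrophe as a separator, so a word like "don't" is counted as two.

```ruby
# Count words by splitting on runs of non-word characters.
def count_words(text)
    # Leading punctuation or spaces produce an empty first element, so drop empties.
    text.split(/\W+/).reject(&:empty?).length
end

puts count_words("To be, or not to be.")   # 6
puts count_words("  Hello, world!")        # 2
```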


Often, we require no regular expressions. Plain strings and string methods are enough. But when complexities and inconsistencies surface in the data, Regexp becomes the better approach.