With Regex, we use a text-processing language. This language easily handles string data.
Functions. With Match, we search strings. And with Replace, we change those we find. And often RegexOptions are used to change how these functions are evaluated.
Match. This program uses Regex. Please notice the System.Text.RegularExpressions namespace. The Regex pattern "\d+" matches one or more digit characters together.
Success: We test whether the match succeeded. If it did, we print its value with Console.WriteLine: the string "77".
Based on: .NET 4.5

VB.NET program that uses Regex

Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        Dim regex As Regex = New Regex("\d+")
        Dim match As Match = regex.Match("Dot 77 Perls")
        If match.Success Then
            Console.WriteLine(match.Value)
        End If
    End Sub
End Module

Output

77
IgnoreCase. Next, we use different syntax, and an option, for Match. We call the Regex.Match shared Function—no Regex object is needed. We then specify an option, RegexOptions.IgnoreCase.
IgnoreCase: This enum value, a constant, specifies that lower and uppercase letters are equal.
VB.NET program that uses RegexOptions.IgnoreCase

Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        ' Match ignoring case of letters.
        Dim match As Match = Regex.Match("I like that cat", "C.T", RegexOptions.IgnoreCase)
        If match.Success Then
            ' Write value.
            Console.WriteLine(match.Value)
        End If
    End Sub
End Module

Output

cat
Groups. This example uses Match and Groups. We specify the case of letters is unimportant with RegexOptions.IgnoreCase. And finally we test for Success on the Match object received.
Info: When we execute this program, we see the target text was successfully extracted from the input.
Groups index: We use the value 1 to get the first captured group from the Match. Index 0 holds the entire matched text, so parenthesized groups are numbered starting at 1.
VB.NET program that uses Regex.Match

Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        ' The input string.
        Dim value As String = "/content/alternate-1.aspx"

        ' Invoke the Match method.
        Dim m As Match = Regex.Match(value, _
                                     "content/([A-Za-z0-9\-]+)\.aspx$", _
                                     RegexOptions.IgnoreCase)

        ' If successful, write the group.
        If (m.Success) Then
            Dim key As String = m.Groups(1).Value
            Console.WriteLine(key)
        End If
    End Sub
End Module

Output

alternate-1
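To make the group numbering concrete, here is a minimal sketch that reads both index 0 and index 1 from the same Match. It reuses the pattern from the program above. The GetGroup helper is a hypothetical name introduced only for this illustration.

```vbnet
Imports System.Text.RegularExpressions

Module Module1
    ' Hypothetical helper: returns the group at the given index,
    ' or Nothing when the pattern does not match.
    Function GetGroup(ByVal input As String, ByVal index As Integer) As String
        Dim m As Match = Regex.Match(input, _
                                     "content/([A-Za-z0-9\-]+)\.aspx$", _
                                     RegexOptions.IgnoreCase)
        If m.Success Then
            Return m.Groups(index).Value
        End If
        Return Nothing
    End Function

    Sub Main()
        Dim value As String = "/content/alternate-1.aspx"
        ' Group 0 is the entire matched text.
        Console.WriteLine(GetGroup(value, 0))
        ' Group 1 is the first parenthesized capture.
        Console.WriteLine(GetGroup(value, 1))
    End Sub
End Module
```

With this input, the first line written should be content/alternate-1.aspx and the second should be alternate-1.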
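Shared. A Regex object requires time to be created. We can instead create one Regex and reuse it, for example as a field in a module or as a Shared field in a class. Reusing one Regex object is often faster than calling the shared Regex functions (such as Regex.Match) repeatedly.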
Therefore: Storing a Regex as a field in a module or class often results in a speed boost, when Match is called more than once.
Function: The Match function is an instance function on a Regex object. This program has the same result as the previous program.
VB.NET program that uses Match on Regex field

Imports System.Text.RegularExpressions

Module Module1
    ''' <summary>
    ''' Member field regular expression.
    ''' </summary>
    Private _reg As Regex = New Regex("content/([A-Za-z0-9\-]+)\.aspx$", _
                                      RegexOptions.IgnoreCase)

    Sub Main()
        ' The input string.
        Dim value As String = "/content/alternate-1.aspx"

        ' Invoke the Match method.
        ' ... Use the regex field.
        Dim m As Match = _reg.Match(value)

        ' If successful, write the group.
        If (m.Success) Then
            Dim key As String = m.Groups(1).Value
            Console.WriteLine(key)
        End If
    End Sub
End Module

Output

alternate-1
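Fields in a Module are implicitly shared. Inside a Class, the Shared keyword must be written out. The sketch below shows one way this might look. The class name ContentParser and the function GetKey are hypothetical, chosen only for this example.

```vbnet
Imports System.Text.RegularExpressions

' Hypothetical class, used only for this sketch.
Class ContentParser
    ' Created once, then reused by every call to GetKey.
    Private Shared ReadOnly _reg As Regex = _
        New Regex("content/([A-Za-z0-9\-]+)\.aspx$", RegexOptions.IgnoreCase)

    ' Returns the captured key, or an empty string when nothing matches.
    Public Shared Function GetKey(ByVal path As String) As String
        Dim m As Match = _reg.Match(path)
        If m.Success Then
            Return m.Groups(1).Value
        End If
        Return String.Empty
    End Function
End Class

Module Module1
    Sub Main()
        ' Both calls use the same Regex instance.
        Console.WriteLine(ContentParser.GetKey("/content/alternate-1.aspx"))
        Console.WriteLine(ContentParser.GetKey("/content/beta-2.aspx"))
    End Sub
End Module
```

This should write alternate-1 and then beta-2. Whether the shared field gives a measurable speedup depends on how often Match is called, so it is worth timing in your own program.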
Match, NextMatch. The Match() Function returns the first match only. But we can call NextMatch() on the returned Match object. It returns the next match, found further on in the text.
Tip: NextMatch can be called in a loop. This results in behavior similar to the Matches method (which may be easier to use).
VB.NET program that uses Match, NextMatch

Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        ' Get first match.
        Dim match As Match = Regex.Match("4 and 5", "\d")
        If match.Success Then
            Console.WriteLine(match.Value)
        End If

        ' Get next match.
        match = match.NextMatch()
        If match.Success Then
            Console.WriteLine(match.Value)
        End If
    End Sub
End Module

Output

4
5
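The tip about calling NextMatch in a loop can be sketched as follows. The AllDigits function is a hypothetical helper for this illustration. It keeps calling NextMatch until Success is False.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module Module1
    ' Hypothetical helper: collects every single-digit match in the input.
    Function AllDigits(ByVal input As String) As List(Of String)
        Dim results As New List(Of String)
        Dim m As Match = Regex.Match(input, "\d")
        ' Continue until no further match is found.
        While m.Success
            results.Add(m.Value)
            m = m.NextMatch()
        End While
        Return results
    End Function

    Sub Main()
        For Each digit As String In AllDigits("4 and 5 and 6")
            Console.WriteLine(digit)
        Next
    End Sub
End Module
```

This should write 4, 5 and 6 on separate lines.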
IsMatch. This returns a Boolean: True if the regular expression matches the String, and False otherwise. If no other results are needed, IsMatch is useful.
Here: This program introduces the IsValid Boolean function, which computes the result of the Regex.IsMatch function on its parameter.
Note: The regular expression pattern matches any string made up only of lowercase ASCII letters, uppercase ASCII letters, or digits. Because of the star, an empty string also matches.
VB.NET program that uses Regex.IsMatch function

Imports System.Text.RegularExpressions

Module Module1
    ' The argument is only read, so it is passed ByVal.
    Function IsValid(ByVal value As String) As Boolean
        Return Regex.IsMatch(value, "^[a-zA-Z0-9]*$")
    End Function

    Sub Main()
        Console.WriteLine(IsValid("TheDeveloperBlog0123"))
        Console.WriteLine(IsValid("DotNetPerls"))
        Console.WriteLine(IsValid(":-)"))
    End Sub
End Module

Output

True
True
False
Matches. This function locates every match in the source String and returns them in a MatchCollection. We can loop over the collection to read each Match. To capture groups, we use parentheses in a Regex pattern.
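A minimal sketch of Regex.Matches with capture groups follows. The input text, the pattern, and the FindPairs helper are assumptions made up for this example. The pattern captures a lowercase word and the number after it.

```vbnet
Imports System.Text.RegularExpressions

Module Module1
    ' Hypothetical helper: finds each "word number" pair in the input.
    Function FindPairs(ByVal input As String) As MatchCollection
        Return Regex.Matches(input, "([a-z]+) (\d+)")
    End Function

    Sub Main()
        For Each m As Match In FindPairs("size 10, weight 250, age 7")
            ' Group 1 is the word, group 2 is the number.
            Console.WriteLine(m.Groups(1).Value & " = " & m.Groups(2).Value)
        Next
    End Sub
End Module
```

This should write size = 10, weight = 250 and age = 7, one per line.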
Some examples. In these programs, I remove HTML tags. Be careful—this does not work on all HTML. I also count words in English text (this is also not perfect).
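As a rough illustration of both ideas, the sketch below strips tags with a simple pattern and counts words as runs of non-whitespace characters. It is not the original programs. StripTags and CountWords are hypothetical helpers, and the same caveats apply: the tag pattern fails on comments, scripts, and attribute values that contain an angle bracket.

```vbnet
Imports System.Text.RegularExpressions

Module Module1
    ' Hypothetical helper: removes anything that looks like <...>.
    Function StripTags(ByVal html As String) As String
        Return Regex.Replace(html, "<.*?>", String.Empty)
    End Function

    ' Hypothetical helper: counts runs of non-whitespace characters.
    Function CountWords(ByVal input As String) As Integer
        Return Regex.Matches(input, "\S+").Count
    End Function

    Sub Main()
        Dim html As String = "<p>Regex is <b>useful</b> here.</p>"
        Dim plain As String = StripTags(html)
        Console.WriteLine(plain)
        Console.WriteLine(CountWords(plain))
    End Sub
End Module
```

For this input, the program should write Regex is useful here. and then 4.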
Replace. This function takes an optional MatchEvaluator. It performs both a matching operation and a replacement of the matching parts.
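Here is a short sketch of both forms of Regex.Replace: one with a plain replacement string, and one with a MatchEvaluator. The UpperFirst and Capitalize functions are hypothetical names for this example. The evaluator receives each Match and returns the text to substitute for it.

```vbnet
Imports System.Text.RegularExpressions

Module Module1
    ' MatchEvaluator target: returns the replacement for one match.
    Function UpperFirst(ByVal m As Match) As String
        Dim v As String = m.Value
        Return Char.ToUpper(v(0)) & v.Substring(1)
    End Function

    ' Hypothetical helper: uppercases the first letter of each lowercase word.
    Function Capitalize(ByVal input As String) As String
        Return Regex.Replace(input, "\b[a-z]\w*", _
                             New MatchEvaluator(AddressOf UpperFirst))
    End Function

    Sub Main()
        ' Plain replacement string.
        Console.WriteLine(Regex.Replace("cat and bat", "[cb]at", "dog"))
        ' Replacement computed by the MatchEvaluator.
        Console.WriteLine(Capitalize("dot net perls"))
    End Sub
End Module
```

This should write dog and dog, then Dot Net Perls.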
Split. Sometimes the String Split function is just not enough for our splitting needs. For these times, try the Regex.Split function.
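For instance, String.Split separates on fixed delimiters, while Regex.Split can separate on a pattern. The sketch below splits on any run of non-digit characters. The SplitOnNonDigits helper and the input are assumptions for illustration.

```vbnet
Imports System.Text.RegularExpressions

Module Module1
    ' Hypothetical helper: splits the input on runs of non-digit characters.
    Function SplitOnNonDigits(ByVal input As String) As String()
        Return Regex.Split(input, "\D+")
    End Function

    Sub Main()
        For Each part As String In SplitOnNonDigits("10, 20 and 30")
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

This should write 10, 20 and 30 on separate lines. If the input starts or ends with a delimiter, Regex.Split returns an empty string at that position, so some programs filter those out.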
In some programs, a Regex function is the easiest way to process text. In others, it adds complexity for little gain. As developers, we must decide when to use Regex.
At its core, the Regex type exposes a text-processing language. By default, .NET runs it on a backtracking engine based on a nondeterministic finite automaton, not a deterministic one. Tiny programs efficiently manipulate text.