C# Regex.Split Examples

These C# example programs use the Regex.Split method, which separates a string into parts based on a pattern.

The delimiter is itself a pattern, such as \D+, which matches one or more non-digit characters. This gives more flexibility than string.Split, which accepts only literal delimiters.

Example. First, we get all the numbers in a string and then parse them into integers for easier use in a C# program. The key point is that the pattern splits on every run of non-digit characters in the string.

Then: It loops through the result strings with a foreach-loop and uses int.TryParse.

C# program that uses Regex.Split

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	//
	// String containing numbers.
	//
	string sentence = "10 cats, 20 dogs, 40 fish and 1 programmer.";
	//
	// Get all digit sequences as strings.
	//
	string[] digits = Regex.Split(sentence, @"\D+");
	//
	// Now we have each number string.
	//
	foreach (string value in digits)
	{
	    //
	    // Parse the value to get the number.
	    //
	    int number;
	    if (int.TryParse(value, out number))
	    {
		Console.WriteLine(value);
	    }
	}
    }
}

Output

10
20
40
1

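In this example, the input string contains the numbers 10, 20, 40 and 1. The static Regex.Split method is called with two arguments: the input string and the pattern. The verbatim string literal @"\D+" matches one or more NON-digit characters. Because the sentence ends with a non-digit character, the result array also ends with an empty string, which int.TryParse rejects.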
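Regex.Split keeps empty strings in its result when the input starts or ends with a delimiter. The following sketch reuses the sentence from the example and prints each element in brackets so the empty entry is visible. The EmptyEntryDemo class and SplitOnNonDigits method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmptyEntryDemo
{
    // Splits on every run of one or more non-digit characters.
    public static string[] SplitOnNonDigits(string input)
    {
        return Regex.Split(input, @"\D+");
    }

    static void Main()
    {
        string sentence = "10 cats, 20 dogs, 40 fish and 1 programmer.";
        string[] parts = SplitOnNonDigits(sentence);

        // Brackets make the trailing empty string visible.
        foreach (string part in parts)
        {
            Console.WriteLine("[" + part + "]");
        }

        // Four delimiter runs produce five elements; the last one is empty.
        Console.WriteLine(parts.Length); // 5
    }
}
```

This is why the example filters the results with int.TryParse instead of parsing every element directly.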

Tip: The uppercase character classes \D, \W and \S are negations of their lowercase forms \d, \w and \s. So \D means NOT a digit.
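As a small sketch of that pairing, the program below splits the same input with \d+ and then with \D+. The input string and the ClassPairDemo and SplitAndJoin names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class ClassPairDemo
{
    // Splits the input and joins the pieces with '|' so they fit on one line.
    public static string SplitAndJoin(string input, string pattern)
    {
        return string.Join("|", Regex.Split(input, pattern));
    }

    static void Main()
    {
        string input = "ab12cd";

        // \d+ treats the digits as the delimiter, so the letters remain.
        Console.WriteLine(SplitAndJoin(input, @"\d+")); // ab|cd

        // \D+ treats the letters as the delimiter, so the digits remain,
        // with an empty string on each side.
        Console.WriteLine(SplitAndJoin(input, @"\D+")); // |12|
    }
}
```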

Example 2. Here we extract all substrings in a string that are separated by whitespace characters. You could also use string.Split, but this version is simpler and can be extended more easily.

Note: The example gets all operands and operators from an equation string. An operator is a character like * that acts on operands.

C# program that tokenizes

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	//
	// The equation.
	//
	string operation = "3 * 5 = 15";
	//
	// Split it on whitespace sequences.
	//
	string[] tokens = Regex.Split(operation, @"\s+");
	//
	// Now we have each token.
	//
	foreach (string token in tokens)
	{
	    Console.WriteLine(token);
	}
    }
}

Output

3
*
5
=
15

In this program, we implemented a simple tokenizer. Compilers and interpreters begin with lexical analysis, which breaks the source text into tokens such as those shown in the output above.

Info: This is an effective way to parse computer languages or program output. It is not the fastest way.
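The whitespace pattern only works when every token is separated by spaces. One possible extension, sketched here, relies on the fact that Regex.Split includes text matched by a capturing group in its result. The sketch assumes the only operators are * and =, and the TokenizerSketch and Tokenize names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TokenizerSketch
{
    public static List<string> Tokenize(string equation)
    {
        // The parentheses capture each operator, so Regex.Split returns it
        // as its own element. \s* absorbs optional whitespace around it.
        string[] parts = Regex.Split(equation, @"\s*([*=])\s*");

        // Drop empty strings, which appear if the input starts or ends
        // with an operator.
        var tokens = new List<string>();
        foreach (string part in parts)
        {
            if (part.Length > 0)
            {
                tokens.Add(part);
            }
        }
        return tokens;
    }

    static void Main()
    {
        // Spacing is inconsistent on purpose.
        List<string> tokens = Tokenize("3*5 = 15");
        Console.WriteLine(string.Join(" ", tokens)); // 3 * 5 = 15
    }
}
```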

Example 3. Here we look at a method that gets all the words in a string that begin with an uppercase letter. The Regex.Split call itself just gets all the words. The loop then checks the case of each word's first letter.

Tip: It is often useful to combine regular expressions and manual looping and string operations. Programs are not art projects.

C# program that collects uppercase words

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
	//
	// String containing uppercased words.
	//
	string sentence = "Bob and Michelle are from Indiana.";
	//
	// Get all words.
	//
	string[] words = Regex.Split(sentence, @"\W");
	//
	// Get all uppercased words.
	//
	var list = new List<string>();
	foreach (string value in words)
	{
	    //
	    // Check the word.
	    //
	    if (!string.IsNullOrEmpty(value) &&
		char.IsUpper(value[0]))
	    {
		list.Add(value);
	    }
	}
	//
	// Write all capitalized words.
	//
	foreach (var value in list)
	{
	    Console.WriteLine(value);
	}
    }
}

Output

Bob
Michelle
Indiana

Discussion. For performance, you may want to try the Split method on the string type, which is an instance method, instead of regular expressions. That method is more appropriate when the input is precise and predictable, with known literal delimiters.
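For comparison, here is a minimal sketch of the equation example rewritten with string.Split. It assumes the tokens are separated only by space characters. The StringSplitDemo and SplitOnSpaces names are made up for illustration.

```csharp
using System;

static class StringSplitDemo
{
    public static string[] SplitOnSpaces(string input)
    {
        // RemoveEmptyEntries drops the empty strings that would otherwise
        // appear between consecutive spaces.
        return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        foreach (string token in SplitOnSpaces("3 * 5 = 15"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Unlike the \s+ pattern, this version would not treat tabs or newlines as delimiters unless they were added to the separator array.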

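Also: You can replace the static Regex.Split call with a Regex instance that is created once and reused through its Split instance method. This avoids repeated pattern lookups and can improve performance and reduce memory pressure when the same pattern is used many times.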
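A minimal sketch of that change, assuming the non-digit pattern from the first example, might look like this. The NumberSplitter class name and its members are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class NumberSplitter
{
    // Constructed once and reused for every call to Split.
    private static readonly Regex NonDigits = new Regex(@"\D+");

    public static string[] Split(string input)
    {
        // Instance method: the pattern is not passed on each call.
        return NonDigits.Split(input);
    }

    static void Main()
    {
        foreach (string value in Split("10 cats, 20 dogs"))
        {
            int number;
            if (int.TryParse(value, out number))
            {
                Console.WriteLine(number);
            }
        }
    }
}
```

How much this helps depends on the program, so it is worth measuring before and after.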

Further: You can pass the RegexOptions.Compiled enumerated constant when creating the Regex. This makes construction slower but can make repeated matching faster.
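Here is a short sketch of the option applied to the whitespace pattern from the second example. The CompiledSplitter class name and its members are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class CompiledSplitter
{
    // RegexOptions.Compiled is passed as the second constructor argument.
    private static readonly Regex Whitespace =
        new Regex(@"\s+", RegexOptions.Compiled);

    public static string[] Split(string input)
    {
        return Whitespace.Split(input);
    }

    static void Main()
    {
        foreach (string token in Split("3 * 5 = 15"))
        {
            Console.WriteLine(token);
        }
    }
}
```

The option tends to pay off only when the same Regex runs many times, so a short-lived program may not benefit from it.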

Summary. We extracted strings with the Regex.Split method, using patterns for non-digit characters, whitespace characters, and non-word characters. We then processed the resulting string arrays by parsing integers, printing tokens, and filtering words by their first letter.

Tip: Using loops on the results of Regex.Split is an easy way to further filter your results.

