Part 1: We assign the data string to a value containing 3 spaces—it has 4 words separated by spaces.
Part 2: This program splits on a single character. The result value from Split is a string array—it contains 4 elements.
Part 3: The foreach-loop iterates over the array and displays each word. The string array can be used in the same way as any other.
C# program that splits on spaces
using System;
class Program
{
static void Main()
{
// Part 1: the input string.
string data = "there is a cat";
// Part 2: split string on spaces (this will separate all the words).
string[] words = data.Split(' ');
// Part 3: loop over result array.
foreach (string word in words)
{
Console.WriteLine("WORD: " + word);
}
}
}
Output
WORD: there
WORD: is
WORD: a
WORD: cat
Argument 1: The first argument to Regex.Split is the string we wish to split. Regex.Split is a static method.
Argument 2: The second argument is the delimiter, given as a Regex pattern. Here we split on the Windows newline sequence "\r\n".
C# program that splits on lines with Regex
using System;
using System.Text.RegularExpressions;
class Program
{
static void Main()
{
string value = "cat\r\ndog\r\nanimal\r\nperson";
// Split the string on line breaks.
// ... The return value from Split is a string array.
string[] lines = Regex.Split(value, "\r\n");
foreach (string line in lines)
{
Console.WriteLine(line);
}
}
}
Output
cat
dog
animal
person
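Because the second argument is a pattern, one small change lets the same call accept both Windows ("\r\n") and Unix ("\n") line breaks. The following is a minimal sketch, not part of the original program; the LineSplitter class and SplitLines method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class LineSplitter
{
    // Hypothetical helper: the pattern matches an optional '\r' followed by '\n',
    // so both "\r\n" and a lone "\n" act as delimiters.
    public static string[] SplitLines(string text)
    {
        return Regex.Split(text, "\r?\n");
    }
}

class Program
{
    static void Main()
    {
        // Input mixes both kinds of line break.
        string mixed = "cat\r\ndog\nanimal";
        foreach (string line in LineSplitter.SplitLines(mixed))
        {
            Console.WriteLine(line);
        }
    }
}
```

With the mixed input above, the loop should print cat, dog and animal on separate lines.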
StringSplitOptions: This is an enum. It does not need to be allocated with a constructor—it is more like a special int value.
Argument 1: Here we pass arrays for the first argument to string Split(). A char array and a string array are used.
Argument 2: We use RemoveEmptyEntries as the second argument to avoid empty results. Empty strings are not added to the array.
C# program that splits on multiple characters
using System;
class Program
{
static void Main()
{
// ... Parts are separated by Windows line breaks.
string value = "shirt\r\ndress\r\npants\r\njacket";
// Use a char array of 2 characters (\r and \n).
// ... Break lines into separate strings.
// ... Use RemoveEmptyEntries so empty strings are not added.
char[] delimiters = new char[] { '\r', '\n' };
string[] parts = value.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(":::SPLIT, CHAR ARRAY:::");
for (int i = 0; i < parts.Length; i++)
{
Console.WriteLine(parts[i]);
}
// ... Same, but uses a string array with one 2-character delimiter ("\r\n").
string[] partsFromString = value.Split(new string[] { "\r\n" }, StringSplitOptions.None);
Console.WriteLine(":::SPLIT, STRING:::");
for (int i = 0; i < partsFromString.Length; i++)
{
Console.WriteLine(partsFromString[i]);
}
}
}
Output
:::SPLIT, CHAR ARRAY:::
shirt
dress
pants
jacket
:::SPLIT, STRING:::
shirt
dress
pants
jacket
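To see why RemoveEmptyEntries matters with the char array, it helps to count the parts. Each "\r\n" is two separate delimiters, so an empty string sits between the '\r' and the '\n'. The sketch below assumes a hypothetical LineBreakDemo.CountParts helper that is not part of the original program.

```csharp
using System;

static class LineBreakDemo
{
    private static readonly char[] LineChars = { '\r', '\n' };

    // Hypothetical helper: returns how many parts Split produces for the given option.
    public static int CountParts(string text, StringSplitOptions options)
    {
        return text.Split(LineChars, options).Length;
    }
}

class Program
{
    static void Main()
    {
        string value = "shirt\r\ndress\r\npants";
        // With None, an empty string appears between each '\r' and '\n': 5 parts.
        Console.WriteLine(LineBreakDemo.CountParts(value, StringSplitOptions.None));
        // With RemoveEmptyEntries, only the 3 words remain.
        Console.WriteLine(LineBreakDemo.CountParts(value, StringSplitOptions.RemoveEmptyEntries));
    }
}
```

Splitting on the string "\r\n" treats the pair as a single delimiter, which is why that call can use StringSplitOptions.None without producing empty parts.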
Here: This example separates words in a string based on non-word characters. It eliminates punctuation and whitespace.
Tip: Regex provides more power and control than the string Split methods. But the code is harder to read.
Argument 1: The first argument to Regex.Split is the string we are trying to split apart.
Argument 2: This is a Regex pattern. We can specify any character set (or range) with Regex.Split.
C# program that separates on non-word pattern
using System;
using System.Text.RegularExpressions;
class Program
{
static void Main()
{
const string sentence = "Hello, my friend";
// Split on all non-word characters.
// ... This returns an array of all the words.
string[] words = Regex.Split(sentence, @"\W+");
foreach (string value in words)
{
Console.WriteLine("WORD: " + value);
}
}
}
Output
WORD: Hello
WORD: my
WORD: friend
Regex description:
@ Special verbatim string syntax.
\W+ One or more non-word characters together.
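One caveat with this pattern: when the input starts or ends with non-word characters, Regex.Split returns empty strings at those positions. The sketch below filters them out; the WordSplitter class and SplitWords method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordSplitter
{
    // Hypothetical helper: splits on runs of non-word characters,
    // then drops the empty strings left by leading or trailing punctuation.
    public static string[] SplitWords(string sentence)
    {
        string[] parts = Regex.Split(sentence, @"\W+");
        return Array.FindAll(parts, part => part.Length > 0);
    }
}

class Program
{
    static void Main()
    {
        // Leading space and trailing '!' would otherwise yield empty entries.
        foreach (string word in WordSplitter.SplitWords(" Hello, my friend!"))
        {
            Console.WriteLine("WORD: " + word);
        }
    }
}
```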
Then: The program displays each value after the number of the line it came from. The output shows how the file was parsed into the strings.
C# program that splits lines in file
using System;
using System.IO;
class Program
{
static void Main()
{
int i = 0;
foreach (string line in File.ReadAllLines("TextFile1.txt"))
{
string[] parts = line.Split(',');
foreach (string part in parts)
{
Console.WriteLine("{0}:{1}",
i,
part);
}
i++; // For demonstration.
}
}
}
Contents of input file: TextFile1.txt
Dog,Cat,Mouse,Fish,Cow,Horse,Hyena
Programmer,Wizard,CEO,Rancher,Clerk,Farmer
Output
0:Dog
0:Cat
0:Mouse
0:Fish
0:Cow
0:Horse
0:Hyena
1:Programmer
1:Wizard
1:CEO
1:Rancher
1:Clerk
1:Farmer
Tip: We could use Path.DirectorySeparatorChar, a char field on the Path class in System.IO, for more flexibility.
C# program that splits Windows directories
using System;
class Program
{
static void Main()
{
// The directory from Windows.
const string dir = @"C:\Users\Sam\Documents\Perls\Main";
// Split on directory separator.
string[] parts = dir.Split('\\');
foreach (string part in parts)
{
Console.WriteLine(part);
}
}
}
Output
C:
Users
Sam
Documents
Perls
Main
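The tip about Path.DirectorySeparatorChar can be sketched as follows. This is an illustration rather than the original program: the PathSplitter class and SplitPath method are hypothetical, and the sample path is built with Path.Combine so that it uses whatever separator the current platform defines.

```csharp
using System;
using System.IO;

static class PathSplitter
{
    // Hypothetical helper: splits on the platform's directory separator
    // ('\\' on Windows, '/' on Unix-like systems).
    public static string[] SplitPath(string path)
    {
        return path.Split(Path.DirectorySeparatorChar);
    }
}

class Program
{
    static void Main()
    {
        // Path.Combine joins the parts with the platform separator.
        string dir = Path.Combine("Users", "Sam", "Documents");
        foreach (string part in PathSplitter.SplitPath(dir))
        {
            Console.WriteLine(part);
        }
    }
}
```

Hard-coding '\\' works only for Windows-style paths; reading the separator from Path keeps the same code usable elsewhere.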
Note: In this example, the input string contains five commas. These commas are the delimiters.
And: Two fields between commas are 0 characters long—they are empty. They are treated differently when we use RemoveEmptyEntries.
First call: In the first call to Split, these fields are put into the result array. These elements equal string.Empty.
Second call: We specify StringSplitOptions.RemoveEmptyEntries. The two empty fields are not in the result array.
C# program that uses StringSplitOptions
using System;
class Program
{
static void Main()
{
// Input string contain separators.
string value1 = "man,woman,child,,,bird";
char[] delimiter1 = new char[] { ',' }; // <-- Split on these
// ... Use StringSplitOptions.None.
string[] array1 = value1.Split(delimiter1, StringSplitOptions.None);
foreach (string entry in array1)
{
Console.WriteLine(entry);
}
// ... Use StringSplitOptions.RemoveEmptyEntries.
string[] array2 = value1.Split(delimiter1, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine();
foreach (string entry in array2)
{
Console.WriteLine(entry);
}
}
}
Output
man
woman
child


bird

man
woman
child
bird
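Empty lines are easy to miss in console output, so counting the fields makes the difference between the two calls explicit. The sketch below uses the same input string; the FieldCounter class and its methods are hypothetical names introduced for illustration.

```csharp
using System;

static class FieldCounter
{
    private static readonly char[] Comma = { ',' };

    // Hypothetical helper: total number of fields for the given option.
    public static int CountFields(string value, StringSplitOptions options)
    {
        return value.Split(Comma, options).Length;
    }

    // Hypothetical helper: number of fields equal to string.Empty.
    public static int CountEmptyFields(string value)
    {
        int empty = 0;
        foreach (string field in value.Split(Comma, StringSplitOptions.None))
        {
            if (field == string.Empty)
            {
                empty++;
            }
        }
        return empty;
    }
}

class Program
{
    static void Main()
    {
        string value1 = "man,woman,child,,,bird";
        // Five commas give 6 fields with None; 2 of them are empty.
        Console.WriteLine(FieldCounter.CountFields(value1, StringSplitOptions.None));
        Console.WriteLine(FieldCounter.CountEmptyFields(value1));
        // RemoveEmptyEntries leaves the 4 non-empty fields.
        Console.WriteLine(FieldCounter.CountFields(value1, StringSplitOptions.RemoveEmptyEntries));
    }
}
```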
Version 1: This code uses Regex.Split to separate the strings apart. It is tested on both a long string and a short string.
Version 2: Uses the string.Split method, but with the first argument being a char array. Two chars are in the char array.
Version 3: Uses string.Split as well, but with a string array argument. The 3 versions are compared.
Result: Splitting with a char array is the fastest for both short and long strings. Regex.Split is slowest (but has more features).
C# program that tests string.Split performance
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;
class Program
{
const int _max = 100000;
static void Main()
{
// Get long string.
string value1 = string.Empty;
for (int i = 0; i < 120; i++)
{
value1 += "01234567\r\n";
}
// Get short string.
string value2 = string.Empty;
for (int i = 0; i < 10; i++)
{
value2 += "ab\r\n";
}
// Put strings in array.
string[] tests = { value1, value2 };
foreach (string test in tests)
{
Console.WriteLine("Testing length: " + test.Length);
// Version 1: use Regex.Split.
var s1 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
string[] result = Regex.Split(test, "\r\n", RegexOptions.Compiled);
if (result.Length == 0)
{
return;
}
}
s1.Stop();
// Version 2: use char array split.
var s2 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
string[] result = test.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
if (result.Length == 0)
{
return;
}
}
s2.Stop();
// Version 3: use string array split.
var s3 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
string[] result = test.Split(new string[] { "\r\n" }, StringSplitOptions.None);
if (result.Length == 0)
{
return;
}
}
s3.Stop();
Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) /
_max).ToString("0.00 ns"));
Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) /
_max).ToString("0.00 ns"));
Console.WriteLine(((double)(s3.Elapsed.TotalMilliseconds * 1000000) /
_max).ToString("0.00 ns"));
}
}
}
Output
Testing length: 1200
21442.64 ns Regex.Split
5562.63 ns Split char[]
6556.60 ns Split string[]
Testing length: 40
2236.22 ns Regex.Split
371.55 ns Split char[]
423.46 ns Split string[]
Version 1: This code creates a new char array with 2 elements on each Split call. These must all be garbage-collected.
Version 2: This version uses a single char array, created before the loop. It reuses the cached char array each time.
Result: By caching a char array (or string array), we can improve split call performance by a small amount.
C# program that tests Split, cached char array
using System;
using System.Diagnostics;
class Program
{
const int _max = 10000000;
static void Main()
{
string value = "a b,c";
char[] delimiterArray = new char[] { ' ', ',' };
// Version 1: split with a new char array on each call.
var s1 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
string[] result = value.Split(new char[] { ' ', ',' });
if (result.Length == 0)
{
return;
}
}
s1.Stop();
// Version 2: split using a cached char array on each call.
var s2 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
string[] result = value.Split(delimiterArray);
if (result.Length == 0)
{
return;
}
}
s2.Stop();
Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) /
_max).ToString("0.00 ns"));
Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) /
_max).ToString("0.00 ns"));
}
}
Output
87.61 ns Split, new char[]
84.34 ns Split, existing char[]
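In application code, the cached array is often kept in a static readonly field so every call shares it. The following is a minimal sketch of that arrangement; the WordParser class and its members are hypothetical, and the benefit is assumed to be as small as the timings above suggest.

```csharp
using System;

static class WordParser
{
    // Allocated once and reused by every call (hypothetical field name).
    private static readonly char[] Delimiters = { ' ', ',' };

    public static string[] Parse(string value)
    {
        // No new char array is created here.
        return value.Split(Delimiters);
    }
}

class Program
{
    static void Main()
    {
        foreach (string part in WordParser.Parse("a b,c"))
        {
            Console.WriteLine(part);
        }
    }
}
```

Split only reads the separator array, so sharing one instance between calls is safe as long as no other code modifies its elements.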
And: A string array can also be passed to the Split method. The new string array is created inline with the Split call.
Next: The parameters are checked for validity. It uses unsafe code to create a separator list, and a for-loop combined with Substring.
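The general shape of that approach (validate, build a list of separator positions, then loop with Substring) can be sketched in safe code. This is a simplified illustration for a single char separator, not the framework's actual implementation; SplitSketch and SimpleSplit are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class SplitSketch
{
    public static string[] SimpleSplit(string input, char separator)
    {
        // Step 1: check the parameter for validity.
        if (input == null)
        {
            throw new ArgumentNullException(nameof(input));
        }

        // Step 2: record the index of every separator.
        List<int> separatorIndexes = new List<int>();
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == separator)
            {
                separatorIndexes.Add(i);
            }
        }

        // Step 3: use Substring to cut out the text between separators.
        string[] result = new string[separatorIndexes.Count + 1];
        int start = 0;
        for (int i = 0; i < separatorIndexes.Count; i++)
        {
            result[i] = input.Substring(start, separatorIndexes[i] - start);
            start = separatorIndexes[i] + 1;
        }
        // The last part runs from the final separator to the end of the string.
        result[separatorIndexes.Count] = input.Substring(start);
        return result;
    }
}

class Program
{
    static void Main()
    {
        foreach (string part in SplitSketch.SimpleSplit("a,b,,c", ','))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

Like string.Split with StringSplitOptions.None, this sketch keeps empty parts between adjacent separators.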