Static: This code is best placed in static methods because it keeps no state. You can think of it as an action, not an object.
CountWords1: This version uses a Regex. It is shorter and simpler to maintain, and is also more accurate. The backslash-S pattern (\S) matches any character that is not whitespace.
So: CountWords1 considers each non-whitespace character, including punctuation, to be part of a word, similar to Microsoft Word.
CountWords2: This version uses a for-loop and tries to count word-breaking characters correctly.
C# program that counts words
using System;
using System.Text.RegularExpressions;
class Program
{
    static void Main()
    {
        const string t1 = "To be or not to be, that is the question.";
        Console.WriteLine(WordCounting.CountWords1(t1));
        Console.WriteLine(WordCounting.CountWords2(t1));
        const string t2 = "Mary had a little lamb.";
        Console.WriteLine(WordCounting.CountWords1(t2));
        Console.WriteLine(WordCounting.CountWords2(t2));
    }
}
/// <summary>
/// Contains methods for counting words.
/// </summary>
public static class WordCounting
{
    /// <summary>
    /// Count words with Regex.
    /// </summary>
    public static int CountWords1(string s)
    {
        MatchCollection collection = Regex.Matches(s, @"[\S]+");
        return collection.Count;
    }
    /// <summary>
    /// Count words with a loop and character tests.
    /// </summary>
    public static int CountWords2(string s)
    {
        int c = 0;
        for (int i = 0; i < s.Length; i++)
        {
            // A word can start at index 0 or right after a whitespace character.
            if (i == 0 || char.IsWhiteSpace(s[i - 1]))
            {
                // Count it only if it begins with a letter, digit, or punctuation.
                if (char.IsLetterOrDigit(s[i]) ||
                    char.IsPunctuation(s[i]))
                {
                    c++;
                }
            }
        }
        return c;
    }
}
Output
10
10
5
5
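To make the behavior of the \S pattern concrete, here is a minimal sketch that prints each match instead of only counting them. The class name TokenDemo and the method Tokens are hypothetical names used for illustration; they are not part of the program above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Returns each run of non-whitespace characters matched by \S+.
    public static string[] Tokens(string s)
    {
        MatchCollection matches = Regex.Matches(s, @"\S+");
        string[] result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // The comma stays attached to "be" because it is not whitespace.
        foreach (string token in Tokens("to be, that is"))
        {
            Console.WriteLine("[" + token + "]");
        }
    }
}
```

This prints [to], [be,], [that] and [is] on separate lines, so the count is 4 and punctuation never creates an extra word.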
Accuracy of word counting methods:
Document A
    Microsoft Word: 4007 words
    Regex method:   3990 words [closest]
    Loop method:    3973 words
Document B
    Microsoft Word: 1414 words
    Regex method:   1414 words [closest]
    Loop method:    1399 words
Document C
    Microsoft Word: 462 words
    Regex method:   463 words [closest]
    Loop method:    459 words
Document D
    Microsoft Word: 470 words
    Regex method:   470 words [closest]
    Loop method:    465 words
Document E
    Microsoft Word: 2742 words
    Regex method:   2738 words [closest]
    Loop method:    2710 words
Example input and output
Input:      To be or not to be, that is the question.
            Mary had a little lamb.
Word count: 10
            5
However: The greater ease of use and clarity of regular expressions is often more important than speed. In scripting languages, regular expressions often perform better than loops.
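Tip: You can store the Regex object that CountWords1 uses in a field of the class. Because WordCounting is a static class, this must be a static field.
Then: You can call the instance Matches method on that object instead of the static Regex.Matches method. This can improve speed because the Regex object is created once and reused.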
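The tip above could look roughly like the following sketch. The class name CachedWordCounting and the field name _words are assumptions made for illustration, and a static field is used because the class is static. Any speed difference should be measured rather than assumed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CachedWordCounting
{
    // Hypothetical field: the Regex is constructed once and then reused.
    static readonly Regex _words = new Regex(@"\S+");

    public static int CountWords(string s)
    {
        // Calls the instance Matches method, not the static Regex.Matches.
        return _words.Matches(s).Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWords("To be or not to be, that is the question."));
        Console.WriteLine(CountWords("Mary had a little lamb."));
    }
}
```

This prints 10 and 5, the same counts as CountWords1.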
Note: If you omit a character from the ranges in the pattern, that character is treated as a word separator.
Here: You can see that with this version of the Regex, the substring "is#the#question" is treated as three separate words.
Tip: This is because the pound sign is not included in the ranges of valid characters in the pattern.
And: With this form of the Regex pattern, you can more easily change which characters are valid and which are not.
C# program that uses modified Regex
using System;
using System.Text.RegularExpressions;
class Program
{
    static void Main()
    {
        const string t1 = "To be or not to be, that is#the#question.";
        Console.WriteLine(CountWordsModified(t1));
    }
    static int CountWordsModified(string s)
    {
        return Regex.Matches(s, @"[A-Za-z0-9]+").Count;
    }
}
Output
10
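As an illustration of changing which characters are valid, the sketch below adds the apostrophe to the character class so that contractions count as one word. The class CustomWordCounting, its two methods, and the sample sentence are hypothetical and chosen only for this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CustomWordCounting
{
    // Same pattern as CountWordsModified: letters and digits only.
    public static int CountLettersAndDigits(string s)
    {
        return Regex.Matches(s, @"[A-Za-z0-9]+").Count;
    }

    // The apostrophe is added to the class, so it no longer splits words.
    public static int CountWithApostrophes(string s)
    {
        return Regex.Matches(s, @"[A-Za-z0-9']+").Count;
    }

    public static void Main()
    {
        const string t = "Don't stop, it's fine.";
        Console.WriteLine(CountLettersAndDigits(t));
        Console.WriteLine(CountWithApostrophes(t));
    }
}
```

This prints 6 and then 4. With the first pattern, "Don't" and "it's" are each split in two at the apostrophe; with the second they stay whole.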