TheDeveloperBlog.com


C# Line Count File Method

Line count. Strings often contain many lines, and it is sometimes necessary to count them, for example when processing server logs or CSV files. For strings, we can use Regex or string-handling methods such as IndexOf. For files, we can use StreamReader and ReadLine.


Files. With large files, it is more memory-efficient not to hold the entire contents in RAM at once. This matters for web server log files, where efficiency is important. The next block of code counts the lines in a file on disk.

It does this by using the parameterless ReadLine instance method offered by the StreamReader class in the System.IO namespace. The CountLinesInFile method is static because it stores no state.

C# program that counts file lines

using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Print the count so the program actually produces output.
        Console.WriteLine(CountLinesInFile("test.txt"));
    }

    /// <summary>
    /// Count the number of lines in the file specified.
    /// </summary>
    /// <param name="f">The name of the file whose lines are counted.</param>
    /// <returns>The number of lines in the file.</returns>
    static long CountLinesInFile(string f)
    {
        long count = 0;
        using (StreamReader r = new StreamReader(f))
        {
            // ReadLine returns null once the end of the file is reached.
            while (r.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}

Output

(Number of lines in test.txt)

It uses StreamReader, a class for reading text files efficiently. Its ReadLine method reads up to the next line break and returns that line, or null when the end of the file has been reached.

Next: Each successful ReadLine call increments count, which ends up holding the number of lines in the file. Finally the long containing the count is returned.
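
As an alternative sketch, not part of the original article's code, the same count can be obtained with File.ReadLines, which enumerates lines lazily instead of loading the whole file, combined with LongCount from System.Linq. The names LineCounter and CountLinesLazy are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

static class LineCounter
{
    // Hypothetical helper: counts lines by streaming them with File.ReadLines.
    // File.ReadLines yields one line at a time, so the file is not held in memory.
    public static long CountLinesLazy(string path)
    {
        return File.ReadLines(path).LongCount();
    }
}

class Program
{
    static void Main()
    {
        // Write a small temporary file so the example is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "first\r\nsecond\r\nthird");
        Console.WriteLine(LineCounter.CountLinesLazy(path)); // prints 3
        File.Delete(path);
    }
}
```

This is shorter than the explicit StreamReader loop, though the loop makes it easier to add per-line logic such as skipping blank lines.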


Strings. Next we look at two methods that count lines in a string. The first is string-based. The second is based on a regular expression and requires the System.Text.RegularExpressions namespace. The methods return the same result. You may want to place them in a separate class.

C# program that counts lines

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        long a = CountLinesInString("This is an\r\nawesome website.");
        Console.WriteLine(a);

        long b = CountLinesInStringSlow("This is an awesome\r\nwebsite.\r\nYeah.");
        Console.WriteLine(b);
    }

    static long CountLinesInString(string s)
    {
        // A string with n newline characters has n + 1 lines.
        long count = 1;
        int start = 0;
        while ((start = s.IndexOf('\n', start)) != -1)
        {
            count++;
            // Resume the search just past the newline that was found.
            start++;
        }
        return count;
    }

    static long CountLinesInStringSlow(string s)
    {
        // Count every newline match, then add one for the final line.
        Regex r = new Regex("\n", RegexOptions.Multiline);
        MatchCollection mc = r.Matches(s);
        return mc.Count + 1;
    }
}

Output

2
3

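The first method uses IndexOf to find each newline character, resuming the search just past the previous match. The second method calls Matches and reads the Count of the resulting MatchCollection. Both return one more than the number of newline characters found. RegexOptions.Multiline is not required here: it only changes how ^ and $ match, and the pattern "\n" matches a newline character with or without it.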
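
One detail worth noting: both string methods return 1 for an empty string and count a trailing newline as starting an extra line, whereas the ReadLine-based file method does not. If the string count needs to agree with the file count, one option is to run ReadLine over the string with StringReader. This is a minimal sketch under that assumption; the names StringLineCounter and CountLinesLikeReadLine are hypothetical.

```csharp
using System;
using System.IO;

static class StringLineCounter
{
    // Hypothetical helper: counts lines in a string the same way
    // the StreamReader loop counts lines in a file.
    public static long CountLinesLikeReadLine(string s)
    {
        long count = 0;
        using (StringReader r = new StringReader(s))
        {
            // ReadLine returns null when no more lines remain.
            while (r.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(StringLineCounter.CountLinesLikeReadLine("This is an\r\nawesome website.")); // prints 2
        Console.WriteLine(StringLineCounter.CountLinesLikeReadLine("one line\n")); // prints 1
        Console.WriteLine(StringLineCounter.CountLinesLikeReadLine(""));           // prints 0
    }
}
```

ReadLine also treats a lone carriage return as a line break, so the counts can differ from the IndexOf version for text that uses "\r" alone as its line ending.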


Performance. Here we compare the performance of these methods. I benchmarked the above methods for one million operations on real-world files to see how differently they perform. The timings differed by more than one order of magnitude.

So: Generally there is no advantage to the regular expression library when doing simple operations such as this.

Benchmark results for line counting

Version A - Count:  188 ms [faster]
Version B - Regex: 5959 ms
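
The original benchmark harness is not shown. A rough way to run a comparison like this is to time each method with Stopwatch. The sketch below assumes a short in-memory test string and an iteration count of one million, both chosen for illustration, so its timings will not match the figures above. The names LineCountBenchmark, CountWithIndexOf, CountWithRegex and TimeMs are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class LineCountBenchmark
{
    // Same approach as the IndexOf-based method in the article.
    public static long CountWithIndexOf(string s)
    {
        long count = 1;
        int start = 0;
        while ((start = s.IndexOf('\n', start)) != -1)
        {
            count++;
            start++;
        }
        return count;
    }

    // Same approach as the Regex-based method, without RegexOptions.Multiline.
    public static long CountWithRegex(string s)
    {
        return Regex.Matches(s, "\n").Count + 1;
    }

    // Runs the method repeatedly and returns the elapsed milliseconds.
    public static long TimeMs(Func<string, long> method, string input, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            method(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}

class Program
{
    static void Main()
    {
        // Assumed test input; the article's real-world files are not available.
        string input = "This is an awesome\r\nwebsite.\r\nYeah.";
        const int iterations = 1000000;

        Console.WriteLine("IndexOf: " + LineCountBenchmark.TimeMs(LineCountBenchmark.CountWithIndexOf, input, iterations) + " ms");
        Console.WriteLine("Regex:   " + LineCountBenchmark.TimeMs(LineCountBenchmark.CountWithRegex, input, iterations) + " ms");
    }
}
```

Absolute numbers depend on the machine, the runtime version and the input, so only the relative difference between the two timings is meaningful.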


Summary. We counted the number of lines in files by using StreamReader and ReadLine. We also counted lines in strings with IndexOf and with Regex, and the benchmark showed the IndexOf version to be much faster.
