Note: The ReadLine() method returns each line separately, or null when there is no more data to read.
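A minimal sketch of that end-of-input behavior follows. It assumes an in-memory stream in place of a real file so it runs without disk access; the names ReadLineDemo and ReadAllLines are hypothetical and not part of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
static class ReadLineDemo
{
    // Collects lines until ReadLine() returns null (end of input).
    public static List<string> ReadAllLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }

    static void Main()
    {
        // Assumption: an in-memory stream stands in for a file on disk.
        byte[] bytes = Encoding.UTF8.GetBytes("first\nsecond\n\nfourth");
        using (var reader = new StreamReader(new MemoryStream(bytes)))
        {
            List<string> lines = ReadAllLines(reader);
            Console.WriteLine(lines.Count);               // 4
            Console.WriteLine(lines[2].Length);           // 0
            Console.WriteLine(reader.ReadLine() == null); // True
        }
    }
}
```

An empty line comes back as an empty string, not null, so a blank line in the middle of a file does not end the loop early.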
Then: We put the regular expression logic inside the StreamReader loop to parse an entire file, matching each line in turn.
Tip: Captured groups are indexed starting at 1. Groups[0] holds the entire match rather than the first captured group, so reading it by mistake gives the wrong value and can cause lots of grief.
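To make the indexing concrete, here is a small sketch that prints Groups[0] and Groups[1] for one sample string, using the same pattern as the program below. The names GroupsDemo and Describe are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
static class GroupsDemo
{
    static readonly Regex PageRegex =
        new Regex(@"\s/Content/([a-zA-Z0-9\-]+?)\.aspx");

    // Returns { entire match, first captured group }, or null when nothing matches.
    public static string[] Describe(string line)
    {
        Match match = PageRegex.Match(line);
        if (!match.Success)
        {
            return null;
        }
        return new[] { match.Groups[0].Value, match.Groups[1].Value };
    }

    static void Main()
    {
        string[] parts = Describe("GET /Content/String.aspx - 80");
        // Brackets make the leading space in the entire match visible.
        Console.WriteLine("[" + parts[0] + "]"); // [ /Content/String.aspx]
        Console.WriteLine("[" + parts[1] + "]"); // [String]
    }
}
```

Groups[0] includes the leading whitespace and the extension because the whole pattern matched them, while Groups[1] contains only the text inside the parentheses.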
C# program that matches lines
using System;
using System.IO;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        Regex regex = new Regex(@"\s/Content/([a-zA-Z0-9\-]+?)\.aspx");
        // \s/Content/         : whitespace and then the Content directory
        // ([a-zA-Z0-9\-]+?)   : group of alphanumeric characters and hyphens
        // ?                   : don't be greedy, match lazily
        // \.aspx              : file extension required for match

        using (StreamReader reader = new StreamReader(@"C:\programs\log.txt"))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Try to match each line against the Regex.
                Match match = regex.Match(line);
                if (match.Success)
                {
                    // Write original line and the value.
                    string v = match.Groups[1].Value;
                    Console.WriteLine(line);
                    Console.WriteLine("... " + v);
                }
            }
        }
    }
}
Contents, log.txt:
2008-10-16 23:56:44 W3SVC2915713 GET /Content/String.aspx - 80 66.249
2008-10-16 23:59:50 W3SVC2915713 GET /Content/Trim-String-Regex.aspx - 80 66.249
Output
2008-10-16 23:56:44 W3SVC2915713 GET /Content/String.aspx - 80 66.249
... String
2008-10-16 23:59:50 W3SVC2915713 GET /Content/Trim-String-Regex.aspx - 80 66.249
... Trim-String-Regex
Tip: Processing each line separately may be faster than matching against the whole file at once, because only one line is held in memory at a time and each Match call scans fewer characters.
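The two approaches can be compared side by side. The sketch below is illustrative only: MatchStrategies, ByLine and WholeText are hypothetical names, a StringReader stands in for the log file, and no timing is measured, so it shows the difference in structure, not in speed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
static class MatchStrategies
{
    static readonly Regex PageRegex =
        new Regex(@"\s/Content/([a-zA-Z0-9\-]+?)\.aspx");

    // Line by line: only the current line is held in memory.
    // Takes the first match on each line, as the program above does.
    public static List<string> ByLine(TextReader reader)
    {
        var names = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Match match = PageRegex.Match(line);
            if (match.Success)
            {
                names.Add(match.Groups[1].Value);
            }
        }
        return names;
    }

    // Whole text: the entire input is loaded into one string first,
    // then every match in that string is collected.
    public static List<string> WholeText(TextReader reader)
    {
        var names = new List<string>();
        foreach (Match match in PageRegex.Matches(reader.ReadToEnd()))
        {
            names.Add(match.Groups[1].Value);
        }
        return names;
    }

    static void Main()
    {
        // Assumption: the two sample log lines, held in memory.
        string log =
            "2008-10-16 23:56:44 W3SVC2915713 GET /Content/String.aspx - 80 66.249\n" +
            "2008-10-16 23:59:50 W3SVC2915713 GET /Content/Trim-String-Regex.aspx - 80 66.249\n";

        Console.WriteLine(string.Join(",", ByLine(new StringReader(log))));    // String,Trim-String-Regex
        Console.WriteLine(string.Join(",", WholeText(new StringReader(log)))); // String,Trim-String-Regex
    }
}
```

Both methods return the same names for this sample. They would differ on a line containing two matching paths, since ByLine keeps only the first match per line.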
Review: We combined the StreamReader class with the Regex class in the base class library to parse large text files.