WebClient is powerful and versatile. This class makes it easy to download web pages for testing.
Example. First, to use the WebClient class you need to either use the fully qualified name System.Net.WebClient or include the System.Net namespace with a using directive. This example creates a new WebClient instance and sets its user agent.
Then: This WebClient will download a page and the server will think it is Internet Explorer 6. It gets a byte array of data.
C# program that uses client user-agent

using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Create web client simulating IE6.
        using (WebClient client = new WebClient())
        {
            client.Headers["User-Agent"] =
                "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) " +
                "(compatible; MSIE 6.0; Windows NT 5.1; " +
                ".NET CLR 1.1.4322; .NET CLR 2.0.50727)";
            // Download data.
            byte[] arr = client.DownloadData("");
            // Write values.
            Console.WriteLine("--- WebClient result ---");
            Console.WriteLine(arr.Length);
        }
    }
}

Output

--- WebClient result ---
6585
You can add a new HTTP header to your WebClient's download request by assigning an entry in the Headers collection. You can also use the WebHeaderCollection returned by Headers and call its Add, Remove and Set methods, or read its Count property.
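As a sketch of the collection-based approach (no network access is needed just to manipulate the headers; the TestAgent/1.0 value here is made up for illustration), something like this should work:

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        WebClient client = new WebClient();
        WebHeaderCollection headers = client.Headers;

        // Add appends a header; Set assigns one, replacing any existing value.
        headers.Add("Accept-Encoding", "gzip");
        headers.Set("User-Agent", "TestAgent/1.0"); // made-up agent string
        Console.WriteLine(headers.Count);           // 2

        // Remove deletes a header by name.
        headers.Remove("Accept-Encoding");
        Console.WriteLine(headers.Count);           // 1
        Console.WriteLine(headers["User-Agent"]);   // TestAgent/1.0
    }
}
```

Because no request is sent, this is a safe way to experiment with the collection methods before wiring up a real download.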
Byte arrays. The DownloadData instance method on the WebClient is called and its reference return value is assigned to a new byte array reference. Internally, the DownloadData method will allocate the bytes on the managed heap.
Tip: When you assign the result to the variable, you copy only the reference to that data, not the bytes themselves.
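This reference-copy behavior can be seen with a short local sketch, no WebClient needed:

```csharp
using System;

class Program
{
    static void Main()
    {
        byte[] data = { 1, 2, 3 };
        // Assignment copies the reference, not the bytes.
        byte[] alias = data;
        Console.WriteLine(ReferenceEquals(data, alias)); // True
        alias[0] = 9;
        Console.WriteLine(data[0]); // 9: both variables see the same array
    }
}
```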
Also, we use the using-statement to ensure that the system resources held by the WebClient are released promptly: when the block exits, Dispose is called on the client. This is important in longer and more complex programs.
Example 2. This example uses two HTTP request headers set on the Headers collection on WebClient. It then reads in the ResponseHeaders collection. This helps you make sure your web server returns the proper headers for certain clients.
Tip: To set many request headers, simply assign the string keys to the string values you want the headers to be set to.
C# program that uses Headers

using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Create web client.
        WebClient client = new WebClient();
        // Set user agent and also accept-encoding headers.
        client.Headers["User-Agent"] =
            "Googlebot/2.1 (+http://www.googlebot.com/bot)";
        client.Headers["Accept-Encoding"] = "gzip";
        // Download data.
        byte[] arr = client.DownloadData("");
        // Get response header.
        string contentEncoding = client.ResponseHeaders["Content-Encoding"];
        // Write values.
        Console.WriteLine("--- WebClient result ---");
        Console.WriteLine(arr.Length);
        Console.WriteLine(contentEncoding);
    }
}

Output

--- WebClient result ---
2040
gzip
Content-encoding. This part of the example gets a response HTTP header using the client.ResponseHeaders collection. You can access this much like a hashtable or dictionary. If there is no header set for that key, the result is null.
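The null-for-missing-key behavior can be demonstrated on a standalone WebHeaderCollection, without any network access; this sketch assumes the same indexer semantics that ResponseHeaders uses:

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        WebHeaderCollection headers = new WebHeaderCollection();
        headers["Content-Encoding"] = "gzip";

        // Present key: returns the stored value.
        Console.WriteLine(headers["Content-Encoding"]); // gzip
        // Missing key: returns null rather than throwing.
        string missing = headers["X-Not-Set"];
        Console.WriteLine(missing == null); // True
    }
}
```

Checking the result for null before using it avoids a NullReferenceException when a server omits a header.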
Example 3. Next, we download a web page from the Internet into a string. We create a new WebClient class instance and then specify the URL we want to download as the parameter to the DownloadString method, which will return a string.
Note: If no Accept-Encoding header is specified, the server usually returns the page uncompressed.
C# program that uses DownloadString

using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Create web client.
        WebClient client = new WebClient();
        // Download string.
        string value = client.DownloadString("");
        // Write values.
        Console.WriteLine("--- WebClient result ---");
        Console.WriteLine(value.Length);
        Console.WriteLine(value);
    }
}
Internally, the DownloadString method will call into lower-level system routines in the Windows network stack. It will allocate the resulting string on the managed heap. Then it will return a value referencing that data.
Headers. You can set the request HTTP headers on the WebClient class. The examples in this article show that you can do this through the Headers indexer, such as Headers["a"] = "b".
Also: You can access the Headers property as a WebHeaderCollection, which allows you to perform more complex logic on the values.
Response headers. You can access the response HTTP headers after you invoke DownloadData or DownloadString. They are found in the ResponseHeaders collection. This is helpful for testing that all responses from your site are valid.
Threads. It is possible to access web pages on separate threads in your C# program using WebClient. The WebClient class in System.Net provides OpenReadAsync, DownloadDataAsync, DownloadFileAsync and DownloadStringAsync methods.
Note: These methods return void immediately and raise a completion event when the download finishes, so the present method can continue running while the download is in progress.
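A minimal sketch of the event-based pattern with DownloadStringAsync (the http://www.example.com/ URL is only a placeholder; substitute any reachable address):

```csharp
using System;
using System.Net;
using System.Threading;

class Program
{
    static void Main()
    {
        using (WebClient client = new WebClient())
        {
            ManualResetEvent done = new ManualResetEvent(false);
            // The completed event fires when the download finishes or fails.
            client.DownloadStringCompleted += (sender, e) =>
            {
                if (e.Error != null)
                    Console.WriteLine("Error: " + e.Error.Message);
                else
                    Console.WriteLine("Downloaded {0} chars", e.Result.Length);
                done.Set();
            };
            // Returns void immediately; the download runs in the background.
            client.DownloadStringAsync(new Uri("http://www.example.com/"));
            Console.WriteLine("Download started...");
            done.WaitOne();
        }
    }
}
```

The ManualResetEvent is only there so the console program does not exit before the event fires; a GUI program would simply update the UI from the handler instead.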
Also, depending on the use of your program, it is sometimes better to put the WebClient code in a BackgroundWorker and access it synchronously on a separate thread. This can allow clearer code and logic for the calling code.
Tip: For simple quality analysis tools, it is best to avoid threading entirely, as it will likely cause bugs.
Dispose. The WebClient class in the .NET Framework holds onto some system resources which are required to access the network stack in Microsoft Windows. The behavior of the CLR will ensure these resources are eventually cleaned up.
However, if you manually call Dispose or use the using-statement, these resources are released at more predictable times. This can improve the performance of larger programs.
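A minimal sketch of the manual Dispose call, equivalent to the using-statement form in the earlier examples (the header value is made up):

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        WebClient client = new WebClient();
        try
        {
            client.Headers["User-Agent"] = "TestAgent/1.0"; // made-up value
            // ... use the client here ...
        }
        finally
        {
            // Equivalent to the using-statement: releases resources promptly.
            client.Dispose();
        }
        Console.WriteLine("Disposed");
    }
}
```

The using-statement compiles down to essentially this try/finally shape, which is why the two forms behave the same.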
Console program. This console program receives the target URL you want to download, and the local file you want to append to. If the local file is not found, it will be created. If the target URL is not found, an exception will be thrown and reported.
C# program that downloads web page and saves it

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        try
        {
            Console.WriteLine("*** Log Append Tool ***");
            Console.WriteLine("    Specify file to download, log file");
            Console.WriteLine("Downloading: {0}", args[0]);
            Console.WriteLine("Appending: {0}", args[1]);
            // Download url.
            using (WebClient client = new WebClient())
            {
                string value = client.DownloadString(args[0]);
                // Append to log file.
                File.AppendAllText(args[1],
                    string.Format("--- {0} ---\n", DateTime.Now) + value);
            }
        }
        catch (Exception ex)
        {
            // Report download or file errors.
            Console.WriteLine(ex);
        }
        finally
        {
            Console.WriteLine("[Done]");
            Console.ReadLine();
        }
    }
}

Program usage

1. Compile to EXE.
2. Make shortcut to the EXE.
3. Specify the target URL and the local file to append to. Such as:
   "http://test/index" "C:\test.txt"
This program can monitor how a specific text file on the Internet changes. For example, if your website exposes some statistics or debugging information at a certain URL, you can configure this program to download that data and log it.
Also: It is possible to use this program on a timer or invoke the program through other programs, with the Process.Start method.
Tip: You can write a console program that accesses a specific URL and then stores it in a log file. The program here is configurable.
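A possible sketch of invoking the tool from another program with Process.Start (LogAppendTool.exe is a hypothetical name for the compiled EXE; the arguments reuse the sample URL and file from the usage notes):

```csharp
using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // Start the logging tool as a separate process (hypothetical paths).
        ProcessStartInfo info = new ProcessStartInfo();
        info.FileName = "LogAppendTool.exe";
        info.Arguments = "\"http://test/index\" \"C:\\test.txt\"";
        try
        {
            using (Process process = Process.Start(info))
            {
                process.WaitForExit();
                Console.WriteLine("Exit code: {0}", process.ExitCode);
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("Could not start tool: " + ex.Message);
        }
    }
}
```

A Windows scheduled task pointing at either EXE achieves the timer-based variant mentioned above.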
Time downloads. This program implements a console application that allows you to time a certain web page at any URL. It downloads the web page a certain number of times. It then reports the total and average time required for downloading the page.
C# program that times web page downloads

using System;
using System.Diagnostics;
using System.Net;

class Program
{
    const int _max = 5;

    static void Main(string[] args)
    {
        try
        {
            // Get url.
            string url = args[0];
            // Report url.
            Console.ForegroundColor = ConsoleColor.White;
            Console.WriteLine("... PageTimeTest: times web pages");
            Console.ResetColor();
            Console.WriteLine("Testing: {0}", url);
            // Fetch page.
            using (WebClient client = new WebClient())
            {
                // Set gzip.
                client.Headers["Accept-Encoding"] = "gzip";
                // Download.
                // ... Do an initial run to prime the cache.
                byte[] data = client.DownloadData(url);
                // Start timing.
                Stopwatch stopwatch = Stopwatch.StartNew();
                // Iterate.
                for (int i = 0; i < Math.Min(100, _max); i++)
                {
                    data = client.DownloadData(url);
                }
                // Stop timing.
                stopwatch.Stop();
                // Report times.
                Console.WriteLine("Time required: {0} ms",
                    stopwatch.Elapsed.TotalMilliseconds);
                Console.WriteLine("Time per page: {0} ms",
                    stopwatch.Elapsed.TotalMilliseconds / _max);
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
        finally
        {
            Console.WriteLine("[Done]");
            Console.ReadLine();
        }
    }
}

Usage

Create a shortcut of the EXE of the program. Then specify the URL on the command-line in the shortcut.
In this example, we use a try-catch-finally block. The program begins in the try block. Here it reads the command-line argument and writes the parameters to the screen. It sets the Accept-Encoding HTTP header.
Then: It downloads the page up to 100 times. It averages the total milliseconds elapsed and prints this to the screen as well.
GZIP headers. Performance-oriented web sites use GZIP compression for transferring pages. This is one of the more important performance tasks. For this reason, the program only tests GZIP pages by setting the Accept-Encoding header.
Next, we show how the program works when benchmarking pages. For the example, we will use Google, because they have lots of bandwidth. This test run shows that the Google homepage was loaded in about 52 milliseconds on average.
Possible results

... PageTimeTest: times loads of web page over network
Testing: http://www.google.com/
Time required: 259.7351 ms
Time per page: 51.94702 ms
[Done]
Caution: This code has many limitations and does not adequately simulate the web browser environment. But it is helpful for benchmarking.
Summary. We used the WebClient class in the System.Net namespace. This class allows us to download web pages into strings and byte arrays. It is recommended for testing web sites or for developing programs that must fetch some external resources.