Download file in chunks in parallel in C#

Improved file download using chunks of file in parallel in C#

Downloading large files from your code may cause problems due to limitations in your network or system where your code is executing. For example, some systems limit the size of the file you can download through the network. This is a common case in highly controlled environments.

Another aspect is download speed. When you open the response stream in your code, you read the bytes in order. This means you can define byte ranges of the stream and read them in parallel, using multiple connections and the multiple cores in your machine to achieve a faster download. Download manager applications use a similar principle, with the addition of download resuming.

The following approach enables the power of parallel operations on multi-core machines and can be used as a base for download resume. This is only the base implementation which allows downloading files in chunks in parallel.

The approach consists of a few simple steps:

  • acquire the file size by making an HTTP request with the HEAD method
  • calculate the size of chunks based on desired number of parallel downloads
  • initiate download of each chunk in parallel and save to a separate file
  • merge all chunk files in a single final file
  • delete all temporary files
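
The second step, calculating the chunk ranges, can be sketched on its own as a small pure function. This is only an illustration under stated assumptions: the ChunkCalculator class and its Split method are hypothetical names that are not part of the article's code, and the last range simply absorbs whatever remainder integer division leaves behind.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the article's project.
public static class ChunkCalculator
{
    // Splits the bytes 0..totalLength-1 into inclusive (start, end) ranges.
    // The last range absorbs the remainder left by integer division.
    public static List<Tuple<long, long>> Split(long totalLength, int chunks)
    {
        if (totalLength <= 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
        if (chunks <= 0) throw new ArgumentOutOfRangeException(nameof(chunks));

        // Never create more chunks than there are bytes to download.
        if (chunks > totalLength) chunks = (int)totalLength;

        long chunkSize = totalLength / chunks;
        var ranges = new List<Tuple<long, long>>();
        for (int i = 0; i < chunks; i++)
        {
            long start = i * chunkSize;
            long end = (i == chunks - 1) ? totalLength - 1 : start + chunkSize - 1;
            ranges.Add(Tuple.Create(start, end));
        }
        return ranges;
    }
}
```

For example, ChunkCalculator.Split(10, 3) yields the ranges 0-2, 3-5 and 6-9, so every byte is covered exactly once and the ranges can be passed to the Range header as inclusive offsets.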

And now for some hands-on code. The first thing I decided to do was to keep the response stream ranges in a collection of model objects. I could have gone with a dictionary in this case, but a model class seemed like the more readable solution.

namespace Downloader.App
{
    internal class Range
    {
        public long Start { get; set; }
        public long End { get; set; }
    }
}

    

Before we switch to the logic, we need to declare a model for the result. The invoker of the download method is going to need a few pieces of information. I found the following properties useful, so I made them part of the download result model:

using System;

namespace Downloader.App
{
    public class DownloadResult
    {
        public long Size { get; set; }
        public String FilePath { get; set; }
        public TimeSpan TimeTaken { get; set; }
        public int ParallelDownloads { get; set; }
    }
}

    

And now to the main part. The following method accepts the download URL, the destination folder path, an optional number of parallel downloads, and a flag that controls SSL certificate validation when downloading from an HTTPS URL. Note that validation is skipped by default (validateSSL = false).

using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

namespace Downloader.App
{
    public static class Downloader
    {
        static Downloader()
        {
            ServicePointManager.Expect100Continue = false;
            ServicePointManager.DefaultConnectionLimit = 100;
            ServicePointManager.MaxServicePointIdleTime = 1000;
        }

        public static DownloadResult Download(String fileUrl, String destinationFolderPath, int numberOfParallelDownloads = 0, bool validateSSL = false)
        {
            if (!validateSSL)
            {
                ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };
            }

            Uri uri = new Uri(fileUrl);

            //Calculate destination path
            String destinationFilePath = Path.Combine(destinationFolderPath, uri.Segments.Last());

            DownloadResult result = new DownloadResult() { FilePath = destinationFilePath };

            //Handle number of parallel downloads
            if (numberOfParallelDownloads <= 0)
            {
                numberOfParallelDownloads = Environment.ProcessorCount;
            }

            #region Get file size
            WebRequest webRequest = HttpWebRequest.Create(fileUrl);
            webRequest.Method = "HEAD";
            long responseLength;
            using (WebResponse webResponse = webRequest.GetResponse())
            {
                responseLength = long.Parse(webResponse.Headers.Get("Content-Length"));
                result.Size = responseLength;
            }
            #endregion

            if (File.Exists(destinationFilePath))
            {
                File.Delete(destinationFilePath);
            }

            using (FileStream destinationStream = new FileStream(destinationFilePath, FileMode.Append))
            {
                ConcurrentDictionary<int, String> tempFilesDictionary = new ConcurrentDictionary<int, String>();

                #region Calculate ranges
                List<Range> readRanges = new List<Range>();
                for (int chunk = 0; chunk < numberOfParallelDownloads - 1; chunk++)
                {
                    var range = new Range()
                    {
                        Start = chunk * (responseLength / numberOfParallelDownloads),
                        End = ((chunk + 1) * (responseLength / numberOfParallelDownloads)) - 1
                    };
                    readRanges.Add(range);
                }


                readRanges.Add(new Range()
                {
                    Start = readRanges.Any() ? readRanges.Last().End + 1 : 0,
                    End = responseLength - 1
                });

                #endregion

                DateTime startTime = DateTime.Now;

                #region Parallel download

                // Parallel.For hands each iteration its own range index, so every temp file
                // is stored under the position of its range and the chunks merge in the right order.
                // (A shared counter incremented inside the loop is not thread safe and would
                // reflect completion order rather than range order.)
                Parallel.For(0, readRanges.Count, new ParallelOptions() { MaxDegreeOfParallelism = numberOfParallelDownloads }, index =>
                {
                    Range readRange = readRanges[index];
                    HttpWebRequest httpWebRequest = HttpWebRequest.Create(fileUrl) as HttpWebRequest;
                    httpWebRequest.Method = "GET";
                    httpWebRequest.AddRange(readRange.Start, readRange.End);
                    using (HttpWebResponse httpWebResponse = httpWebRequest.GetResponse() as HttpWebResponse)
                    {
                        String tempFilePath = Path.GetTempFileName();
                        using (var fileStream = new FileStream(tempFilePath, FileMode.Create, FileAccess.Write, FileShare.Write))
                        {
                            httpWebResponse.GetResponseStream().CopyTo(fileStream);
                            tempFilesDictionary.TryAdd(index, tempFilePath);
                        }
                    }
                });

                result.ParallelDownloads = readRanges.Count;

                #endregion

                result.TimeTaken = DateTime.Now.Subtract(startTime);

                #region Merge to single file
                foreach (var tempFile in tempFilesDictionary.OrderBy(b => b.Key))
                {
                    // Stream each chunk into the destination instead of loading it fully into memory
                    using (FileStream tempStream = File.OpenRead(tempFile.Value))
                    {
                        tempStream.CopyTo(destinationStream);
                    }
                    File.Delete(tempFile.Value);
                }
                #endregion


                return result;
            }


        }
    }
}
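
The method above assumes the server honors the Range header. Not every server does, and one that ignores it may return the whole file for every chunk request. As a minimal sketch, the response headers of the HEAD request could be inspected before the file is split into chunks. The RangeSupport class and SupportsByteRanges method are hypothetical names that are not part of the article's code, and a server may still support ranges without advertising them, so treat the result as a hint only.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration; not part of the article's project.
public static class RangeSupport
{
    // Returns true when the response headers advertise byte-range support
    // through an "Accept-Ranges: bytes" header.
    public static bool SupportsByteRanges(WebHeaderCollection headers)
    {
        if (headers == null) return false;

        string acceptRanges = headers.Get("Accept-Ranges");
        return acceptRanges != null
            && acceptRanges.IndexOf("bytes", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

When the check returns false, falling back to a single download (numberOfParallelDownloads = 1) is one cautious option.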

    
Note

The class and method are static, and they set properties of the ServicePointManager class, which is also static. Because those settings are shared by all requests in the application, this class is not thread safe.
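
One way to avoid the shared static settings is to keep them on a per-instance HTTP handler instead. The sketch below swaps in HttpClient, which is a different API from the HttpWebRequest used in this article, and it assumes a runtime where HttpClientHandler.ServerCertificateCustomValidationCallback is available (.NET Core, or .NET Framework 4.7.1 and later). RangeRequestFactory and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration; not part of the article's project.
public static class RangeRequestFactory
{
    // Builds a GET request for the inclusive byte range [start, end].
    public static HttpRequestMessage Create(string fileUrl, long start, long end)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, fileUrl);
        request.Headers.Range = new RangeHeaderValue(start, end);
        return request;
    }

    // Creates a client whose certificate policy is scoped to this instance,
    // so other requests in the application are not affected.
    public static HttpClient CreateClient(bool validateSsl)
    {
        var handler = new HttpClientHandler();
        if (!validateSsl)
        {
            // Accepts any certificate; suitable for test environments only.
            handler.ServerCertificateCustomValidationCallback =
                (message, certificate, chain, errors) => true;
        }
        return new HttpClient(handler);
    }
}
```

A request created with RangeRequestFactory.Create(url, 0, 99) carries the header Range: bytes=0-99, the same range that HttpWebRequest.AddRange(0, 99) would ask for.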

Now let's give the code a test run. Although I did not test with a large file, the difference in download speed is already visible.

using System;

namespace Downloader.App
{
    class Program
    {
        static void Main(string[] args)
        {
            var result = Downloader.Download("http://dejanstojanovic.net/media/215073/optimize-jpg.zip", @"c:\temp\", 2);
            
            Console.WriteLine($"Location: {result.FilePath}");
            Console.WriteLine($"Size: {result.Size}bytes");
            // TotalMilliseconds reports the whole duration; Milliseconds is only the 0-999 component
            Console.WriteLine($"Time taken: {result.TimeTaken.TotalMilliseconds:F0}ms");
            Console.WriteLine($"Parallel: {result.ParallelDownloads}");

            Console.ReadKey();
        }
    }
}
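
A note on printing the elapsed time: TimeSpan.Milliseconds holds only the millisecond component of the value (0 to 999), while TimeSpan.TotalMilliseconds reports the whole duration. For downloads shorter than a second the two match, but for anything longer they diverge. The small sketch below shows the difference; TimeSpanDemo and Describe are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's project.
public static class TimeSpanDemo
{
    // Formats both the millisecond component and the total duration of a TimeSpan.
    public static string Describe(TimeSpan elapsed)
    {
        return $"component: {elapsed.Milliseconds}ms, total: {elapsed.TotalMilliseconds:F0}ms";
    }
}
```

For a duration of 1486 milliseconds, TimeSpanDemo.Describe(TimeSpan.FromMilliseconds(1486)) returns "component: 486ms, total: 1486ms".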

    

This is the console output when running with a single download:

Location: c:\temp\optimize-jpg.zip
Size: 307440bytes
Time taken: 486ms
Parallel: 1

Running with 4 parallel downloads gives the following results:

Location: c:\temp\optimize-jpg.zip
Size: 307440bytes
Time taken: 279ms
Parallel: 4

The absolute difference is not big because the file is too small, but comparing the two values shows that the download time dropped from 486ms to 279ms, a reduction of about 43% (roughly 1.7 times faster).

You can find the complete code as a ready-to-debug project in the download section of this article page.

Disclaimer

Purpose of the code contained in snippets or available for download in this article is solely for learning and demo purposes. Author will not be held responsible for any failure or damages caused due to any other usage.


About the author

DEJAN STOJANOVIC

Dejan is a passionate Software Architect/Developer. He is highly experienced in the .NET programming platform, including ASP.NET MVC and WebApi. He likes working on new technologies and exciting, challenging projects.
