Reading large text files with streams in C#

You can improve read speed by using a BufferedStream, like this:

using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BufferedStream bs = new BufferedStream(fs))
using (StreamReader sr = new StreamReader(bs))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Process the line here (parse it, search it, etc.).
    }
}

March 2013 UPDATE

I recently wrote code for reading and processing (searching for text in) 1 GB-ish text files (much larger than the files involved here) and achieved a significant performance gain by using a producer/consumer pattern. The producer task read in lines of text using the BufferedStream and handed them off to a separate consumer task that did the searching.

I used this as an opportunity to learn TPL Dataflow, which is very well suited for quickly coding this pattern.
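
For reference, here is a minimal sketch of that producer/consumer pattern using TPL Dataflow (the System.Threading.Tasks.Dataflow NuGet package). It is not the original code: the bounded capacity, the search logic, and the method name are illustrative placeholders.

using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class LineSearcher
{
    public static async Task SearchFileAsync(string path, string searchTerm)
    {
        // Consumer: searches each line on a separate task.
        var consumer = new ActionBlock<string>(
            line =>
            {
                if (line.Contains(searchTerm))
                    Console.WriteLine(line);
            },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 10000 });

        // Producer: reads lines via the BufferedStream and hands them off.
        using (var fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var bs = new BufferedStream(fs))
        using (var sr = new StreamReader(bs))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // SendAsync respects BoundedCapacity, so the producer waits
                // if the consumer falls behind instead of filling memory.
                await consumer.SendAsync(line);
            }
        }

        consumer.Complete();
        await consumer.Completion;
    }
}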

Why BufferedStream is faster

A buffer is a block of bytes in memory used to cache data, thereby reducing the number of calls to the operating system. Buffers improve read and write performance. A buffer can be used for either reading or writing, but never both simultaneously. The Read and Write methods of BufferedStream automatically maintain the buffer.
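
If the default buffer does not fit your workload, BufferedStream also accepts an explicit buffer size in its constructor. The 64 KB value below is only an illustrative guess; measure against your own files.

using (var fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var bs = new BufferedStream(fs, 64 * 1024)) // buffer size is an example value
using (var sr = new StreamReader(bs))
{
    // ... read lines as in the example above ...
}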

December 2014 UPDATE: Your Mileage May Vary

Based on the comments, FileStream already does its own buffering internally, which should make the extra BufferedStream redundant. When this answer was first written, I measured a significant performance boost by adding a BufferedStream; at that time I was targeting .NET 3.x on a 32-bit platform. Today, targeting .NET 4.5 on a 64-bit platform, I do not see any improvement.
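
One way to compare for yourself: FileStream exposes its internal buffer size through a constructor overload, so you can benchmark that against the wrapped version. The sketch below uses an illustrative 64 KB buffer.

// FileStream buffers internally; this sets its buffer size directly,
// with no BufferedStream wrapper. Compare timings on your own setup.
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 64 * 1024))
using (var sr = new StreamReader(fs))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Process the line.
    }
}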

Related

I came across a case where streaming a large, generated CSV file to the Response stream from an ASP.NET MVC action was very slow. Adding a BufferedStream improved performance by 100x in this instance. For more, see Unbuffered Output Very Slow.
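
The shape of that fix looked roughly like the sketch below. It is a hypothetical reconstruction, not the original code: the controller, action, and WriteCsvRows helper are placeholders.

using System.IO;
using System.Web.Mvc;

public class ReportController : Controller
{
    public void ExportCsv()
    {
        Response.ContentType = "text/csv";

        // Wrapping the response stream batches the many small writes made
        // while generating the CSV into fewer, larger writes.
        using (var buffered = new BufferedStream(Response.OutputStream))
        using (var writer = new StreamWriter(buffered))
        {
            WriteCsvRows(writer); // placeholder for the row-generating code
        }
    }

    private static void WriteCsvRows(TextWriter writer)
    {
        // Placeholder: stream rows here instead of building the file in memory first.
        writer.WriteLine("Id,Name");
    }
}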
