Read large files in Java

To save memory, do not store or duplicate the data in memory unnecessarily (i.e. do not assign it to variables that outlive the loop). Process each piece of input and write the output as soon as it comes in.
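As a minimal sketch of that idea, the following reads a file line by line and keeps only two running totals, so at any moment only the current line is held in memory. The class name `StreamingLineStats` and the method `scan` are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class StreamingLineStats {

    // Returns {lineCount, longestLineLength}. Nothing is collected in a list;
    // each line becomes garbage as soon as the next one is read.
    static long[] scan(File file) throws IOException {
        long lines = 0;
        long longest = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            for (String line; (line = reader.readLine()) != null;) {
                lines++;
                longest = Math.max(longest, line.length());
            }
        }
        return new long[] {lines, longest};
    }
}
```

Memory use here depends on the longest single line, not on the size of the file.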

It really doesn't matter whether you use BufferedReader or not. It will not cost significantly more memory, as some implicitly seem to suggest. At most, the choice makes a few percent difference in performance. The same applies to using NIO: it will only improve scalability, not memory use, and it only becomes interesting when you have hundreds of threads working on the same file.
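For comparison, here is a rough sketch of the same kind of streaming read done with NIO, using a `FileChannel` and one fixed-size `ByteBuffer`. It counts newline bytes instead of decoding lines, and the class name `ChannelLineCounter` and the 8 KB buffer size are assumptions made for the example. The point is that memory use stays bounded by the buffer here too, just as with the buffered reader.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
class ChannelLineCounter {

    // Counts '\n' bytes while reusing a single 8 KB buffer for the whole file.
    static long countNewlines(Path file) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long newlines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {
                buffer.flip();                 // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    if (buffer.get() == '\n') {
                        newlines++;
                    }
                }
                buffer.clear();                // reuse the same buffer for the next chunk
            }
        }
        return newlines;
    }
}
```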

Just loop through the file, write every line to the current output file as soon as you read it, and count the lines. Once the count reaches 100, switch to the next output file, and so on.

Kickoff example:

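import java.io.*;
import java.nio.charset.StandardCharsets;

public class FileSplitter {
    public static void main(String[] args) throws IOException {
        int maxlines = 100;
        BufferedWriter writer = null;

        // try-with-resources closes the reader; the writer changes per chunk, so it is closed by hand.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("/bigfile.txt"), StandardCharsets.UTF_8))) {
            int count = 0;
            for (String line; (line = reader.readLine()) != null;) {
                if (count++ % maxlines == 0) {
                    // Every maxlines lines: close the current output file and open the next one.
                    if (writer != null) writer.close();
                    writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("/smallfile" + (count / maxlines) + ".txt"), StandardCharsets.UTF_8));
                }
                writer.write(line);
                writer.newLine();
            }
        } finally {
            if (writer != null) writer.close();
        }
    }
}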
