Fastest way to sum integers in text file

Your main bottleneck will be file IO. Parsing and adding up the numbers should not contribute to the algorithm as that can be done in a separate thread while the File I/O is waiting for the disk.

Some years ago I researched how to read from files in the fastest possible way and came across some excellent advice – which I implemented as a Scan routine as below:

// 4k buffer size.
static final int SIZE = 4 * 1024;
static byte[] buffer = new byte[SIZE];

// Fastest because a FileInputStream has an associated channel.
private static void ScanDataFile(Hunter p, FileInputStream f) throws FileNotFoundException, IOException {
    // Use a mapped and buffered stream for best speed.
    // See: http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly
    final FileChannel ch = f.getChannel();
    long red = 0L;
    do {
        final long read = Math.min(Integer.MAX_VALUE, ch.size() - red);
        final MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, red, read);
        int nGet;
        while (mb.hasRemaining() && p.ok()) {
            nGet = Math.min(mb.remaining(), SIZE);
            mb.get(buffer, 0, nGet);
            for (int i = 0; i < nGet && p.ok(); i++) {
                p.check(buffer[i]);
                //size += 1;
            }
        }
        red += read;
    } while (red < ch.size() && p.ok());
    // Finish off.
    p.close();
    ch.close();
    f.close();
}

You may wish to adjust this technique before testing it for speed as it is making use of an interfaced object called a Hunter to hunt for the data.

As you can see the advice was derived in 2008 and there have been many enhancements to Java since then so this may not provide an improvement.

Added

I have not tested this but this should fit into your tests and use the same technique:

class Summer {

    long sum = 0;
    long val = 0;

    public void add(byte b) {
        if (b >= '0' && b <= '9') {
            val = (val * 10) + (b - '0');
        } else {
            sum += val;
            val = 0;
        }
    }

    public long getSum() {
        return sum + val;
    }
}

private long sumMapped() throws IOException {
    Summer sum = new Summer();
    FileInputStream f = new FileInputStream(file);
    final FileChannel ch = f.getChannel();
    long red = 0L;
    do {
        final long read = Math.min(Integer.MAX_VALUE, ch.size() - red);
        final MappedByteBuffer mb = ch.map(FileChannel.MapMode.READ_ONLY, red, read);
        int nGet;
        while (mb.hasRemaining()) {
            nGet = Math.min(mb.remaining(), SIZE);
            mb.get(buffer, 0, nGet);
            for (int i = 0; i < nGet; i++) {
                sum.add(buffer[i]);
            }
        }
        red += read;
    } while (red < ch.size());
    // Finish off.
    ch.close();
    f.close();
    return sum.getSum();
}

Leave a Comment