Java – reading, manipulating and writing WAV files

I read WAV files via an AudioInputStream. The following snippet from the Java Sound Tutorials works well.

int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
  AudioInputStream audioInputStream = 
    AudioSystem.getAudioInputStream(fileIn);
  int bytesPerFrame = 
    audioInputStream.getFormat().getFrameSize();
    if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
    // some audio formats may have unspecified frame size
    // in that case we may read any amount of bytes
    bytesPerFrame = 1;
  } 
  // Set an arbitrary buffer size of 1024 frames.
  int numBytes = 1024 * bytesPerFrame; 
  byte[] audioBytes = new byte[numBytes];
  try {
    int numBytesRead = 0;
    int numFramesRead = 0;
    // Try to read numBytes bytes from the file.
    while ((numBytesRead = 
      audioInputStream.read(audioBytes)) != -1) {
      // Calculate the number of frames actually read.
      numFramesRead = numBytesRead / bytesPerFrame;
      totalFramesRead += numFramesRead;
      // Here, do something useful with the audio data that's 
      // now in the audioBytes array...
    }
  } catch (Exception ex) { 
    // Handle the error...
  }
} catch (Exception e) {
  // Handle the error...
}

To write a WAV, I found that quite tricky. On the surface it seems like a circular problem, the command that writes relies on an AudioInputStream as a parameter.

But how do you write bytes to an AudioInputStream? Shouldn’t there be an AudioOutputStream?

What I found was that one can define an object that has access to the raw audio byte data to implement TargetDataLine.

This requires a lot of methods be implemented, but most can stay in dummy form as they are not required for writing data to a file. The key method to implement is read(byte[] buffer, int bufferoffset, int numberofbytestoread).

As this method will probably be called multiple times, there should also be an instance variable that indicates how far through the data one has progressed, and update that as part of the above read method.

When you have implemented this method, then your object can be used in to create a new AudioInputStream which in turn can be used with:

AudioSystem.write(yourAudioInputStream, AudioFileFormat.WAV, yourFileDestination)

As a reminder, an AudioInputStream can be created with a TargetDataLine as a source.

As to the direct manipulating the data, I have had good success acting on the data in the buffer in the innermost loop of the snippet example above, audioBytes.

While you are in that inner loop, you can convert the bytes to integers or floats and multiply a volume value (ranging from 0.0 to 1.0) and then convert them back to little endian bytes.

I believe since you have access to a series of samples in that buffer you can also engage various forms of DSP filtering algorithms at that stage. In my experience I have found that it is better to do volume changes directly on data in this buffer because then you can make the smallest possible increment: one delta per sample, minimizing the chance of clicks due to volume-induced discontinuities.

I find the “control lines” for volume provided by Java tend to situations where the jumps in volume will cause clicks, and I believe this is because the deltas are only implemented at the granularity of a single buffer read (often in the range of one change per 1024 samples) rather than dividing the change into smaller pieces and adding them one per sample. But I’m not privy to how the Volume Controls were implemented, so please take that conjecture with a grain of salt.

All and all, Java.Sound has been a real headache to figure out. I fault the Tutorial for not including an explicit example of writing a file directly from bytes. I fault the Tutorial for burying the best example of Play a File coding in the “How to Convert…” section. However, there’s a LOT of valuable FREE info in that tutorial.


EDIT: 12/13/17

I’ve since used the following code to write audio from a PCM file in my own projects. Instead of implementing TargetDataLine one can extend InputStream and use that as a parameter to the AudioSystem.write method.

public class StereoPcmInputStream extends InputStream
{
    private float[] dataFrames;
    private int framesCounter;
    private int cursor;
    private int[] pcmOut = new int[2];
    private int[] frameBytes = new int[4];
    private int idx;
    
    private int framesToRead;

    public void setDataFrames(float[] dataFrames)
    {
        this.dataFrames = dataFrames;
        framesToRead = dataFrames.length / 2;
    }
    
    @Override
    public int read() throws IOException
    {
        while(available() > 0)
        {
            idx &= 3; 
            if (idx == 0) // set up next frame's worth of data
            {
                framesCounter++; // count elapsing frames

                // scale to 16 bits
                pcmOut[0] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
                pcmOut[1] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
            
                // output as unsigned bytes, in range [0..255]
                frameBytes[0] = (char)pcmOut[0];
                frameBytes[1] = (char)(pcmOut[0] >> 8);
                frameBytes[2] = (char)pcmOut[1];
                frameBytes[3] = (char)(pcmOut[1] >> 8);
            
            }
            return frameBytes[idx++]; 
        }
        return -1;
    }

    @Override 
    public int available()
    {
        // NOTE: not concurrency safe.
        // 1st half of sum: there are 4 reads available per frame to be read
        // 2nd half of sum: the # of bytes of the current frame that remain to be read
        return 4 * ((framesToRead - 1) - framesCounter) 
                + (4 - (idx % 4));
    }    

    @Override
    public void reset()
    {
        cursor = 0;
        framesCounter = 0;
        idx = 0;
    }

    @Override
    public void close()
    {
        System.out.println(
            "StereoPcmInputStream stopped after reading frames:" 
                + framesCounter);
    }
}

The source data to be exported here is in the form of stereo floats ranging from -1 to 1. The format of the resulting stream is 16-bit, stereo, little-endian.

I omitted skip and markSupported methods for my particular application. But it shouldn’t be difficult to add them if they are needed.

Leave a Comment