Java Synthesizer, Part 2 – Sound Engine

Ok, I was going to hold off on writing this section on the sound engine until later, when I had it fully working. However, it’s now close enough to justify at least talking about it. I was also going to go into a long discussion of why I made various choices, but the code is long enough as is, and there’s no point in prolonging things.

There are three parts of a software synth, in my opinion, that are needed right at the beginning – a working oscillator (to make something to hear for verifying that everything else works), the ADSR (to show that things are working in real-time) and the part that delivers the sound waveform to the speakers.  Arguably, the ADSR can be left to later, but there’s no point to having the speaker section working if there’s no tone to play, and vice versa. So, the speaker section and the oscillator kind of went hand-in-hand.

If we look at Dick Baldwin’s Java audio tutorials, specifically the one on synthesized sounds, we’ll see that most of what we need is right there. The method playOrFileData() sets up a source data line to the speakers; listenThread() takes a pre-built waveform and sends it out through the source data line, and the synGen class is what pre-builds the waveform to be played. The problem with this arrangement, for my purposes, is that it makes a 2-second long waveform and then plays it.  It’s not real-time and it’s not particularly responsive.  (I don’t want to press a keyboard key and then wait 2 seconds to be able to press the next key.)

For convenience’s sake, I’m going to call the following code the “sound engine”. It will consist of a listener that runs permanently in the background waiting for something to play, a timer to call the method for making 10-20ms sound slices, the slice method itself, a method for starting the listener, and a method to stop it when the program exits.

The listener start method needs to determine the audio format we’re using (sample rate, number of channels (1 or 2), sample size (8 or 16 bits) and byte order), and then use that to build a DataLine.Info, which in turn is used to obtain a source data line.  The last step is to launch the listener.  We need to create some supporting variables as well. (Again, I apologize for WordPress’ stripping out of formatting. A link to the formatted textfile with the code fragment can be found at the end of this entry.)

    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import javax.sound.sampled.*;

    final private static int SAMPLERATE_16K = 16000;

    float             sampleRate       = SAMPLERATE_16K;
    int               sampleSizeInBits = 16;       // Allowable 8,16
    int               channels         = 1;        // Allowable 1,2
    boolean           signed           = true;     // Allowable true,false
    boolean           bigEndian        = true;     // Allowable true,false
    volatile boolean  stopS2SListener  = false;    // volatile: set on exit, read by the listener thread
    volatile boolean  gotData          = false;    // volatile: set by the slice maker, cleared by the listener thread

    AudioFormat       audioFormat;                 // Supporting variables used below
    AudioInputStream  audioInputStream;
    SourceDataLine    speakerLine;

    private void startSend2SpeakersListener() {
        try {
            InputStream baStream       = new ByteArrayInputStream(audioData);
            audioFormat                = new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
            audioInputStream           = new AudioInputStream(baStream, audioFormat, audioData.length/audioFormat.getFrameSize());
            DataLine.Info dataLineInfo = new DataLine.Info(SourceDataLine.class, audioFormat);
            speakerLine                = (SourceDataLine) AudioSystem.getLine(dataLineInfo);
            new sendToSpeakers().start();                                           // Start listener
        } catch(Exception ex) {
            jTextArea1.append("startPlayDataListener Exception: \n" + ex + "\n");
        }
    }

In Baldwin’s tutorial, he creates a 64K buffer array, which, because it has a fixed length, ensures that the listener will take 2 seconds to dump and play back the sound. Since I’m using a 16,000 Hz sampling rate with 16-bit mono samples (32,000 bytes per second), and running between 10 and 20ms of data at a time, the array can be 320 or 640 bytes in size. This part needs tweaking, but the idea is that while one slice of waveform is being dumped to the speakers, I want to be building up the next slice to have it ready when needed, to avoid data run-outs (the speaker line running out of data before the next slice arrives, which results in clicking noises). So, I’m using a double buffer and alternating between the two halves. I’m trying a staggered approach, where the first slice of waveform is 10ms, and then all subsequent slices are 20ms each. It works in principle, but I’m kind of guessing at the buffer sizes and I really need to sit down and make sure I’m doing this just right.
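The 320 and 640 byte figures fall out of the audio format. Here is a minimal sketch of that arithmetic; the class and method names (SliceMath, bytesPerSlice) are made up for illustration and are not part of the synth code.

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper for illustration only; not part of the synth itself.
final class SliceMath {

    // Number of bytes needed to hold `millis` milliseconds of audio in `format`.
    static int bytesPerSlice(AudioFormat format, int millis) {
        int frames = Math.round(format.getFrameRate() * millis / 1000f);
        return frames * format.getFrameSize();   // frame size = bytes per sample * channels
    }

    public static void main(String[] args) {
        // Same settings as the post: 16,000 Hz, 16 bits, mono, signed, big-endian.
        AudioFormat format = new AudioFormat(16000f, 16, 1, true, true);
        System.out.println(bytesPerSlice(format, 10));    // 320
        System.out.println(bytesPerSlice(format, 20));    // 640
        System.out.println(bytesPerSlice(format, 2000));  // 64000, i.e. a 2-second buffer
    }
}
```

Computing the size from the format rather than hard-coding 640 would also make a later move to a higher sample rate less error-prone.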

First, start the listener.

    public adsrUI() {
        startSend2SpeakersListener();    // Get the play data listener running in the background
    }

    // Listener that does the actual work of sending data to the speakers.

    class sendToSpeakers extends Thread {
        byte playBuffer[] = new byte[640];  // Was 16384

        public void run() {
            int cnt;

            // playBuffer[] is fixed length, meaning that there's no real clue that this listener has reached the end when sending data to the
            // audioInputStream buffer. To get around this issue, I'll use the gotData flag to show when new data is ready for buffering. Then,
            // I need to mark the first byte of the audioInputStream buffer in order to have .reset() return to it at the end of buffering.
            // One important point to remember is that the listener is running non-stop in the background and will keep trying to play whatever
            // was last in the buffer if we just reset it to the beginning using the .reset() method. So, I need to make gotData false as well.

            try {
                // Assumption: the open/start calls are not shown in the original fragment, but a
                // SourceDataLine has to be opened and started before write() can play anything.
                speakerLine.open(audioFormat);
                speakerLine.start();

                while(! stopS2SListener) {                                   // Keep running until program exit
                    if(gotData) {
                        audioInputStream.mark(0);                            // Mark the beginning of the input buffer

                        while((cnt = audioInputStream.read(playBuffer, 0, playBuffer.length)) != -1) {  // Read to the buffer until end of new data
                            if(cnt > 0) {
                                speakerLine.write(playBuffer, 0, cnt);       // Send buffered data to speakers.
                            }
                        }
                        gotData = false;                                     // Housekeeping.
                        audioInputStream.reset();                            // Reset playback buffer pointer to mark point.
                    }
                }
            } catch (Exception ex) {
                jTextArea1.append("sendToSpeakers Exception: \n" + ex + "\n");
            }
        }
    }
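The mark/reset trick in the listener can be tried out without any sound hardware. The following is a rough sketch under my own assumptions (the MarkResetDemo class and readSlice method are hypothetical names): it reads an AudioInputStream to its end, rewinds it, and shows that the second pass picks up whatever has since been copied into the backing array, because ByteArrayInputStream reads the array in place rather than copying it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;

// Hypothetical demo class; nothing here talks to the speakers.
final class MarkResetDemo {

    // Reads one slice from `in` into `out`, then rewinds `in` to where it started.
    // Returns the number of bytes read.
    static int readSlice(AudioInputStream in, byte[] out) {
        try {
            in.mark(0);                       // ByteArrayInputStream ignores the read limit
            int total = 0;
            while (total < out.length) {
                int cnt = in.read(out, total, out.length - total);
                if (cnt <= 0) break;          // -1 at end of data, 0 if less than a frame fits
                total += cnt;
            }
            in.reset();                       // Back to the mark, ready for the next slice
            return total;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] audioData = new byte[8];       // Tiny stand-in for the 640-byte slice array
        AudioFormat format = new AudioFormat(16000f, 16, 1, true, true);
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(audioData), format,
                audioData.length / format.getFrameSize());
        byte[] playBuffer = new byte[8];

        audioData[0] = 1;                     // "First slice"
        readSlice(in, playBuffer);
        System.out.println(playBuffer[0]);    // 1

        audioData[0] = 2;                     // "Second slice" copied into the same array
        readSlice(in, playBuffer);
        System.out.println(playBuffer[0]);    // 2
    }
}
```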

I talked about timers in the K-Gater blog series. I’m just setting up a simple timer and using a counter to determine how many milliseconds have passed. Then I call the slice generator at 10ms or 20ms intervals.

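    int          tmr                = -1;
    int          maxTmr             = 10;
    timerExec    ttd                = new timerExec();                        // Executable timer function
    Timer        masterTimer        = new Timer();                            // Master timer object

    class timerExec extends TimerTask {                        // The heart of the timer
        public void run() {
            if(tmr > -1) {
                tmr++;                                         // Assumption: count one tick per call (not shown in the original fragment)
                if(tmr >= maxTmr) {
                    tmr = 0;
                    addBuffer();                               // Assumption: this is where the slice generator gets called
                }
            }
        }
    }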
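To make the counting logic easier to follow on its own, here is a small sketch that pulls it out into a separate class. SliceClock and tick() are hypothetical names, and unlike the fragment above this version switches from 10 to 20 inside the counter itself. The main method assumes the timer fires once per millisecond; java.util.Timer does not guarantee that precision, so treat the scheduling part as approximate.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical class showing the tick-counting idea in isolation.
final class SliceClock {
    private int tmr = 0;        // Ticks counted since the last slice
    private int maxTmr = 10;    // The first slice is due after 10 ticks

    // Call once per timer tick; returns true when a new slice should be built.
    boolean tick() {
        tmr++;
        if (tmr >= maxTmr) {
            tmr = 0;
            maxTmr = 20;        // Every slice after the first is 20 ticks long
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        SliceClock clock = new SliceClock();
        Timer masterTimer = new Timer(true);             // Daemon timer thread
        masterTimer.scheduleAtFixedRate(new TimerTask() {
            public void run() {
                if (clock.tick()) {
                    System.out.println("slice due");     // A real synth would build a slice here
                }
            }
        }, 0, 1);                                        // Start now, repeat every 1 ms
        Thread.sleep(100);                               // Let it run briefly
        masterTimer.cancel();
    }
}
```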

And here’s where I do the actual work of making the waveform to be played, slice by slice. I’ll be talking about this section a lot in the future. But, basically, the idea is as I mentioned above. I want the double buffers to be staggered, so while one is dumping data to the speakers, the other is getting a new waveform slice. So, the first buffer takes 10 ms of data, and then they’re all 20 ms long.  I use the variable gotData to tell the listener to check the buffers for something to play (and the listener sets it to false at the end of dumping each slice). Otherwise, the only real magic is in the for-loop, and I need to wait until I talk about the oscillator to get into any details.

    byte []     audioData       = new byte[640];   // was 16000
    byte []     audioData1      = new byte[640];
    byte []     audioData2      = new byte[640];
    byte [][]   audioBuffer     = {audioData1, audioData2};
    int         audioBufferPtr  = 0;

    private void addBuffer() {
        ByteBuffer  byteBuffer;
        ShortBuffer shortBuffer;
        byteBuffer       = ByteBuffer.wrap(audioBuffer[audioBufferPtr]);
        shortBuffer      = byteBuffer.asShortBuffer();

        // Put code for making new waveforms here.

        for(int pCnt = 0; pCnt < 320; pCnt++) {
            // The 0 (silence) is only a stand-in so that the fragment compiles.
            shortBuffer.put( (short)( /* Put code for making new waveform here. */ 0 ) );
        }

        System.arraycopy(audioBuffer[audioBufferPtr], 0, audioData, 0, 640);   // Hand the finished slice to the listener
        audioBufferPtr = (audioBufferPtr == 0) ? 1 : 0;                        // Swap to the other buffer
        gotData = true;
        if(maxTmr == 10) maxTmr = 20;
        if(adsr1.mode == -1) tmr = -1;
    }
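Until the oscillator gets its own write-up, here is a stand-in showing what could go inside that for-loop. This is only a sketch under my own assumptions, not the synth's actual oscillator: the SineSlice class, its fill method, and the fixed 16,000 Hz rate are all hypothetical. It writes a plain sine wave as 16-bit big-endian samples and carries the phase over from one slice to the next, so consecutive slices of a steady tone join up without a jump.

```java
import java.nio.ByteBuffer;
import java.nio.ShortBuffer;

// Hypothetical stand-in oscillator for illustration only.
final class SineSlice {
    static final float SAMPLE_RATE = 16000f;   // Assumed to match the audio format in use
    private double phase = 0.0;                // Radians, carried across slices

    // Fills `slice` with 16-bit signed mono samples of a sine wave.
    // `amplitude` is expected to be between 0.0 and 1.0.
    void fill(byte[] slice, double frequencyHz, double amplitude) {
        ShortBuffer out = ByteBuffer.wrap(slice).asShortBuffer();   // ByteBuffer is big-endian by default
        double step = 2.0 * Math.PI * frequencyHz / SAMPLE_RATE;    // Phase advance per sample
        while (out.hasRemaining()) {
            out.put((short) Math.round(amplitude * Short.MAX_VALUE * Math.sin(phase)));
            phase += step;
            if (phase >= 2.0 * Math.PI) {
                phase -= 2.0 * Math.PI;        // Keep the phase from growing without bound
            }
        }
    }

    public static void main(String[] args) {
        SineSlice osc = new SineSlice();
        byte[] slice = new byte[640];          // 320 samples, i.e. 20 ms at 16,000 Hz
        osc.fill(slice, 440.0, 0.5);
        ShortBuffer samples = ByteBuffer.wrap(slice).asShortBuffer();
        for (int i = 0; i < 5; i++) {
            System.out.println(samples.get(i));   // First few samples of the slice
        }
    }
}
```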

The last part is for shutting down the listener when we exit the app.

    private void stopSend2SpeakersListener() {
        speakerLine.drain();                                                      // Ensure the source buffer is empty.
        speakerLine.stop();                                                       // Stop and close the source buffer.
        stopS2SListener = true;                                                   // Cause the playback listener to exit.
    }

As mentioned at the beginning, I get clicks when the waveform amplitude or frequency changes too quickly. I think this is due to the low sampling rate. At 16,000 samples/second, I can only produce waves up to 8 kHz anyway, and since the human ear can sense up to at least 12 kHz, I’m planning eventually to go to a 22,050 or 44,100 rate.
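The 8 kHz ceiling is just half the sample rate (the Nyquist limit). A quick sketch of the same arithmetic for the three rates mentioned, using a made-up helper class:

```java
// Hypothetical helper for illustration only.
final class Nyquist {

    // Highest frequency a given sample rate can represent: half the rate.
    static float maxFrequencyHz(float sampleRate) {
        return sampleRate / 2f;
    }

    public static void main(String[] args) {
        float[] rates = {16000f, 22050f, 44100f};
        for (float rate : rates) {
            System.out.println(rate + " samples/second -> " + maxFrequencyHz(rate) + " Hz");
        }
    }
}
```

By this arithmetic, 22,050 samples/second tops out at 11,025 Hz, so of the two candidates only the 44,100 rate clears the 12 kHz mark.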

Raw java code fragment here.
