Playing audio is about reproducing a former analog signal as well as possible. The analog signal is sampled at a fixed interval, which is why the values are usually called samples. Read more about this topic here.

The sampling interval is usually specified as the sampling frequency. A sampling frequency of 22.05 kHz means there are 22050 samples per second. Another key attribute is the sampling depth, which is specified in bits.

I use a sampling frequency of 22.05 kHz with 16 bit samples. Because the DAC only supports 12 bits, I simply ignore the lowest 4 bits of the audio material.
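
The two numbers above can be checked with a few lines of C++. This is only a minimal sketch and not code from the player: the names kSampleRate, kBytesPerSample, bytesPerSecond and toDacValue are chosen here for illustration. It assumes 22050 samples per second, as used in the timer code further down, and unsigned 16 bit samples, which matches the later use of 0x0800 as the middle position.

```cpp
#include <cstdint>

// Assumed constants for illustration.
constexpr uint32_t kSampleRate = 22050;  // samples per second
constexpr uint32_t kBytesPerSample = 2;  // 16 bit samples

// Number of bytes that have to be read from the SD card every second.
constexpr uint32_t bytesPerSecond()
{
    return kSampleRate * kBytesPerSample;
}

// Reduce a 16 bit sample to 12 bits by dropping the lowest 4 bits.
constexpr uint16_t toDacValue(uint16_t sample)
{
    return static_cast<uint16_t>(sample >> 4);
}
```

With these assumptions the SD card has to deliver 44100 bytes per second, and the 16 bit middle value 0x8000 maps to the 12 bit middle value 0x0800.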

The Interface

class AudioPlayer
{
public:
    bool initialize();
    bool play(uint32_t startBlock, uint32_t sampleCount);
    bool play(const char *fileName);
};

The method initialize() initialises the SD card and the DAC. It also reads the directory from the SD card.

There are two play() methods. The first can be used if you already know the start block of the audio data and the number of samples. The second searches for the file in the directory and calculates the number of samples from the file size. Both methods block while the audio is playing and return true on success, or false if there was an error.

/// The global instance of the audio player.
///
extern AudioPlayer audioPlayer;

You should use the given global instance of the player.
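
A typical call sequence could look like the following sketch. The AudioPlayer class in this block is only a stand-in with the same interface, so the example compiles without the hardware. The function playGreeting() and the file name "HELLO.SND" are hypothetical.

```cpp
#include <cstdint>

// Stand-in with the same interface as the real class. It only remembers
// whether initialize() was called; the real class talks to SD card and DAC.
class AudioPlayer
{
public:
    bool initialize() { _ready = true; return true; }
    bool play(uint32_t /*startBlock*/, uint32_t /*sampleCount*/) { return _ready; }
    bool play(const char *fileName) { return _ready && fileName != nullptr; }
private:
    bool _ready = false;
};

// The global instance, as declared in the interface.
AudioPlayer audioPlayer;

// Initialise the player once, then play one file and report success.
bool playGreeting()
{
    if (!audioPlayer.initialize()) {
        return false; // SD card or directory could not be read.
    }
    // Blocks until the file was played. "HELLO.SND" is a hypothetical name.
    return audioPlayer.play("HELLO.SND");
}
```

Because both play() methods block, the return value can be checked directly after the call.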

The Implementation

The audio player combines the SD card class and the DAC class to play the audio data. In the initialisation method, the DAC is initialised and a call of shutdown() sets its initial state. In a second step, the SD card is initialised and the directory is read from the SD card.

bool AudioPlayer::initialize()
{
    // Initialize the DAC
    dacPort.initialize();
    
    // Shutdown the output
    dacPort.shutdown();
    
    // Initialize the SDCard    
    SDCard::Status status = sdCard.initialize();    
    if (status != SDCard::StatusReady) {
#ifdef AUDIOPLAYER_DEBUG
        Serial.println(String(F("SD Card Init Failure, error="))+String(sdCard.error()));
        Serial.flush();
#endif
        return false;
    }

#ifdef AUDIOPLAYER_DEBUG
    Serial.println(String(F("SD Card Init Success.")));
    Serial.flush();
#endif

    // Maximum speed
    SPI.beginTransaction(SPISettings(32000000, MSBFIRST, SPI_MODE0));

    // Read the directory.
    status = sdCard.readDirectory();
    if (status != SDCard::StatusReady) {
#ifdef AUDIOPLAYER_DEBUG
        Serial.println(String(F("SD Card read directory failure, error="))+String(sdCard.error()));
        Serial.flush();
#endif
        // Close the SPI transaction in the error case as well.
        SPI.endTransaction();
        return false;
    }

    SPI.endTransaction();

    return true;
}

Playing Audio

The first play method takes a file name. It searches for this file name in the directory and, on success, calls the second play method with the correct start block and sample count.

bool AudioPlayer::play(const char *fileName)
{
    const SDCard::DirectoryEntry *entry = sdCard.findFile(fileName);
    if (entry != 0) {
        return play(entry->startBlock, entry->fileSize/2);
    } else {
        return false;
    }
}
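
The division by two converts the file size in bytes into a number of 16 bit samples. The following sketch shows this arithmetic and the resulting playing time. It assumes that the file contains only raw samples without any header, which is what the division implies. The function names and the default rate of 22050 Hz are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Two bytes per 16 bit sample; an odd trailing byte is ignored.
inline uint32_t sampleCountForFileSize(uint32_t fileSize)
{
    return fileSize / 2;
}

// Playing time in milliseconds for a given sample rate.
inline uint32_t durationInMs(uint32_t sampleCount, uint32_t sampleRate = 22050)
{
    assert(sampleRate != 0);
    // 64 bit intermediate value to avoid an overflow for long files.
    return static_cast<uint32_t>(
        static_cast<uint64_t>(sampleCount) * 1000 / sampleRate);
}
```

A file with 88200 bytes therefore holds 44100 samples, which play for two seconds at 22050 Hz.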

The second play method starts by beginning the SPI transaction and calling startMultiRead() to start the read process. This call does not block: it is simply repeated in a loop until it succeeds or fails.

bool AudioPlayer::play(uint32_t startBlock, uint32_t sampleCount)
{
    SDCard::Status status;

    // Maximum speed
    SPI.beginTransaction(SPISettings(32000000, MSBFIRST, SPI_MODE0));
    
    // Wait until we can start a read.
    while((status = sdCard.startMultiRead(startBlock)) == SDCard::StatusWait) {
        delayMicroseconds(1);
    }
    
    // Check for any errors.
    if (status != SDCard::StatusReady) {
#ifdef AUDIOPLAYER_DEBUG
        Serial.println(String(F("Start Read Failure, error="))+String(sdCard.error()));
        Serial.flush();
#endif
        SPI.endTransaction();
        return false;
    }

In a second step a temporary sample buffer is allocated on the stack, with a size of 256 samples. This is enough for a wide range of SD cards. The critical phase is always the waiting time between the blocks. The sample buffer has to be big enough to get through this time. If you use a really fast card, you can try to make this buffer smaller to free memory for other things.
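
To get a feeling for the trade-off, the memory use and the time covered by the buffer can be calculated. This is a small sketch with function names chosen for illustration, assuming 16 bit samples and a sample rate of 22050 Hz.

```cpp
#include <cassert>
#include <cstdint>

// Memory used by the sample buffer in bytes (2 bytes per sample).
inline uint32_t bufferBytes(uint32_t samples)
{
    return samples * 2;
}

// Time in microseconds the player can bridge with a full buffer.
inline uint32_t bufferTimeInUs(uint32_t samples, uint32_t sampleRate)
{
    assert(sampleRate != 0);
    return static_cast<uint32_t>(
        static_cast<uint64_t>(samples) * 1000000 / sampleRate);
}
```

With 256 samples the buffer takes 512 bytes, a quarter of the 2 KB of SRAM of the ATmega328, and bridges about 11.6 milliseconds. Halving the buffer to 128 samples frees 256 bytes, but leaves only about 5.8 milliseconds for the card to deliver the next block.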

Fast reading is started by calling the startFastRead() method and the created sample buffer is filled with initial data from the SD card.

    // Create a buffer with a given number of samples.
    const uint16_t bufferSize = 0x100; // the size of the buffer.
    uint16_t sampleBuffer[bufferSize]; // The buffer (created on the stack).

    // Fill the initial audio buffer.
    sdCard.startFastRead();
    for (uint16_t sampleIndex = 0; sampleIndex < bufferSize; ) {
        uint8_t* const bufferPointer =
            reinterpret_cast<uint8_t*>(&sampleBuffer[sampleIndex]);
        status = sdCard.readFast4(bufferPointer);
        if (status == SDCard::StatusReady) {
            sampleIndex += 2;
        } else if (status == SDCard::StatusError) {
            SPI.endTransaction();
            return false;
        }
    }

Hardware timer one is used to get very precise timing. The timer runs without pre-scaling in the phase and frequency correct mode, with register ICR1 configured as TOP. In this mode the counter counts up from 0 to the value in ICR1 and then back down to 0, so one full cycle takes twice as many clocks as the TOP value. This is why the calculation below uses half of the clock speed, which is 8 MHz for the Arduino Uno. Each time the counter reaches 0, the overflow flag is set and the cycle starts over. Have a look at the data sheet of the ATmega328 for details.

    // Initialize the Timer
    uint8_t oldSREG = SREG;             
    cli();
    TCCR1A = 0;
    // no pre-scaling, use ICR1 as TOP
    TCCR1B = _BV(CS10)|_BV(WGM13);
    // Set the TOP value.
    uint16_t timerTop = (F_CPU / 2 / 22050); // number of clocks for 22.05 kHz
    ICR1 = timerTop;
    TIMSK1 = 0; // no interrupts from timer
    SREG = oldSREG;
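
Because the TOP value is calculated with an integer division, the resulting sample rate is not exactly 22050 Hz. The following sketch repeats the calculation on a PC. It assumes a 16 MHz clock, as on the Arduino Uno, and one overflow every 2 * TOP clocks. The function names are chosen for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// TOP value as calculated in the code above (integer division truncates).
inline uint16_t timerTopFor(uint32_t cpuHz, uint32_t sampleRate)
{
    assert(sampleRate != 0);
    return static_cast<uint16_t>(cpuHz / 2 / sampleRate);
}

// Resulting overflow rate in Hz, assuming one overflow every 2 * TOP clocks.
inline double actualRateFor(uint32_t cpuHz, uint16_t top)
{
    assert(top != 0);
    return static_cast<double>(cpuHz) / (2.0 * top);
}
```

For 16 MHz and 22050 Hz the TOP value is 362, which gives an actual rate of roughly 22099 Hz. The audio is played about 0.2 percent too fast, which should be hard to notice.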

Next I define some variables I need to play the samples. The currentSample variable is an unsigned 32 bit integer which keeps track of the sample currently being played. It is used for two purposes: to detect the end of the audio data, and to get the read position in the audio buffer.

The audio buffer is implemented as a ring buffer: if the read position reaches the end of the buffer, it starts over at the beginning. The buffer has exactly 0x100 samples, so if I mask currentSample with 0xFF, I always get the correct read position. I do this using the constant sampleBufferMask. It looks more complicated in the source than in the resulting machine code, where simply the lowest byte of the currentSample variable is used and no time is lost with masking at all.

To keep track of the “fill level” of the buffer, there is the variable bufferedSamples. It is initialised with 0x100, because we filled the whole buffer with samples before. Each time a sample is played, this value is decremented by one. Each time four bytes are read, it is incremented by two (2 samples = 4 bytes).

This variable also controls the flow of the loop. If the fill level reaches 0, no new sample is sent to the DAC until data arrives again. If the buffer holds enough data (buffer size - 4 samples or more), no data is read from the SD card.

The dacValue variable is used to prepare the 12 bit value for the DAC. As the last step before the main loop starts, the DAC is set to the value 0x0800, which is the middle position. This is necessary because the loop starts by pushing the previously prepared value to the output, so an initial value has to be in place.

    // Variables to play the sound.
    uint32_t currentSample = 0; // The absolute sample position
    // The mask to get the sample buffer from the current sample value.
    const uint32_t sampleBufferMask = 0x000000ff;
    // The number of buffered samples. (start at 100%)
    uint16_t bufferedSamples = 0x100;
    // The DAC value, used for fade out.
    uint16_t dacValue;
    
    // Enable the DAC in "middle" position. (TODO: fade in).
    dacPort.setValue(0x0800);
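
The masking trick only works because the buffer size is a power of two. The following sketch shows the idea in isolation. The constant and function names are chosen for this illustration and are not part of the player.

```cpp
#include <cstdint>

constexpr uint16_t kBufferSize = 0x100;
constexpr uint32_t kSampleBufferMask = kBufferSize - 1; // 0xFF

// The mask only replaces the modulo for power-of-two sizes.
static_assert((kBufferSize & (kBufferSize - 1)) == 0,
              "buffer size must be a power of two");

// Read position in the ring buffer for an absolute sample position.
constexpr uint16_t readPosition(uint32_t currentSample)
{
    return static_cast<uint16_t>(currentSample & kSampleBufferMask);
}
```

Sample 255 is read from the last slot of the buffer, sample 256 wraps around to slot 0, and in general the result is the same as currentSample % 256.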

The main loop starts with for (;;) which will repeat forever, until break is called.

    // Main loop, until sample count reached.
    for (;;) {

It starts by waiting for the overflow flag of timer one. The check is very fast, so the precision is close to the resolution of the timer period, which is 0.125 microseconds. When the timer overflows, the latch of the DAC is pulled low, which pushes the previous value to the output of the DAC. Next the overflow flag is reset (by writing a 1 to the flag).

        // 1. Wait for the timer, and push the DAC register.
        // This is done at the beginning to play the samples as close to the
        // sample frequency as possible.
        while ((TIFR1 & _BV(TOV1)) == 0) {
        }
        dacPort.pushValue(); // Set the DAC output.
        // Reset the timer flag. Flags are cleared by writing a 1, so a plain
        // assignment clears only TOV1 and leaves the other flags untouched.
        TIFR1 = _BV(TOV1);

Next we check if the buffer contains more than 0 buffered samples. If it does, we read the next sample, shift it by 4 bits to the right to convert it to the required 12 bits and send it to the DAC as the prepared next value.

We decrement the number of buffered samples (--bufferedSamples), increment the sample counter (++currentSample) and check if we reached the end of the audio data. If we did, we break out of the loop. The sample written last is never pushed to the output, but this should not be a big problem.

        // 2. Read the next sample and write it to the DAC.
        if (bufferedSamples > 0) { // Check if we have buffered samples.
            const uint16_t sample = sampleBuffer[(currentSample & sampleBufferMask)];
            dacValue = (sample >> 4);
            dacPort.setValue(dacValue);
            // Move the current sample pointer.
            --bufferedSamples;
            // 3. Increase the sample counter and check for the end.
            if (++currentSample > sampleCount) {
                break;
            }
        }

Next we calculate the current read position and derive the current write position from it. I use const variables here, which the compiler will most likely keep in registers.

If there is enough space in the buffer to read four bytes, readFast4 is called to try reading more data from the SD card. If there is data to read, the bufferedSamples variable is incremented by 2 samples (= 4 bytes).

        // 4. Refill the sample buffer if necessary and possible
        const uint16_t currentReadPos = (currentSample & sampleBufferMask);
        const uint16_t currentWritePos = (currentReadPos + bufferedSamples) & sampleBufferMask;
        // Only write if there is space, and never at the end of the buffer.
        if (bufferedSamples < (bufferSize - 4)) {
            // try to read
            uint8_t* const writePointer =
                reinterpret_cast<uint8_t*>(&sampleBuffer[currentWritePos]);
            status = sdCard.readFast4(writePointer);
            if (status == SDCard::StatusError) {
                goto readError;
            } else if (status != SDCard::StatusWait) {
                bufferedSamples += 2; // (4 bytes)
            }
        }
    }
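
The interaction between playing and refilling can be tried out on a PC. The following sketch is a simplified simulation and not the player code: the SD card is replaced by a vector, every loop pass plays one sample, and every refill delivers two samples at once. The function name simulatePlayLoop is chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Plays all samples from `source` through a 256 sample ring buffer and
// returns the samples in the order they were "played".
std::vector<uint16_t> simulatePlayLoop(const std::vector<uint16_t> &source)
{
    const uint16_t bufferSize = 0x100;
    const uint32_t mask = bufferSize - 1;
    uint16_t buffer[bufferSize] = {};
    std::vector<uint16_t> played;
    std::size_t sourcePos = 0;

    // Fill the initial buffer, as done before the main loop.
    uint16_t bufferedSamples = 0;
    while (bufferedSamples < bufferSize && sourcePos < source.size()) {
        buffer[bufferedSamples++] = source[sourcePos++];
    }

    uint32_t currentSample = 0;
    while (currentSample < source.size()) {
        // Play one sample per pass, if there is one.
        if (bufferedSamples > 0) {
            played.push_back(buffer[currentSample & mask]);
            --bufferedSamples;
            ++currentSample;
        }
        // Derive the write position from the read position and fill level.
        const uint16_t readPos = static_cast<uint16_t>(currentSample & mask);
        const uint16_t writePos =
            static_cast<uint16_t>((readPos + bufferedSamples) & mask);
        // Refill two samples (4 bytes) if there is enough space.
        if (bufferedSamples < bufferSize - 4) {
            for (int i = 0; i < 2 && sourcePos < source.size(); ++i) {
                buffer[(writePos + i) & mask] = source[sourcePos++];
                ++bufferedSamples;
            }
        }
        assert(bufferedSamples <= bufferSize);
    }
    return played;
}
```

If the read and write positions are calculated correctly, the returned samples are identical to the input, even if the input is much longer than the buffer.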

After the loop, the timer is stopped and the DAC is shut down. The SPI transaction is closed and the method returns true for success. The error case stops the timer and shuts down the DAC as well.

    // Keep it in the middle position.
    dacPort.setValue(0x800);
    dacPort.pushValue();
    
    // Stop the timer.
    TCCR1B &= ~(_BV(CS10)|_BV(CS11)|_BV(CS12));          

    // Stop reading from the SD Card.
    sdCard.stopRead();

    // Shutdown the output
    dacPort.shutdown();

    // End SPI transaction.
    SPI.endTransaction();

    return true; // success

readError:
    // Stop the timer.
    TCCR1B &= ~(_BV(CS10)|_BV(CS11)|_BV(CS12));          

    // Shutdown the output
    dacPort.shutdown();

    // End SPI transaction.
    SPI.endTransaction();

    return false; // error
}

Conclusion

The audio player is simple to understand and works well for the limited resources. There is an audible crack at the beginning and end, because the DAC output is 0V when it is shut down, while the “middle” level during playback is around 2V. Here I should either implement a soft fade in/out in software, or find a hardware solution where the voltage level is changed or the cracks are suppressed.
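
One possible software approach is a linear ramp from 0 to the middle position before the first sample, and the same ramp in reverse after the last one. The following is only a sketch of the ramp calculation and is not part of the player. The function name fadeValue and the number of steps are assumptions, and whether such a ramp really removes the crack depends on how the DAC output behaves when it is enabled and shut down.

```cpp
#include <cassert>
#include <cstdint>

// Value for step `step` of a linear ramp from 0 to `target` in `steps` steps.
// Steps at or beyond the end of the ramp return the target itself.
inline uint16_t fadeValue(uint16_t step, uint16_t steps, uint16_t target = 0x0800)
{
    assert(target <= 0x0FFF); // The DAC only supports 12 bits.
    if (step >= steps) {
        return target;
    }
    return static_cast<uint16_t>(static_cast<uint32_t>(target) * step / steps);
}
```

With 256 steps the ramp takes about 11.6 milliseconds at 22050 Hz. For the fade out the steps would be used in reverse order.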

Continue reading: Handling of the Loop