Learning Core Audio: A Hands-On Guide to Audio Programming for Mac and iOS

Author: Chris Adamson, Kevin Avila


by anonymous   2017-08-20

Not sure how much of an answer this is, but it has too much text and too many links for a comment, and hopefully it will help (maybe guide you to your answer).

First off, I know from my current project that adjusting the sample rate will affect the speed of the sound, so you can try playing with those settings. That said, 44.1 kHz is what I see in most default implementations, including the Apple example SpeakHere. I would spend some time comparing your code to that example, because there are quite a few differences, such as checking before enqueueing.
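
To make the sample-rate point concrete, here is a minimal C sketch of the arithmetic. The function names are my own for illustration and are not part of SpeakHere or Core Audio; it only assumes plain linear PCM, where one frame is played per tick of the sample clock.

```c
#include <assert.h>

/* Seconds it takes to play `frames` PCM frames at `sample_rate` Hz. */
double playback_seconds(unsigned long frames, double sample_rate)
{
    assert(sample_rate > 0.0);
    return (double)frames / sample_rate;
}

/* Speed factor you hear when audio recorded at `recorded_rate` is
 * played back at `playback_rate` (1.0 = normal, 0.5 = half speed). */
double speed_factor(double recorded_rate, double playback_rate)
{
    assert(recorded_rate > 0.0);
    return playback_rate / recorded_rate;
}
```

For example, speed_factor(44100.0, 22050.0) gives 0.5: the same frames take twice as long to play, so the sound is slower and lower in pitch.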

Also check out this posting: https://stackoverflow.com/a/4299665/530933. It talks about how you need to know the audio format, specifically how many bytes there are in a frame, and how to cast the buffer appropriately.
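
As a rough illustration of the "bytes in a frame" point, here is a small C sketch. It assumes interleaved linear PCM with a whole number of bytes per sample; the helper names are hypothetical and not taken from the linked answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes in one interleaved PCM frame: one sample for every channel. */
size_t bytes_per_frame(unsigned channels, unsigned bits_per_channel)
{
    assert(bits_per_channel % 8 == 0);
    return (size_t)channels * (bits_per_channel / 8);
}

/* Number of whole frames that fit in a buffer of `byte_count` bytes. */
size_t frames_in_buffer(size_t byte_count, unsigned channels,
                        unsigned bits_per_channel)
{
    size_t bpf = bytes_per_frame(channels, bits_per_channel);
    assert(bpf > 0);
    return byte_count / bpf;
}

/* Read one sample from a 16-bit interleaved buffer. The void pointer is
 * cast to the real sample type before it is indexed. */
int16_t sample_at(const void *buffer, size_t frame, unsigned channel,
                  unsigned channels)
{
    const int16_t *samples = (const int16_t *)buffer;
    return samples[frame * channels + channel];
}
```

With 16-bit stereo, bytes_per_frame(2, 16) is 4, so a 4096-byte buffer holds 1024 frames. Indexing the same buffer as bytes or as 32-bit values would read the wrong samples.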

Also, good luck. I have posted quite a few questions here, on the Apple forums, and on the iOS forum (not the official one), with very few responses or help. To get where I am today (audio recording and streaming in ulaw) I ended up having to open an Apple Dev Support ticket, something I never knew existed before tackling the audio. One good thing is that if you have a valid dev account you get 2 incidents for free! Core Audio is not fun. Documentation is sparse, and besides SpeakHere there are not many examples. The two things I did find useful are the framework headers, which have some good info, and this book (Learning Core Audio). Unfortunately I have only just started the book; otherwise I might be able to help you further.

You can also check some of my own postings, which I have tried to answer to the best of my abilities. This is my main audio question, which I have spent a lot of time on to compile all the pertinent links and code.

using AQRecorder (audioqueue recorder example) in an objective c class

trying to use AVAssetWriter for ulaw audio (2)

by anonymous   2017-08-20

I will answer your second question first - don't wait for the app to crash, you can stop pulling audio from the track by checking the number of samples that are available in the CMSampleBufferRef you are reading; for example (this code will also be included in the 2nd half of my answer):

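    CMSampleBufferRef sample = [readerOutput copyNextSampleBuffer];

    // copyNextSampleBuffer returns NULL once the track is exhausted, so test
    // the pointer before asking it for a sample count.
    if (!sample || (CMSampleBufferGetNumSamples(sample) == 0)) {
        // handle end of audio track here
        if (sample) CFRelease(sample);
        return;
    }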

Regarding your first question, it depends on the type of audio you are grabbing: it could be in either PCM (non-compressed) or VBR (compressed) format. I'm not even going to bother addressing the PCM part, because it's simply not smart to send uncompressed audio data from one phone to another over the network. It's unnecessarily expensive and will clog your network bandwidth. So we're left with VBR data. For that you've got to send the contents of the AudioBuffer and the AudioStreamPacketDescription you pulled from the sample. But then again, it's probably best to explain what I'm saying with code:

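    [broadcastLock lock];

    CMSampleBufferRef sample = [readerOutput copyNextSampleBuffer];

    // copyNextSampleBuffer returns NULL at the end of the track, so test the
    // pointer before asking it for a sample count.
    if (!sample || (CMSampleBufferGetNumSamples(sample) == 0)) {
        Packet *packet = [Packet packetWithType:PacketTypeEndOfSong];
        packet.sendReliably = NO;
        [self sendPacketToAllClients:packet];
        [sampleBroadcastTimer invalidate];
        if (sample) CFRelease(sample);
        [broadcastLock unlock];   // don't leave the lock held on the early exit
        return;
    }

    NSLog(@"SERVER: going through sample loop");
    Boolean isBufferDataReady = CMSampleBufferDataIsReady(sample);

    CMBlockBufferRef CMBuffer = CMSampleBufferGetDataBuffer(sample);
    AudioBufferList audioBufferList;

    // CheckError (defined elsewhere in the project) reports the message when
    // the call returns a non-zero OSStatus.
    CheckError(CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
                   sample,
                   NULL,
                   &audioBufferList,
                   sizeof(audioBufferList),
                   NULL,
                   NULL,
                   kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
                   &CMBuffer),
               "could not read sample data");

    const AudioStreamPacketDescription *inPacketDescriptions;
    size_t packetDescriptionsSizeOut;
    size_t inNumberPackets;

    CheckError(CMSampleBufferGetAudioStreamPacketDescriptionsPtr(
                   sample,
                   &inPacketDescriptions,
                   &packetDescriptionsSizeOut),
               "could not read sample packet descriptions");

    inNumberPackets = packetDescriptionsSizeOut / sizeof(AudioStreamPacketDescription);

    AudioBuffer audioBuffer = audioBufferList.mBuffers[0];

    for (size_t i = 0; i < inNumberPackets; ++i)
    {
        NSLog(@"going through packets loop");
        SInt64 dataOffset = inPacketDescriptions[i].mStartOffset;
        UInt32 dataSize   = inPacketDescriptions[i].mDataByteSize;

        size_t packetSpaceRemaining = MAX_PACKET_SIZE - packetBytesFilled - packetDescriptionsBytesFilled;
        size_t packetDescrSpaceRemaining = MAX_PACKET_DESCRIPTIONS_SIZE - packetDescriptionsBytesFilled;

        // if this audio packet won't fit, ship what we have accumulated so far
        if ((packetSpaceRemaining < (dataSize + AUDIO_STREAM_PACK_DESC_SIZE)) ||
            (packetDescrSpaceRemaining < AUDIO_STREAM_PACK_DESC_SIZE))
        {
            if (![self encapsulateAndShipPacket:packet packetDescriptions:packetDescriptions packetID:assetOnAirID])
                break;
        }

        // mData is a void *, so cast it before doing pointer arithmetic
        memcpy((char *)packet + packetBytesFilled,
               (const char *)audioBuffer.mData + dataOffset, dataSize);

        char *descr = [self encapsulatePacketDescription:inPacketDescriptions[i]
                                            mStartOffset:packetBytesFilled];
        memcpy((char *)packetDescriptions + packetDescriptionsBytesFilled,
               descr, AUDIO_STREAM_PACK_DESC_SIZE);
        free(descr);   // encapsulatePacketDescription: returns malloc'd memory

        packetBytesFilled += dataSize;
        packetDescriptionsBytesFilled += AUDIO_STREAM_PACK_DESC_SIZE;

        // if this is the last packet, then ship it
        if (i == (inNumberPackets - 1)) {
            NSLog(@"woooah! this is the last packet (%zu).. so we will ship it!", i);
            if (![self encapsulateAndShipPacket:packet packetDescriptions:packetDescriptions packetID:assetOnAirID])
                break;
        }
    }

    CFRelease(CMBuffer);   // balances the retain from ...WithRetainedBlockBuffer
    CFRelease(sample);     // copyNextSampleBuffer returned an owned reference
    [broadcastLock unlock];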

Some methods that I've used in the above code are methods you don't need to worry about, such as adding headers to each packet (I was creating my own protocol, you can create your own). For more info see this tutorial.

- (BOOL)encapsulateAndShipPacket:(void *)source
              packetDescriptions:(void *)packetDescriptions
                        packetID:(NSString *)packetID
{
    // package Packet
    char * headerPacket = (char *)malloc(MAX_PACKET_SIZE + AUDIO_BUFFER_PACKET_HEADER_SIZE + packetDescriptionsBytesFilled);

    // header layout: magic, packet number, packet type, filler, payload sizes, packet ID
    appendInt32(headerPacket, 'SNAP', 0);
    appendInt32(headerPacket, packetNumber, 4);
    appendInt16(headerPacket, PacketTypeAudioBuffer, 8);
    // we use this so that we can add int32s later
    UInt16 filler = 0x00;
    appendInt16(headerPacket, filler, 10);
    appendInt32(headerPacket, packetBytesFilled, 12);
    appendInt32(headerPacket, packetDescriptionsBytesFilled, 16);
    appendUTF8String(headerPacket, [packetID UTF8String], 20);

    // audio data follows the header, then the packet descriptions
    int offset = AUDIO_BUFFER_PACKET_HEADER_SIZE;
    memcpy((char *)(headerPacket + offset), (char *)source, packetBytesFilled);

    offset += packetBytesFilled;

    memcpy((char *)(headerPacket + offset), (char *)packetDescriptions, packetDescriptionsBytesFilled);

    NSData *completePacket = [NSData dataWithBytes:headerPacket length: AUDIO_BUFFER_PACKET_HEADER_SIZE + packetBytesFilled + packetDescriptionsBytesFilled];
    free(headerPacket);   // dataWithBytes: copies, so the scratch buffer can go

    NSLog(@"sending packet number %lu to all peers", (unsigned long)packetNumber);
    NSError *error = nil;
    if (![_session sendDataToAllPeers:completePacket withDataMode:GKSendDataReliable error:&error])   {
        NSLog(@"Error sending data to clients: %@", error);
    }

    Packet *packet = [Packet packetWithData:completePacket];

    // reset packet
    packetBytesFilled = 0;
    packetDescriptionsBytesFilled = 0;

    //  free(packet); free(packetDescriptions);
    return YES;
}


- (char *)encapsulatePacketDescription:(AudioStreamPacketDescription)inPacketDescription
                          mStartOffset:(SInt64)mStartOffset
{
    // the wire format is 32 bits smaller than the struct because mStartOffset
    // is sent as a 32 bit integer, not 64
    char * packetDescription = (char *)malloc(AUDIO_STREAM_PACK_DESC_SIZE);

    appendInt32(packetDescription, (UInt32)mStartOffset, 0);
    appendInt32(packetDescription, inPacketDescription.mVariableFramesInPacket, 4);
    appendInt32(packetDescription, inPacketDescription.mDataByteSize, 8);

    // the caller owns the returned buffer and must free() it
    return packetDescription;
}
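
The appendInt32 helper is not shown in the answer. As a rough sketch of what such a helper and the 12-byte packet description layout could look like, here is a plain C version. Everything in it is an assumption made for illustration: the names, the big-endian byte order, and the packet_desc struct that stands in for AudioStreamPacketDescription so the code compiles without Core Audio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACK_DESC_SIZE 12   /* three 32-bit fields, as in the method above */

/* Hypothetical stand-in for appendInt32: big-endian write at `offset`. */
void append_int32(uint8_t *buf, uint32_t value, size_t offset)
{
    buf[offset]     = (uint8_t)(value >> 24);
    buf[offset + 1] = (uint8_t)(value >> 16);
    buf[offset + 2] = (uint8_t)(value >> 8);
    buf[offset + 3] = (uint8_t)value;
}

/* Matching big-endian read for the receiving side. */
uint32_t read_int32(const uint8_t *buf, size_t offset)
{
    return ((uint32_t)buf[offset] << 24) | ((uint32_t)buf[offset + 1] << 16) |
           ((uint32_t)buf[offset + 2] << 8) | (uint32_t)buf[offset + 3];
}

/* Plain C mirror of the three AudioStreamPacketDescription fields. */
typedef struct {
    int64_t  start_offset;
    uint32_t variable_frames;
    uint32_t data_byte_size;
} packet_desc;

/* Serialize a description; the 64-bit offset is narrowed to 32 bits. */
void pack_desc(uint8_t out[PACK_DESC_SIZE], const packet_desc *d)
{
    assert(d->start_offset >= 0 && d->start_offset <= UINT32_MAX);
    append_int32(out, (uint32_t)d->start_offset, 0);
    append_int32(out, d->variable_frames, 4);
    append_int32(out, d->data_byte_size, 8);
}

/* Rebuild a description from its 12-byte wire form. */
packet_desc unpack_desc(const uint8_t in[PACK_DESC_SIZE])
{
    packet_desc d;
    d.start_offset    = (int64_t)read_int32(in, 0);
    d.variable_frames = read_int32(in, 4);
    d.data_byte_size  = read_int32(in, 8);
    return d;
}
```

Writing the bytes one at a time fixes the byte order on the wire, so both ends agree no matter what the host CPU uses. Narrowing the offset is only safe because it is an offset into a single network packet, which is far smaller than 4 GB.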

receiving data:

- (void)receiveData:(NSData *)data fromPeer:(NSString *)peerID inSession:(GKSession *)session context:(void *)context
{
    Packet *packet = [Packet packetWithData:data];
    if (packet == nil)
    {
        NSLog(@"Invalid packet: %@", data);
        return;
    }

    Player *player = [self playerWithPeerID:peerID];

    if (player != nil)
    {
        player.receivedResponse = YES;  // this is the new bit
    } else {
        // assign to the outer variable so the new player is used below
        player = [[Player alloc] init];
        player.peerID = peerID;
        [_players setObject:player forKey:player.peerID];
    }

    if (self.isServer)
    {
        [Logger Log:@"SERVER: we just received packet"];
        [self serverReceivedPacket:packet fromPlayer:player];
    }
    else
    {
        [self clientReceivedPacket:packet];
    }
}
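
On the receiving end, the fixed part of the header written by encapsulateAndShipPacket has to be read back before the audio bytes can be located. The following C sketch shows one way that could look. It is not the answer's code: the names, the big-endian byte order, and passing the total header size as a parameter (AUDIO_BUFFER_PACKET_HEADER_SIZE is never given a value in the answer) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_FIELDS_SIZE 20   /* bytes 0..19; the packet ID string starts at 20 */

typedef struct {
    uint32_t packet_number;
    uint16_t packet_type;
    uint32_t audio_bytes;   /* size of the audio payload */
    uint32_t desc_bytes;    /* size of the packet descriptions */
} audio_header;

uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

uint16_t be16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Returns 1 and fills *out when `data` holds a well-formed audio packet,
 * 0 otherwise. The payload starts at data + header_size. */
int parse_audio_header(const uint8_t *data, size_t len, size_t header_size,
                       audio_header *out)
{
    assert(header_size >= FIXED_FIELDS_SIZE);
    if (len < header_size) return 0;
    if (be32(data) != 0x534E4150u) return 0;      /* 'S' 'N' 'A' 'P' */

    audio_header h;
    h.packet_number = be32(data + 4);
    h.packet_type   = be16(data + 8);             /* bytes 10..11 are filler */
    h.audio_bytes   = be32(data + 12);
    h.desc_bytes    = be32(data + 16);

    /* Never trust sizes from the network: both blocks must fit in the data. */
    size_t remaining = len - header_size;
    if (h.audio_bytes > remaining) return 0;
    if (h.desc_bytes > remaining - h.audio_bytes) return 0;

    *out = h;
    return 1;
}
```

The length checks matter because the sizes come from another device. Without them, a truncated or corrupt packet would make the later memcpy calls read past the end of the received data.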


  1. There are a lot of networking details that I didn't cover here (e.g. in the receiving data part I used a lot of custom-made objects without expanding on their definitions). I didn't because explaining all of that is beyond the scope of a single answer on SO. However, you can follow Ray Wenderlich's excellent tutorial. He takes his time explaining networking principles, and the architecture I use above is taken almost verbatim from him. HOWEVER, THERE IS A CATCH (see the next point).

  2. Depending on your project, GKSession may not be suitable (especially if your project is real time, or if you need more than 2-3 devices to connect simultaneously), as it has a lot of limitations. You will have to dig deeper and use Bonjour directly instead. The book iPhone Cool Projects has a nice quick chapter with a good example of using Bonjour services. It's not as scary as it sounds (and the Apple documentation is kind of overbearing on that subject).

  3. I noticed you use GCD for your multithreading. Again, if you are dealing with real-time audio, then you don't want to use advanced frameworks that do the heavy lifting for you (GCD is one of them). For more on this subject read this excellent article. Also read the prolonged discussion between me and justin in the comments of this answer.

  4. You may want to check out MTAudioProcessingTap introduced in iOS 6. It can potentially save you some hassle while dealing with AVAssets. I didn't test this stuff though. It came out after I did all my work.

  5. Last but not least, you may want to check out the Learning Core Audio book. It's a widely acknowledged reference on this subject. I remember being as stuck as you were at the point you asked the question. Core Audio is heavy duty and it takes time to sink in. SO will only give you pointers. You will have to take your time to absorb the material yourself, and then you will figure out how things work. Good luck!

by anonymous   2017-08-20

You can find an example that uses this ring buffer if you download the example code of the book Learning Core Audio here (under the downloads tab). Jump to the chapter 8 example in a folder called CH08_AUGraphInput.

However, if you are simply reading audio from a file, then using an (extra) ring buffer seems like overkill. A ring buffer comes in handy when you have real-time (or near real-time) input and output. Read chapter 8 in that book for a more detailed explanation of when a ring buffer is necessary. Note that the example in chapter 8 is about playing audio immediately after recording it with a mic, which isn't what you want to do.
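
If you have not met a ring buffer before, here is a minimal C sketch of the idea. It is not the ring buffer from the book's example code, just a tiny byte FIFO with names of my own choosing. It is single-threaded only; a real-time version shared between an input and an output callback would need atomics or a lock-free design on top of this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 8   /* tiny on purpose so wrap-around is easy to see */

typedef struct {
    uint8_t data[RING_CAPACITY];
    size_t  head;    /* index of the next byte to write */
    size_t  tail;    /* index of the next byte to read */
    size_t  count;   /* bytes currently stored */
} ring_buffer;

void ring_init(ring_buffer *rb)
{
    assert(rb != NULL);
    rb->head = rb->tail = rb->count = 0;
}

/* Store up to n bytes; returns how many fit before the buffer was full. */
size_t ring_write(ring_buffer *rb, const uint8_t *src, size_t n)
{
    assert(rb != NULL && src != NULL);
    size_t written = 0;
    while (written < n && rb->count < RING_CAPACITY) {
        rb->data[rb->head] = src[written++];
        rb->head = (rb->head + 1) % RING_CAPACITY;   /* wrap to the start */
        rb->count++;
    }
    return written;
}

/* Fetch up to n bytes in FIFO order; returns how many were available. */
size_t ring_read(ring_buffer *rb, uint8_t *dst, size_t n)
{
    assert(rb != NULL && dst != NULL);
    size_t read = 0;
    while (read < n && rb->count > 0) {
        dst[read++] = rb->data[rb->tail];
        rb->tail = (rb->tail + 1) % RING_CAPACITY;
        rb->count--;
    }
    return read;
}
```

The producer (say, a mic callback) calls ring_write and the consumer (the output callback) calls ring_read. Because the indices wrap around, the same fixed block of memory is reused forever, which is why this structure suits continuous input and output and adds little when you are just pulling data from a file at your own pace.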

The reason I said extra ring buffer is that Core Audio already has an Audio Queue, which can be thought of as a ring buffer, or at least in your case it replaces the need for one: you populate it with data, it plays the data, then it fires a callback that informs you that the data you supplied has been played. The Apple documentation offers a good explanation of this.

In your case, if you are simply reading audio from a file, then you can easily control the throughput of the audio from the file. You can pause it by blocking the thread that reads data from the audio file for example.

For a simple example of what I'm talking about, see this example I created on github. For a more advanced example, see Matt Gallagher's famous example.

by anonymous   2017-08-20

I think you're right - in this case a 'direct-from-disk' buffering approach is probably what you need. I believe the correct AudioUnit subtype is AudioFilePlayer. From the documentation:

The unit reads and converts audio file data into its own internal buffers. It performs disk I/O on a high-priority thread shared among all instances of this unit within a process. Upon completion of a disk read, the unit internally schedules buffers for playback.

A working example of using this unit on Mac OS X is given in Chris Adamson's book Learning Core Audio. The code for iOS isn't much different, and is discussed in this thread on the CoreAudio-API mailing list. Adamson's working code example can be found here. You should be able to adapt this to your requirements.

by anonymous   2017-08-20

If you're serious about learning Core Audio, do yourself a favour and get this book. It got me started, and Core Audio is not easy by any means! http://www.amazon.com/Learning-Core-Audio-Hands-Programming/dp/0321636848