Audio Analysis Architecture Based On JLayer

As mentioned last week, my next project revolves around audio analysis. The first step is acquiring data, and for that I had already found the perfect Java solution. JLayer makes it easy to obtain data, but a sound file contains a very large amount of it. This post describes a basic architecture to tame that data and get it into a form that can be processed.

Download the NetBeans example project

There are a few ways to get the wave data out of an mp3 file using JLayer. You could use the MP3SPI driver to play data through the JavaSound API and capture the bytes. But there’s no reason to make it that complicated.

JLayer can stream the decoded data to its own AudioDevice. This is a callback type that has hooks for opening and closing a device, which we don’t need. The important hook is the one that sends the decoded data to the device. This is where you can capture the raw stream. Most audio analysis, however, doesn’t work on the raw stream directly, but first averages it to reduce the amount of data to process.
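To make the capture-and-average idea concrete, here is a minimal sketch. It deliberately does not implement JLayer's real AudioDevice, so that it compiles without the library. The class name `AveragingCapture` is made up for illustration, and its `write(short[], int, int)` method only mimics the shape I believe JLayer's write hook has. The post doesn't say exactly how the samples are averaged, so the mean of absolute values per fixed window is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for an audio device: instead of playing the
 * samples it receives, it reduces them to one average per window.
 */
class AveragingCapture {
    private final int windowSize;                      // samples per average
    private final List<Double> averages = new ArrayList<>();
    private long sum = 0;                              // running sum of |sample|
    private int count = 0;                             // samples in the current window

    AveragingCapture(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /**
     * Receives one chunk of decoded samples. Windows may span several
     * calls, so the running sum and count are kept between calls.
     */
    void write(short[] samples, int offs, int len) {
        for (int i = offs; i < offs + len; i++) {
            // Assumption: average the absolute amplitude, because the plain
            // mean of a signed waveform hovers around zero.
            sum += Math.abs(samples[i]);
            if (++count == windowSize) {
                averages.add((double) sum / windowSize);
                sum = 0;
                count = 0;
            }
        }
    }

    /** Averages of all completed windows, in arrival order. */
    List<Double> getAverages() {
        return averages;
    }
}
```

With a window of a few thousand samples, a full song shrinks from millions of values to a few thousand averages, which is a much friendlier input for later analysis.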

So that is how I ended up with my layered architecture:

  1. JLayer itself decodes the mp3 file and offers the byte stream to …
  2. … the audio device. This layer does only very basic processing. In the current example project you’ll find a device that averages the samples. I could add Fast Fourier Transform in the future.
  3. The final layer is the actual brains. I’ve called these classes “processors”. This is where the magic will happen.
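The three layers above can be sketched in a few lines. This is not the code from the example project: `Processor`, `ReducingDevice` and `PeakProcessor` are hypothetical names, and the decoder (layer 1) is represented only by whoever calls `write`. The point is just to show where the boundaries between the layers lie.

```java
/** Layer 3: hypothetical callback that the analysis code implements. */
interface Processor {
    void process(double value);
}

/**
 * Layer 2: stand-in for the audio device. It does only basic processing
 * (here: mean absolute amplitude per block, an assumption) and hands each
 * result on to the processor.
 */
class ReducingDevice {
    private final int blockSize;
    private final Processor processor;
    private long sum = 0;
    private int count = 0;

    ReducingDevice(int blockSize, Processor processor) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
        this.processor = processor;
    }

    /** Called by layer 1 (the decoder) for every chunk of decoded samples. */
    void write(short[] samples, int offs, int len) {
        for (int i = offs; i < offs + len; i++) {
            sum += Math.abs(samples[i]);
            if (++count == blockSize) {
                processor.process((double) sum / blockSize);
                sum = 0;
                count = 0;
            }
        }
    }
}

/** Example "brains": remembers the loudest block seen so far. */
class PeakProcessor implements Processor {
    private double peak = 0;

    @Override
    public void process(double value) {
        peak = Math.max(peak, value);
    }

    double getPeak() {
        return peak;
    }
}
```

Because the device only knows the `Processor` interface, a different analysis can be plugged in without touching the decoding or the averaging.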

That’s all there is to the current version of my project. For now it’s just a zip file, but if there is interest, I might put it on some public VCS.

The main advantages of the current architecture are:

  • There’s only one dependency on an outside project: JLayer. This helps portability.
  • Splitting the basic processing that every analysis project needs anyway into its own layer frees the actual processors from doing that basic work themselves.



2 Comments

  1. Posted April 5, 2011 at 4:30 pm

    Hi there Peter!

    Nice work you’ve done here with pointing out how you can extract the samples without needing to play the audio file. I was searching for ways of doing that and I ended up here and I’m grateful for your example.

    You said there “I could add Fast Fourier Transform in the future.” and I was curious how you would do that, because, as far as I have researched, the FFT algorithm (or at least the implementations I have found on the internet so far) is limited to sample arrays whose size is a power of two… so… I need your advice on how I should tackle this in order to do it the right way. I could also use a buffer of power-of-two size and perform an FFT on that buffer.

    Thanks for your help,
    have a nice day!

  2. Posted April 6, 2011 at 6:07 am

    I never continued the project, so I’m not sure. If this is a problem, it’ll probably suffice to make sure there’s a read buffer of the correct size. (disclaimer: It has been a long time since I did any kind of FFT, so the details are a little vague right now)
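A read buffer of the correct size, as discussed in this exchange, might look roughly like the sketch below. It is not part of the example project. `FrameBuffer` is a hypothetical name, and the class only collects incoming samples into frames whose length is a power of two. The FFT itself is left out, since neither the post nor the comments settle on an implementation.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical buffer that regroups arbitrarily sized chunks of samples
 * into fixed frames whose length is a power of two.
 */
class FrameBuffer {
    private final short[] frame;                     // frame currently being filled
    private int filled = 0;                          // valid samples in 'frame'
    private final List<short[]> frames = new ArrayList<>();

    FrameBuffer(int frameSize) {
        // A positive int is a power of two exactly when one bit is set.
        if (frameSize <= 0 || Integer.bitCount(frameSize) != 1) {
            throw new IllegalArgumentException("frameSize must be a power of two");
        }
        frame = new short[frameSize];
    }

    /** Accepts a chunk of any length; completed frames are stored. */
    void write(short[] samples, int offs, int len) {
        while (len > 0) {
            int n = Math.min(len, frame.length - filled);
            System.arraycopy(samples, offs, frame, filled, n);
            filled += n;
            offs += n;
            len -= n;
            if (filled == frame.length) {
                // A real version would hand the frame to an FFT here.
                frames.add(frame.clone());
                filled = 0;
            }
        }
    }

    /** Completed frames, in arrival order. */
    List<short[]> getFrames() {
        return frames;
    }

    /** Number of samples still waiting for the current frame to fill up. */
    int pending() {
        return filled;
    }
}
```

Samples left over at the end of the file never fill a frame. Whether to drop them or pad the last frame with zeros is a design choice this sketch leaves open.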
