in reply to Audio input and processing - recommendations

After reading Corion's answer, I thought it might be faster to let the command line do the FFT and pass the resulting data to Perl. I modified Corion's command line to pipe into sox, which can do a basic FFT. I also changed the ffmpeg options to read from my soundcard #0 via ALSA (-f alsa -i hw:0) and to write sox format (-f sox), thusly:

ffmpeg -hide_banner -loglevel error -nostats -f alsa -i hw:0 -t 30 -ac 1 -ar 44100 -f sox - | sox -t sox - -n stat -freq

See also, for example, this thread on the output of sox stat -freq: https://www.linuxquestions.org/questions/linux-software-2/why-does-sox-stat-freq-give-me-different-data-multiple-times-927589/

Alternatively, you can use Corion's code to read raw audio bytes and plug them into PDL::FFTW3 to get FFT with a state-of-the-art C library (FFTW3).
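
To make the "raw bytes in, spectrum out" idea concrete, here is a minimal, library-free sketch in Python rather than Perl/PDL. It assumes the capture side emits headerless signed 16-bit little-endian mono PCM (for instance ffmpeg with -f s16le instead of -f sox, which is my assumption, not something tested in this thread). The function names are made up for illustration, and the naive O(n^2) DFT only stands in for what FFTW3 would do far faster.

```python
import cmath
import math
import struct

SAMPLE_RATE = 8000  # Hz; assumed here to keep the demo small (the command above uses 44100)

def decode_s16le(raw):
    """Unpack signed 16-bit little-endian mono PCM bytes into floats in [-1, 1)."""
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw[:count * 2])
    return [v / 32768.0 for v in ints]

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags

def peak_frequency(raw, rate=SAMPLE_RATE):
    """Frequency in Hz of the strongest non-DC bin in a block of raw PCM."""
    samples = decode_s16le(raw)
    mags = dft_magnitudes(samples)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * rate / len(samples)

def make_tone(freq, n=256, rate=SAMPLE_RATE):
    """Synthetic stand-in for captured audio: n samples of a sine as s16le bytes."""
    return b"".join(
        struct.pack("<h", int(round(20000 * math.sin(2 * math.pi * freq * i / rate))))
        for i in range(n)
    )

if __name__ == "__main__":
    # 1000 Hz over 256 samples at 8000 Hz lands exactly on bin 32.
    print(peak_frequency(make_tone(1000.0)))  # prints 1000.0
```

With real input you would read fixed-size blocks from the pipe and call peak_frequency (or whatever processing you need) on each block.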

Following the latter route, you obviously have more control over what processing to do, albeit at lower speed (hmm, benchmarks?).

The other route is to replace sox with custom C code (based on FFTW3 library) to do the FFT and any other processing you want closer to the hardware.

Finally, there is plenty of other command-line software on Linux for building a processing pipeline to suit your needs.

Last but not least: PureData (PD) https://puredata.info/

bw, bliako

Re^2: Audio input and processing - recommendations
by haj (Vicar) on Nov 22, 2023 at 12:36 UTC

    I have not tried audio input (yet), but found the combination of PDL and SoX quite powerful to synthesize and play sound, including spectral analysis. Here's a screenshot of one of my experiments (sine waves and playing with overtones).

    I use PDL to create the raw audio data and SoX to pipe it to whatever sound system is available (which means it also works on MS-Windows); the waveform and spectrum displays are updated in real time.
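
    As a rough illustration of the "generate raw samples, let SoX play them" split, here is a small Python sketch (not the PDL code described above). The function name, the overtone gains and the 16-bit mono format are all assumptions chosen for the example.

```python
import math
import struct
import sys

RATE = 44100  # samples per second (assumed)

def tone_with_overtones(freq, seconds, overtone_gains=(1.0, 0.5, 0.25), rate=RATE):
    """Return s16le mono PCM bytes: a fundamental plus harmonics at 2f, 3f, ..."""
    n = int(seconds * rate)
    norm = sum(overtone_gains)  # keeps the summed signal within [-1, 1]
    out = bytearray()
    for i in range(n):
        t = i / rate
        v = sum(g * math.sin(2 * math.pi * freq * (h + 1) * t)
                for h, g in enumerate(overtone_gains)) / norm
        out += struct.pack("<h", int(round(v * 32767)))
    return bytes(out)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Frequency in Hz comes from the command line; two seconds go to stdout.
    sys.stdout.buffer.write(tone_with_overtones(float(sys.argv[1]), 2.0))
```

    Saved as, say, tone.py (a hypothetical name), it could be played with something like: python3 tone.py 220 | sox -t raw -r 44100 -e signed-integer -b 16 -c 1 - -d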

    I have been planning to write an article about stuff like this for quite some time now, but there are so many distractions...

      A little later: haj did write at least one article about PDL and sound-processing - I'm listening as I write this to the very nice bit of music he has at the end.

      For audio input, I'd note that doing a DFT on a short, finite window of discrete input has an inherent limitation: cutting the signal off at the window edges creates artifacts (spectral leakage). This is mitigated in various ways, and the easiest way in PDL-land is to use PDL::DSP::Windows. See jjatria's Advent article for more.
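
      The cut-off effect is easy to see numerically. The following Python sketch (hand-rolled, not PDL::DSP::Windows) takes a sine that does not fit a whole number of cycles into the block and compares how much spectral energy lands away from the peak with and without a Hann window. The helper names and the "within 2 bins of the peak" measure are my own choices for illustration.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]

def leakage_ratio(mags, peak_halfwidth=2):
    """Fraction of spectral energy more than peak_halfwidth bins from the peak."""
    peak = max(range(len(mags)), key=mags.__getitem__)
    total = sum(m * m for m in mags)
    near = sum(m * m for k, m in enumerate(mags) if abs(k - peak) <= peak_halfwidth)
    return 1.0 - near / total

N = 256
# 20.5 cycles per block: the tone falls halfway between two DFT bins.
signal = [math.sin(2 * math.pi * 20.5 * i / N) for i in range(N)]

plain = leakage_ratio(dft_magnitudes(signal))
windowed = leakage_ratio(dft_magnitudes([s * w for s, w in zip(signal, hann(N))]))

if __name__ == "__main__":
    print("energy away from peak, no window:   %.4f" % plain)
    print("energy away from peak, Hann window: %.4f" % windowed)
```

      The windowed figure should come out far smaller than the unwindowed one; the price is a wider main lobe, so closely spaced tones become harder to separate.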

      More generally for real-time-ish stuff, PDL doesn't yet have a thoroughly tested real-time capability. I still intend to experiment more fully, but one approach might be to set up a "flowing" transformation (so you pay the setup cost only once), then keep updating the input sample and reading the processed output. If anyone does have a go at that, I'd love to hear your findings!