Are we talking about repeated recorded messages, or just recordings of people saying the same thing? In other words, are you looking for something that matches bad recordings of the same source data, or are you trying to do natural spoken language recognition?
I'm not an expert on either, but I'd probably take a stab at problem A (matching noisy recordings) by first applying a low-pass filter to get rid of most of the noise, then downsampling to some really low sample rate, then finding the peaks in the recording and checking whether the timing of the peaks matches any of the predetermined messages.
That would probably only work if you have a fairly limited number of messages, but at least it's reasonably easy to implement using standard command-line driven audio tools for the conversions and then using something like Audio::SndFile (disclaimer: I wrote it) to parse the data and find the peaks.
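To make the idea above a bit more concrete, here is a rough sketch of the filter / downsample / peak-timing pipeline. It is written in plain Python rather than Perl and does not use Audio::SndFile; it assumes the audio has already been decoded into a list of numeric samples. All function names, the moving-average "filter", the threshold and the tolerance are illustrative assumptions, not a tested recipe for real recordings.

```python
# Sketch only: every name and parameter here is hypothetical.
# Input is assumed to be a list of already-decoded numeric samples.

def low_pass(samples, window):
    """Crude low-pass: moving average of the rectified signal (an envelope)."""
    out = []
    acc = 0
    for i, s in enumerate(samples):
        acc += abs(s)
        if i >= window:
            acc -= abs(samples[i - window])  # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out

def downsample(samples, factor):
    """Keep every factor-th sample (no extra anti-alias filtering)."""
    return samples[::factor]

def find_peaks(samples, threshold):
    """Indices of local maxima that rise above the threshold."""
    peaks = []
    for i in range(1, len(samples) - 1):
        if (samples[i] > threshold
                and samples[i] > samples[i - 1]
                and samples[i] >= samples[i + 1]):
            peaks.append(i)
    return peaks

def match_message(peaks, templates, tolerance):
    """Return the first template whose peak timings all fall within tolerance."""
    for name, expected in templates.items():
        if len(expected) == len(peaks) and all(
                abs(p - e) <= tolerance for p, e in zip(peaks, expected)):
            return name
    return None

def identify(samples, templates, window=4, factor=4, threshold=30, tolerance=1):
    """Run the whole pipeline on one recording."""
    envelope = downsample(low_pass(samples, window), factor)
    return match_message(find_peaks(envelope, threshold), templates, tolerance)
```

Here `templates` would be a dict mapping a message name to the list of peak positions (in downsampled samples) measured from a clean copy of that message. Requiring the same number of peaks is a deliberately strict choice; a real matcher would probably need to tolerate missing or extra peaks and a time offset between recordings.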