I’m working on a project for school and I could use some advice. I have an electrical engineering background, but unfortunately, when it comes to audio engineering, I’m a bit in the dark.
Basically, the scope of my project is to determine the effect of microphone quality on speech recognition. I’ve got a speech recognition engine already set up and I want to:
a) Reproduce a person’s voice on a speaker/set of speakers (from a single-channel PCM WAV recording)
b) Model background noise (a noisy street, background office noise, etc.) using a speaker/set of speakers (also played from a pre-recorded, single-channel WAV file)
c) Capture that voice using one or several microphones of varying quality and store the result digitally.
d) Do an analysis, using the speech engine, to see if varying levels of microphone quality have any impact on the recognition of words/phrases.
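For step d), here is a minimal sketch of how I imagine scoring the results, assuming word error rate (WER) is a reasonable metric. The function name, the microphone labels, and the transcripts are all hypothetical placeholders, not output from my actual engine.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the transcripts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the same spoken phrase as recognized through two microphones.
reference = "turn the lights on"
recognized = {
    "mic_a (hypothetical)": "turn the lights on",
    "mic_b (hypothetical)": "turn the light on",
}
for mic, text in recognized.items():
    print(mic, word_error_rate(reference, text))
```

The idea would be to keep the voice recording, noise recording, and playback levels fixed, and compare this score across microphones.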
My questions:
1) Can speakers accurately model human voice? If yes, what kind of speakers would I need? Would I be better off using an array of speakers or a single speaker?
2) When modeling background noise, could I use the same speakers I used to reproduce the voice or should I use a different setup?
3) If it is possible to generalize, what effect do cheaper microphones have on voice recording?
Assume I have a budget on the order of a few hundred bucks (~500).
Any help you guys could offer would be appreciated!