Should Speakers be Designed to Have a Flat SPL?

TheStalker

Banned
We've been having a slight discussion regarding this topic in the subwoofer thread, and it might be a little more fun and on topic to discuss it further here. I believe that speakers should not have a flat SPL response, but should follow more closely how an average human ear hears. Flat speakers sound hard, bright, and unnatural to me. I find B&W speakers very pleasing tonally, and if any measurements are to be looked at, they follow the equal loudness curves very accurately. B&W, being a multi-million dollar company that owns two anechoic chambers and uses computer software to model crossovers, could build a flat SPL speaker in a matter of minutes.

If a recording was made with a flat calibrated microphone, then it should be played back on a non flat speaker system in order to sound flat to an average ear. Playing the recording back on a flat system will actually make it sound incorrect, because at that point one hears what the microphone did and not what a person would have during the event.

Here is the equal loudness chart:



This is what Linkwitz had to say:

Electro-acoustic models

"H - Psycho-acoustic 3 kHz dip

Our perception of loudness is slightly different for sounds arriving frontally versus sounds arriving from random directions at our ears. The difference between equal-loudness-level contours in frontal free-fields and diffuse sound fields is documented, for example, in ISO Recommendation 454 and in E. Zwicker, H. Fastl, Psycho-acoustics, p. 205.
Diffuse field equalization of dummy-head recordings is discussed in J. Blauert, Spatial Hearing, pp. 363, and headphone diffuse field equalization by G. Theile in JAES, Vol. 34, No. 12.
Reference to a slight dip in the 1 to 3 kHz region for loudspeaker equalization is made in H. D. Harwood (BBC Research Department), Some factors in loudspeaker quality, Wireless World, May 1976, p.48.

Around 3 kHz our hearing is less sensitive to diffuse fields. Recording microphones, though, are usually flat in frequency response even under diffuse field conditions. When such recordings are played back over loudspeakers, there is more energy in the 3 kHz region than we would have perceived if present at the recording venue and a degree of unnaturalness is introduced.
This applies primarily to recordings of large orchestral pieces in concert halls where the microphones are much closer to the instruments than any listener. At most listening positions in the hall the sound field has strong diffuse components.
I use a dip of 4 dB (x1.gif, 2760NF) to equalize for this. The circuit consists of R, C and L in series, forming a frequency dependent ladder attenuator in conjunction with the 5.11k ohm source resistor. You may choose to make the notch filter selectable with a switch for different types of recordings.

I have found through my own head-related recordings of symphonic music that the dip adds greater realism, especially to large chorus and to soprano voice and allows for higher playback levels. "
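
A rough sketch of the arithmetic behind the notch Linkwitz describes, for anyone who wants to try numbers. Only the 5.11k source resistor and the 4 dB depth come from the quote above. The 3 kHz centre frequency is read off the section title, the inductor value is a made-up placeholder, and the stage after the divider is assumed to draw no current. Treat it as an illustration of the divider math, not as his actual component values.

```python
import math

R_SOURCE = 5110.0   # ohms, from the quoted text
DIP_DB = 4.0        # dB, from the quoted text
F0 = 3000.0         # Hz, assumed centre of the "3 kHz dip"
L_HENRY = 0.1       # hypothetical inductor value, NOT from the source


def shunt_resistance(r_source, dip_db):
    """Series R that gives dip_db of attenuation at resonance (unloaded divider)."""
    ratio = 10 ** (-dip_db / 20)          # voltage ratio at the bottom of the dip
    return r_source * ratio / (1 - ratio)


def capacitor_for_resonance(l_henry, f0):
    """C that resonates with L at f0, from f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / ((2 * math.pi * f0) ** 2 * l_henry)


def gain_db(freq, r_source, r, l_henry, c_farad):
    """Divider gain in dB: Z_shunt / (R_source + Z_shunt), Z_shunt = series RLC."""
    w = 2 * math.pi * freq
    z = complex(r, w * l_henry - 1.0 / (w * c_farad))
    return 20 * math.log10(abs(z / (r_source + z)))


r_shunt = shunt_resistance(R_SOURCE, DIP_DB)
c_farad = capacitor_for_resonance(L_HENRY, F0)

for f in (100, 1000, 3000, 10000):
    print(f, "Hz:", round(gain_db(f, R_SOURCE, r_shunt, L_HENRY, c_farad), 2), "dB")
```

Under these assumptions the series resistor comes out at roughly 8.7k. The choice of L and C only sets how wide the dip is, not how deep.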

And this is what the BBC determined after doing $1 million (nearly $5 million in today's dollars) of psychoacoustic research some 40 years ago:

 
Chu Gai

Audioholic Samurai
You can try and mimic such a curve by using equalization but I doubt you'd find the results pleasing or natural. There have been studies that support a somewhat tilted down curve at the listening position which I believe Toole references in his book. To some extent that is naturally achieved since as frequencies go higher, they're attenuated more by absorption and atmospheric effects.
 
TLS Guy

Seriously, I have no life.
First of all, you can't evaluate speakers with dummy head recordings, only headphones. Dummy head recordings played via speakers sound universally awful.

A speaker that followed the equal loudness chart would sound very bass heavy. In any event the relevance of the equal loudness chart is highly questionable. I hate it when any type of loudness control based on those curves is engaged.

The BBC Smiley, as it is known, does add an illusion of stage depth. However, in the HT environment it really detracts from speech quality and discrimination.

I think the BBC Smiley can now be avoided with good design. I think its use is mainly due to poor crossover transitions in that region and lobing error problems.

A speaker system for music and HT is a formidable affair. Everything has to be as flat as possible with good off-axis response. I will say that even a small elevation in the 2 to 4 kHz zone makes for an unpleasant speaker. In my center speaker I had to go to extreme lengths to smooth the response to get totally natural speech, which included chasing a small 3 dB dip in response at 9 kHz.

A lot goes into making a really good speaker and there is no justification for deliberate response errors.
 
Dennis Murphy

Audioholic General
Hmmmm I'll have to think about this. But isn't there a logical fallacy in the argument? The music that comes from the orchestra is unaltered. It is what it is. And that's what the mic captures. When people hear that sound live, they aren't equally sensitive to all of the frequencies. But they won't be when listening to loudspeakers either. So if you contour the speaker to play what the listener would have heard at the live concert, I think the ear in the living room will then cause a further contouring--there will be double contouring. I do agree, however, that many recordings use too many spot mics, and the result can be an unnaturally edgy sound. I usually end up with a little dip in the lower treble--just a little one. If you monkey around too much in that area, the playback is just going to lose realism.
 
TheStalker

Banned
But that doubling would occur only if the mic has the same curve as a human ear. Most mics record very flat and therefore they would record something entirely different than what the ear would. Playing this back on a flat speaker system, you're essentially listening to what the mic heard and not what the ear would have heard. Some EQ has to be done at that point if realism is to be achieved. I think Linkwitz has it right and so does BBC.

I think where I disagree with you is that you assume that a speaker will project the sound the exact same way as a live orchestra would. I don't think that to be the case. A flat mic and a flat loudspeaker will sound different than a live event.

I just want to clarify that by no means should the equal loudness curve be followed exactly, because the dips, peaks, and curves are rather extreme. But a gentler approach, let's say 2-3 dB down at 3-4 kHz, 1-2 dB up at 8-10 kHz, and 1-2 dB down at 500-700 Hz, would be a good place to start to get a very balanced and ear-pleasing tone.

I also tend to agree with Linkwitz and the BBC simply from listening to many different speakers. A dip in the 3-4 kHz region does not make a loudspeaker sound broken; instead it makes it sound very pleasing. A peak in that region, on the other hand, makes a speaker unlistenable, at least in my opinion.
 
Dennis Murphy

Audioholic General
You're making two different arguments. One I think is just wrong, the other may or may not have some merit. Let's assume for a moment that the ear does hear loudspeakers the same way that it hears a live orchestra. I'm just trying to hold one variable constant so we can examine the marginal impact of the other variable. Then, if the ear wants to hear what it would hear at the live event, the speaker must have a flat response. It must reproduce the sounds that actually came out of the orchestra without any contouring, which is what the calibrated mic will hear. Otherwise there will be double contouring. Now, does the ear hear loudspeakers the same way it hears live orchestras? Probably not, but I don't know where that leads us. I don't know what the differences are. So I think we're back to adjusting speaker response a little for what are very common bad practices in mic mixing with most modern recordings. There's a reason the old RCA Shaded Dog recordings of the Chicago Symphony sound so good. There were no spot mics.
 
Chu Gai

Audioholic Samurai
This is what Harbeth, a British speaker manufacturer, has to say about the BBC dip.

I've heard mention of 'the BBC dip' or 'the Gundry dip'. What does that mean?

There is much myth, folklore and misunderstanding about this subject.

The 'BBC dip' is (was) a shallow shelf-down in the acoustic output of some BBC-designed speaker system of the 1960s-1980s in the 1kHz to 4kHz region. The LS3/5a does not have this effect, neither in the 15 ohm nor 11 ohm, both of which are in fact slightly lifted in that region.

According to Harbeth's founder, who worked at the BBC during the time that this psychoacoustic effect was being explored, the primary benefit this little dip gave was in masking of defects in the early plastic cone drive units available in the 1960's. A spin-off benefit was that it appeared to move the sound stage backwards away from the studio manager who was sitting rather closer to the speakers in the cramped control room than he would ideally wish for. (See also Designer's Notebook Chapter 7). The depth of this depression was set by 'over-equalisation' in the crossover by about 3dB or so, which is an extreme amount for general home listening. We have never applied this selective dip but have taken care to carefully contour the response right across the frequency spectrum for a correctly balanced sound. Although as numbers, 1kHz and 4kHz sound almost adjacent in an audio spectrum of 20Hz to 20kHz, the way we perceive energy changes at 1kHz or 4kHz has a very different psychoacoustic effect: lifting the 1kHz region adds presence (this is used to good effect in the LS3/5a) to the sound, but the 4kHz region adds 'bite' - a cutting incisiveness which if over-done is very unpleasant and irritating.

You can explore this effect for yourselves by routing your audio signal through a graphic equaliser and applying a mild cut in the approx. 1kHz to 4kHz region and a gradual return to flat either side of that.
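
If you'd rather look at such a cut on paper before reaching for an equaliser, here is a minimal sketch. Harbeth does not specify any particular filter, so the single peaking biquad below (coefficients in the widely used Audio EQ Cookbook form) is only a stand-in, and the 2 kHz centre, -2 dB gain, Q of 0.7 and 48 kHz sample rate are all assumptions chosen to give a mild, broad cut across roughly 1 kHz to 4 kHz.

```python
import cmath
import math


def peaking_coeffs(f0, gain_db, q, fs):
    """Biquad (b, a) coefficients for a peaking EQ, Audio EQ Cookbook form."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a


def response_db(b, a, freq, fs):
    """Magnitude response in dB at freq, evaluated on the unit circle."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / fs)   # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


FS = 48000.0                                             # assumed sample rate
b, a = peaking_coeffs(f0=2000.0, gain_db=-2.0, q=0.7, fs=FS)   # assumed settings

for f in (100, 500, 1000, 2000, 4000, 8000, 16000):
    print(f, "Hz:", round(response_db(b, a, f, FS), 2), "dB")
```

The response bottoms out at the chosen gain at the centre frequency and returns gradually to flat on either side, which is the shape described above. Changing `gain_db` or `q` lets you compare a mild cut with the roughly 3 dB figure mentioned for the BBC designs.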
 
JonnyFive23517

Audioholic
You want to listen to what a flat mic heard, because at the final step of going into your ear, it becomes altered. If your ear hears what an ear heard, the sound has been distorted twice.

A simplified, idealized example:

For live music: Orchestra sound --> Ear (becomes altered by our perception). In reproduction: Orchestra sound --> Flat mic (no altering) --> Flat speaker (no altering) --> Ear (becomes altered by our perception). Imagine that violin's sound wave being captured, put in a bottle, then listened to at a later date. You'd want it coming out of that bottle as close to the original as it could possibly be (measured by an objective source, a flat mic). Introducing an altering at the reproduction level will result in a double-altering when it goes into your ear.
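
The same chain can be written out as per-band sums in dB. This is a minimal sketch with made-up numbers: the four bands, the "ear" values and the 3 dB speaker dip are placeholders for illustration, not measurements. It also keeps the idealization used above, so it ignores directivity, room effects and any difference between frontal and diffuse sound.

```python
BANDS_HZ = [250, 1000, 3000, 8000]

# All values are dB relative to flat; every number is hypothetical.
source = [0.0, 0.0, 0.0, 0.0]        # what the orchestra radiates
ear = [-1.0, 0.0, 3.0, -2.0]         # stand-in for the listener's own sensitivity
flat = [0.0, 0.0, 0.0, 0.0]          # flat mic or flat speaker
contoured = [0.0, 0.0, -3.0, 0.0]    # speaker with a 3 dB dip at 3 kHz


def chain(*stages):
    """Gains of stages in series add in dB, band by band."""
    return [sum(values) for values in zip(*stages)]


live = chain(source, ear)
flat_playback = chain(source, flat, flat, ear)
contoured_playback = chain(source, flat, contoured, ear)

# How far each playback chain ends up from the live experience, per band.
flat_error = [p - l for p, l in zip(flat_playback, live)]
contoured_error = [p - l for p, l in zip(contoured_playback, live)]

for band, fe, ce in zip(BANDS_HZ, flat_error, contoured_error):
    print(band, "Hz: flat", fe, "dB, contoured", ce, "dB")
```

In this simplified model the flat chain lands exactly on the live result, and the contoured speaker is off by its own contour. Whether the real world matches the model is the part being argued about in this thread.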
 
GranteedEV

Audioholic Ninja
I recently had a pair of SVS Ultra Towers in my house. I found that the worse the recording was, the more they shone. They made some music sound incredible, because you could turn the volume up without vocals being overly prominent.

Yet as soon as you put on good recordings, their timbral colorations made them sound dull and uninspiring. Ultimately, while they were good, I did not consider their presentation to be realistic.

Measurements revealed a BBC dip in the forward response.

My Philharmonic speakers, which as far as I know are ruler flat with excellent off-axis response to boot, still sound good with said poor recordings. You do, however, wish the mixer had quality playback gear and didn't mix with in-car listening in mind, which is typical of contemporary stuff.

For dialogue, soundtracks, and anything else, the difference was strong enough that I would at no point even contemplate selling the Philharmonics. I am afraid they might be here for life.

A BBC dip makes sense when the mixer did not have in-home listening in mind, but I just don't love it. It is not high fidelity, which is what I want at home in my living room.
 
slipperybidness

Audioholic Warlord
Yeah, that was my thinking too, "double-contour".

If this argument of non-flat response were to carry any merit (which I don't believe that it does), then I would think the argument should be the speaker response is the INVERSE of that perception curve.
 
TheStalker

Banned
I'm sorry, but the double contour just does not occur in what I proposed. Linkwitz and BBC tend to agree.
 
Dennis Murphy

Audioholic General
That's an assertion, not an explanation. The "BBC" dip is actually an urban audio legend, as Jeff Bagby has documented. The little Rogers speaker did have a hump in the midbass to make up for the lack of bass further down, but that was about it. I think Linkwitz is talking about a different issue, which is how to deal with the artificially "up front" sound that you get with so many modern multi-mic recordings of orchestras. People can differ on whether you should deal with that by recessing the lower treble on loudspeakers, but it's not a pure argument about whether it's just a necessity to contour speaker response because of how we hear. Linkwitz sent me a recording he had made of the San Francisco Orchestra with two tiny mics mounted in his eyeglasses. I think he was in about the 12th row, and that's what the mics captured. The sound on my system is extremely natural and realistic. There is absolutely no need for contouring, and it wouldn't be logical to implement any. The ear will already have contoured the sound. It doesn't need any help from the loudspeaker to recreate what the ear would have heard in the 12th row at the SFO hall.
 
TheStalker

Banned
But I did provide an explanation and even included multiple sources. One from the Godfather of speaker design himself. I'm just not sure how else to explain this. You guys keep talking about double contouring, which is completely unrelated and would never occur. You believe that a flat speaker will play an orchestra in exactly the same way as it would have been heard live and then the ears will use the equal loudness curve to compensate and the end result will be exactly what one would have heard at the live orchestra. And I must strongly disagree with that. Since the speakers are more directional, they will play more accurately what the flat calibrated mic heard and NOT what the ears would have heard. This doubling of contour would not happen and there would be excess energy present in the 3-4kHz region. This must be compensated somewhere in the system, at the speakers, with an equalizer, with the original sound mix, etc. Otherwise the system will sound unnatural and hard/bright.
 
ski2xblack

Audioholic Field Marshall
Flat speaker, contoured speaker, I'm not sure it matters. Either still deviates wildly from whatever its design is as soon as you put it in a room, thus each would need equalizing to get back to flat (or the original contour, whatever the goal may be). Everyone's got their own happy curve and their listening rooms have their own quirks, so it seems to me that keeping things as linear as possible up to that point, and only then applying the make-up per the listener's wants and preferences, makes the most sense. I don't want some recording engineer or speaker designer applying permanent "fixes" upstream that may not apply to my situation.
 
zhimbo

Audioholic General
" Most mics record very flat and therefore they would record something entirely different than what the ear would." Mics don't hear, they record. If they record flat, they are recording exactly what was produced. Then, when reproduced accurately, the ear will hear the original signal in the way the ears normally would. If the recording is perfectly accurate to the signal, it "disappears" in the process. There may well be differences between directional and diffuse sound, but the problem isn't "recording flat". The perceptual equal loudness contour isn't the issue - or, at least, not in the sense that compensating for the contour is in any way a reasonable solution to any problem. As far as I can tell, the original Linkwitz article only applies specifically to reproducing concert hall sound, and in no way says that the perceptual equal loudness curve needs to be "corrected" or anything like that by the speaker.
 
Irvrobinson

Audioholic Spartan
I've made several recordings of acoustic instruments in my own home, I don't use any equalization at all, and they sound pretty realistic. If your theory is correct, why do my recordings sound realistic? As I mentioned in another thread, a while back I had a person playing a flute along with a recording of herself playing flute, while she was standing between my speakers. Why did the live and recorded performances sound so similar?
 
3db

Audioholic Slumlord
+1. Keep things as linear as possible before in-room anomalies muck things up.
 
TheStalker

Banned
Well he corrects it with an EQ. I don't see a difference there. And the reason why Linkwitz compares everything to a live orchestra is because there's something to actually compare to. If you take electronic music, then who knows what's right.

I also wish that we could get past the fact that the flat recording is now being played back on loudspeakers and not by a live orchestra anymore. There's a huge difference there. I'm surprised that everyone thinks that flat loudspeakers and an entire orchestra would still produce and project the sound in exactly the same way. I just see this as a fundamentally flawed idea.
 
Swerd

Audioholic Warlord
In your 1st post you said:
“If a recording was made with a flat calibrated microphone, then it should be played back on a non flat speaker system in order to sound flat to an average ear. Playing the recording back on a flat system will actually make it sound incorrect, because at that point one hears what the microphone did and not what a person would have during the event.”

The sound from an orchestra is not contoured. Only the human ear/brain applies the contouring shown in the Fletcher-Munson curves. If a speaker plays flat, the contour curves are applied once by the ear/brain. If a speaker plays non-flat, i.e. contoured, the sound will be double contoured by the time it is perceived by the listener.

This is where the idea of double contouring came from.

As Dennis pointed out, the quotes you added from Linkwitz (in your 1st post) discuss the disadvantage of closely placed microphones when recording orchestras in large concert halls, “where the microphones are much closer to the instruments than any listener”. He did not refer directly to the so-called BBC dip, which you went on to discuss next.

I understand if you prefer listening to speakers with a midrange dip, but it is quite different to argue that this non-flat speaker response is the correct way to design speakers. I don't think this is what Linkwitz said.
 
TheStalker

Banned
But you just made the assumption that a flat loudspeaker and an orchestra are one and the same entity and that they project the sound in the exact same way. That's a huge assumption, and a rather flawed one. That's what I'm trying to explain. Mics are more directional, speakers are more directional, and they record nothing like what you would have heard at the live event.
 