A shame this useful article has such a schizophrenic attitude to DBTs (lots of anti-DBT snark, ending with 'but actually, DBTs are the way to go'). Really, why not run these 'mythbusting' sections about DBTs past Drs Toole and Olive first? Here are a few points I thought were overwrought:
No, DBTs are not the 'single most abused term in audio' -- from my experience, hardly *any* manufacturers even claim to use them in the first place! Feel free to prove me wrong -- list all the ones *you* have in mind.
No, 'the argument' is NOT that the prettier/more famously branded loudspeaker will *ALWAYS* be preferred. It's simply a common bias that needs to be accounted for in sensory and product-preference testing. Which, btw, are sciences too; DBTs are NOT just a tool of medical research. There are textbooks devoted to sensory testing methods. They cover DBTs. You should look them up.
No one I know of uses ABX tests for *loudspeakers* or for tests of preference generally. I can't recall seeing it recommended, either. There are many kinds of DBTs, suited for different purposes. ABX is recommended for testing claims of *difference*. So why even bring it up?
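For what it's worth, ABX results are typically scored against chance with a one-sided binomial test: did the listener identify X correctly often enough that guessing is implausible? A minimal sketch (the function name and trial counts are mine, purely illustrative):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: the probability of getting at least
    `correct` answers right out of `trials` ABX trials by pure guessing
    (chance of a correct guess per trial = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: 12 correct out of 16 trials
print(f"p = {abx_p_value(12, 16):.4f}")  # p = 0.0384, below the usual 0.05 cutoff
```

Note this evaluates a claim of audible *difference*, not of *preference* -- which is exactly why ABX is the wrong tool for loudspeaker preference testing.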
Instantaneous switching isn't confusing. What is more confusing for auditory memory is having a lag (NON-instantaneous switching) between stimuli. Minimizing the switching interval -- the time between the end of A and the start of B -- is a *good thing* for increasing listener discrimination of small differences. It's not as *big* an issue when differences are larger, as tends to be the case for loudspeakers.
"Switching interval" and "length of musical sample (i.e. length of a trial)" are two different issues that appear to be confused in this article.
Obviously you can't level match loudspeakers the way you can amps or cables or CD players. No one claims you can, and I don't see 0.1 dB level matching being demanded of loudspeaker comparison anywhere.
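For context on why that tolerance is even achievable for electronics: a 0.1 dB mismatch corresponds to only about a 1% voltage difference, trivially measurable with a voltmeter, whereas loudspeakers differ by whole dB across the frequency response. A quick check (hypothetical voltages, just to show the arithmetic):

```python
import math

def db_difference(v_a: float, v_b: float) -> float:
    """Level difference in dB between two measured voltages."""
    return 20 * math.log10(v_a / v_b)

# How big a voltage mismatch does 0.1 dB represent?
ratio = 10 ** (0.1 / 20)
print(f"{(ratio - 1) * 100:.2f}% voltage difference")  # 1.16%
```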
Extended listening isn't a necessity, has its own drawbacks (noted in the article), and is arguably inferior to *trained* listening. A purpose of training is that reliable results can be obtained more quickly, because listeners can more quickly identify and articulate whatever differences are there. And IMO the 'pressure' aspect of a listening test is overrated -- especially when you aren't dealing with small differences. If loudspeaker listeners feel they need extended listening to form their 'true' preference, fine; nothing intrinsic to double-blind testing prevents that. The important thing (from an accuracy standpoint) is to test, in the end, whether the preference is likely due to the *sound alone*, and that means a test that *minimizes biases from non-audible factors*. If the length of a blind test seems too short, I'd suggest you do your weeks of sighted comparative listening, form your preference -- *then* see if it holds up when you can't 'see' (literally or figuratively) the speakers. Should be a snap!
I'd like to see you publish 'your' listening comparison results formally, with detailed methods and tables, so I could rack them up against, say, Olive's and Toole's work on loudspeaker preference and performance, in terms of scientific rigor.