Rip Van Woofer said:
I've mentioned these papers before: Go to the AES Website and find the David Clark papers on ABX testing. Spend the five bucks for each paper (they're called "preprints"). Ten years of tests should be a large enough sample, hmm?
And a word about positives: think back to your basic stats and probability class if you took one (I didn't but, willy-nilly, did get some of the fundamentals elsewhere). Any random phenomenon (like a coinflip, or sheer guesswork in any kind of a/b test) will sometimes generate seemingly non-random behavior (like a run of "heads") - a positive*. Similarly, even a well-conducted double-blind study that ends in a null result will sometimes generate an anomalous positive (or even a few) among a large number of nulls. This might give the appearance (in an audio test) of the presence of a person (or even a few persons in a large test) with "golden ears".
The presence of a small number of positives vs. a large number of nulls can be very intriguing and may suggest the need for further examination. If poor methodology (including experimenter bias, group pressure, etc.) is ruled out as a cause for the anomalous positives, one simple test is to run the test again with those reporting positive results and see if they repeat.
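To put rough numbers on the point about anomalous positives and re-testing, here is a minimal sketch in Python. The trial count, pass mark, and panel size are hypothetical values chosen only for illustration (they are not taken from the Clark papers), and the code assumes every listener is purely guessing with a 50/50 chance per trial.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Chance of k or more correct answers in n trials by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical numbers, for illustration only.
TRIALS = 16      # trials per listener
PASS_MARK = 12   # correct answers counted as a "positive"
LISTENERS = 100  # panel size, everyone guessing

# Chance that one guesser scores a "positive" in a single session.
p_single = p_at_least(PASS_MARK, TRIALS)

# Average number of "golden ears" the panel produces by luck alone.
expected_false_positives = LISTENERS * p_single

# Chance that at least one guesser on the panel scores a "positive".
p_any = 1 - (1 - p_single) ** LISTENERS

# Chance that the same guesser scores a "positive" twice in a row.
p_repeat = p_single ** 2

print(f"one guesser passes:        {p_single:.4f}")
print(f"expected lucky listeners:  {expected_false_positives:.2f}")
print(f"at least one lucky pass:   {p_any:.4f}")
print(f"same guesser passes twice: {p_repeat:.6f}")
```

Under these assumptions a few lucky passes in a big panel are close to guaranteed, while a lucky guesser repeating the feat on a retest is far less likely. That is why re-running the test with the positives is so informative.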
If a sufficient number of trials has been run, there might be a point where a small number of anomalous results can be confidently "written off" but I don't know the number (x percent?). I suspect our old friend the bell curve applies here somewhere.
I believe that sixteen to twenty trials per subject (person) is generally considered statistically sound, BTW.
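For the curious, the score needed to count as a "positive" at sixteen or twenty trials can be worked out from the binomial distribution (which is where the bell curve comes in, since it approximates the binomial as the number of trials grows). The sketch below is only an illustration: the 5% cutoff is a common convention that I am assuming here, not a figure from the papers, and the function names are made up for this example.

```python
from math import comb

def tail_probability(k, n):
    """Chance of k or more correct out of n when each trial is a coin flip."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def min_correct(n, alpha=0.05):
    """Smallest score whose chance under pure guessing is at most alpha.

    alpha=0.05 is an assumed, conventional cutoff.
    """
    for k in range(n + 1):
        if tail_probability(k, n) <= alpha:
            return k
    return None

for n in (16, 20):
    k = min_correct(n)
    print(f"{n} trials: need {k} correct "
          f"(guessing gets there {tail_probability(k, n):.1%} of the time)")
```

With these assumptions the bar comes out at 12 of 16 or 15 of 20. A stricter cutoff would raise the required score.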
*And that is how gambling works, too. The occasional "lucky" hit in a random sequence of events is enough to reinforce the behavior and keep the sucker coming back for more. There's a term for it but my Psych 101 class was long ago. Same mechanism can convince a person that there is "something there" in a string of random results (this amp sounds better than the others, the chicken sacrifices finally ended your bad luck in love) when there isn't. We humans survive and thrive in part by recognizing patterns. Trouble is, sometimes we see them where they ain't.
('Scuse me...gotta find a chicken and a sharp knife before I send out these resumes...)
(Polkfan: great sig re: 'philes vs. music lovers!)
[EDIT] The "Masters on Audio" article is superb, and I hadn't seen it before. I just added a link to it on my Webpage. Hard to imagine a more succinct presentation of the facts. It and other lengthier and more technically oriented articles such as the Clark papers I mentioned and Doug Self's "Science and Subjectivism in Audio" (link to it from my "Audio Wisdom" Webpage, see signature below) should be more than sufficient for any rational person.
To make it easy on the ambitious ones here, some of these citations:
"Ten Years of A/B/X Testing", Clark, David L., Presented at the 91st AES Convention, Oct 91, Print #3167.
"High-Resolution Subjective Testing Using a Double-Blind Comparator", Clark, David, Journal of the Audio Engineering Society, Vol 30, no 5, May 82, pg 330-338.
"The Great Ego Crunchers: Equalized, Double Blind Testing", Shanefield, Daniel, Hi-Fidelity, Mar 80, pg 57-61.
This is about very long-term listening, several months, under DBT.
"Topological Analysis of Consumer Audio Electronics: Another Approach to Show that Modern Audio Electronics are Acoustically Transparent", Rich, David and Aczel, Peter, 99th AES Convention, 1995, Print #4053.
"The Great Debate: Is Anyone Winning?", Nousaine, Tom, Proceedings of the AES, 8th International Conference, 1990, page 117-120.