Wow, I got some of the facts mixed up. So much for hearing things through the grapevine and passing them along from a very boggled memory.
I just emailed Dr. Toole.
In actuality, they did their double-blind testing in controlled listening rooms, not 4pi rooms, which makes sense, since a 4pi room would measure well but sound horrible.
In addition, it was actually musicians who were the worst at picking out differences. He claims many of them unfortunately suffer from hearing loss. Their selected, trained listeners were the best at discerning differences, while experienced reviewers (though the sample size was small) were no better than average MP3-era college students at discerning sonic differences. The test was conducted with a panel of 280 listeners led by Sean Olive.
If you want to see the full report, you can get a copy of it from the AES 114th Convention, March 22-25, 2003.