Acoustic memory is a very interesting thing. From what I have gathered from my reading, the average person's auditory memory lasts about 3-4 seconds, but one can train to increase this threshold (the brain is an amazing thing).
To expand on what mudrammer99 said, there are a few reasons why those who are knowledgeable recommend certain methods of audition over others.
First, many recommend using well-known media. This is because, as previously stated, we assign specific attributes to certain situations, so we "know" how the material is supposed to sound. This method is the simplest and most achievable for the average person - just rely on what you know and make a good match.
The next step would be an A/B comparison, blinded if possible [using one's own room is ideal because it takes the hugely prevalent room effects out of the comparison]. With this method, level matching should be taken into account, since various research shows that loudness correlates with what is perceived as sounding better. This method adds more control to the situation, as it allows for a possible full ABX setup where you choose which you like better without knowing which is which.
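For anyone who wants to put numbers on the level matching and the blind trials, here is a rough sketch in Python. It is only an illustration under my own assumptions: it uses plain RMS level as a stand-in for loudness (real loudness perception is more complicated), it assumes the clips are non-silent lists of samples between -1.0 and 1.0, and the function names are made up for this example.

```python
import math


def rms_db(samples):
    """RMS level of a clip in dB relative to a full-scale value of 1.0.

    Assumes the clip is not pure silence (log10 of zero is undefined).
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def matching_gain_db(reference, candidate):
    """Gain in dB to apply to `candidate` so its RMS level equals `reference`'s."""
    return rms_db(reference) - rms_db(candidate)


def abx_p_value(correct, trials):
    """Chance of getting at least `correct` of `trials` right by pure guessing.

    Each blind trial is treated as a fair coin flip (one-sided binomial test).
    """
    favourable = sum(math.comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical clips: the candidate is recorded at half the amplitude.
reference = [0.5, -0.5, 0.5, -0.5]
candidate = [0.25, -0.25, 0.25, -0.25]

print(round(matching_gain_db(reference, candidate), 2))  # 6.02 (dB of boost needed)
print(round(abx_p_value(9, 10), 4))                      # 0.0107
```

In this made-up case the quieter clip needs about 6 dB of boost before the comparison is fair, and getting 9 out of 10 blind trials right would happen by guessing only about 1% of the time.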
The method I prefer [and am starting to use] is taking a pair of headphones with me and using them as a reference. I picked these headphones for the linearity of their response as well as their near-complete lack of resonance. By SPL matching by ear (the quick and dirty way of doing things), I can surmise the actual response of the speaker as well as the coloration due to the cabinet.
As you can see, all of these methods require the brain to use its auditory memory, but not directly. Instead we rely on the brain to do what it is best at, storing information. The hard part is learning to recall that information properly, so training might be required.
I do not have any of the papers I have read on hand, but I will try to look them up and post a few links later today. A good cognitive psychology book is always a great resource.
Edit: Basically, from what I have learned in my classes and read, when we are listening to music, especially critically, we are using our working memory. During this working stage our brain is actively decoding the information sent to it and storing it, focusing on specific details as directed [by ourselves, of course]. The retrieval process is the hardest part for us to do correctly, because oftentimes after the fact we have already started to add our own distortions/preferences to the mix, which is where our memories become inaccurate [this is why DBTs are so important to credible research]. Part of the reason auditory memory is so short is the way our brain is wired. When we are storing information, it is rare that everything is stored. Only the parts we cue ourselves to store are kept in anything close to complete form, and the rest is lost.