Hey, don't knock the harpsicord, bub!

I actually do listen to that instrument a lot (even if I can't spell it), and I have several hundred classical CDs. I'd say piano & cello are also stern tests for a codec. Again, I have no experience with P2P stuff; all the MP3s I've used have been ripped directly from RBCDs in my own possession. The only results I've been very pleased with have been at 320 kbps & the highest-rate VBR. But then, I'm no expert on MP3, and I've only tried Nero, Music Match Jukebox, & EAC. I did try the DMX plugins, but I didn't really detect any difference. There are probably better encoders. As I said, my interest in MP3 was short-lived, and ultimately I just don't really need it.
BTW, one use for MP3 is to put many hours of music on a disc, but I don't see that as desirable. Again, with no good way to change the order of the songs from alphabetical, 6 hours is just not necessary. I did make some MP3 discs where I changed the song names to incorporate a numerical prefix so they played in a specific order of my choosing. But man, that's a lot of work, and not practical on a daily basis.
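For what it's worth, the numerical-prefix renaming could probably be scripted rather than done by hand. Here is a rough Python sketch, not something I've run against a real disc layout. The function names (`numbered_names`, `apply_order`), the "01 - " prefix style, and the example file names are all made up for illustration, and it assumes the player really does sort by file name.

```python
from pathlib import Path


def numbered_names(ordered_names, width=2):
    """Pair each file name with a zero-padded position prefix.

    `ordered_names` is the play order you want; the prefix makes
    alphabetical sorting reproduce that order.
    """
    return [
        (name, f"{i:0{width}d} - {name}")
        for i, name in enumerate(ordered_names, start=1)
    ]


def apply_order(folder, ordered_names, width=2):
    """Rename the listed files inside `folder` (hypothetical helper)."""
    folder = Path(folder)
    for old, new in numbered_names(ordered_names, width):
        (folder / old).rename(folder / new)


if __name__ == "__main__":
    # Dry run: only print what the new names would be.
    for old, new in numbered_names(["Sarabande.mp3", "Allemande.mp3"]):
        print(old, "->", new)
```

With more than 99 tracks you'd want `width=3` so that the zero padding still sorts correctly.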
I know of no tests or studies of listener fatigue. As you allude to, it would be a complex thing to attempt to subject to DBT. ABX-style testing is great for determining if there's a JND between two things, but it's problematic for trying to diagnose an effect that may take hours to experience. To set up a test you'd have to know what you were looking for.
It might be a bit like listening to your mother-in-law.

A short period of time doesn't seem so bad, but an hour or so and you're ready to puncture your eardrums with an ice pick.