This is an interesting but fairly pointless discussion. We're throwing around four different terms that combine into four different kinds of measurement, sometimes in the same post or even the same sentence.
There's "Peak" and "Average", then "dBA" and "dBC". That produces these four combinations, which will read very differently for the same signal. They are shown here in decreasing order of maximum reading:
dBC Peak
dBC Average
dBA Peak
dBA Average
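For anyone wondering why the A and C readings differ so much, here's a rough Python sketch of the two weighting curves. It uses the standard analog A- and C-weighting formulas as I understand them from IEC 61672; the function names are mine, and it only shows the gain at single frequencies, not what a real meter does with a full signal.

```python
import math

def a_weighting_db(f):
    """Approximate A-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    # +2.00 dB normalizes the curve to about 0 dB at 1 kHz
    return 20.0 * math.log10(ra) + 2.00

def c_weighting_db(f):
    """Approximate C-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    rc = (12194.0**2 * f2) / ((f2 + 20.6**2) * (f2 + 12194.0**2))
    # +0.06 dB normalizes the curve to about 0 dB at 1 kHz
    return 20.0 * math.log10(rc) + 0.06

if __name__ == "__main__":
    for freq in (31.5, 63.0, 125.0, 250.0, 1000.0, 4000.0):
        print(f"{freq:7.1f} Hz  A: {a_weighting_db(freq):6.1f} dB  "
              f"C: {c_weighting_db(freq):6.1f} dB")
```

The A curve rolls off the low end heavily while the C curve stays nearly flat there, so on a signal with a lot of bass energy the dBA number comes out lower than the dBC number.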
Then we get into the actual integration time of both the "peak" and "average" functions, which will further skew the numbers.
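To illustrate the integration time point, here's a minimal Python sketch. It assumes a simple exponential averager with 125 ms ("fast") and 1 s ("slow") time constants, which is the usual convention as far as I know, and a made-up 0.2 second tone burst as the test signal. Real meters differ in detail, so treat it as an illustration only.

```python
import math

def time_weighted_max_db(samples, fs, tau):
    """Highest level (dB re full-scale mean square of 1.0) shown by an
    exponential averager with time constant tau seconds."""
    alpha = 1.0 - math.exp(-1.0 / (fs * tau))
    mean_sq = 0.0
    max_mean_sq = 0.0
    for x in samples:
        # one-pole low-pass filter on the squared signal
        mean_sq += alpha * (x * x - mean_sq)
        max_mean_sq = max(max_mean_sq, mean_sq)
    return 10.0 * math.log10(max_mean_sq)

def tone_burst(fs=8000, total_s=2.0, burst_s=0.2, freq=1000.0):
    """Hypothetical test signal: a short full-scale sine burst, then silence."""
    n_total = int(fs * total_s)
    n_burst = int(fs * burst_s)
    return [math.sin(2.0 * math.pi * freq * n / fs) if n < n_burst else 0.0
            for n in range(n_total)]

if __name__ == "__main__":
    fs = 8000
    signal = tone_burst(fs=fs)
    fast = time_weighted_max_db(signal, fs, 0.125)
    slow = time_weighted_max_db(signal, fs, 1.0)
    print(f"fast max: {fast:.1f} dB, slow max: {slow:.1f} dB")
```

Same burst, same meter, and the two settings disagree by several dB, because the slow averager never has time to catch up with a short event.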
If we're going to have any meaningful comparison, how about we all stick to the same type of measurements? And, as long as we're at it, for any comparison to be valid we should all use the same test signal.
So let me suggest we start this over. This time, set your volume control where you usually listen. Then, using an SPL meter set for dBC Average (slow), play this file (burn it to a disc if you need to), record the result, and post it here.
This is a 15 second -20dBFS uncorrelated pink noise FLAC file:
https://filetea.me/t1saMckK415SgxTXEQRyHYqMw
This is a 15 second -20dBFS uncorrelated pink noise .mp3 file:
https://filetea.me/t1se3gnsX8jSo2x5HDwp4eNIg
Use whatever's easier.
If we now all measure our systems the same way, we'll have something we can talk about. True, we'll still have meter calibration issues, but that shouldn't be more than +/- 2dB. What we will have is an average level measurement with a -20dBFS signal. Since we know that film tracks are mixed at 85dB SPL with 20dB of headroom, we know that we can get peaks up to 105dB SPL before we run out of headroom. If our test files measure at 85dBC, then we know our maximum peak will be at 105dBC, etc. This ignores woofer cal to 115dB, of course, but that's not about loudness anyway.
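The arithmetic above is simple enough to sketch in a few lines of Python. The 85dB reference and 20dB headroom are the figures from this post; the function names and the 75dBC example reading are made up for illustration.

```python
REFERENCE_SPL_DB = 85.0   # film mix reference level for a -20dBFS signal
HEADROOM_DB = 20.0        # headroom above the -20dBFS test signal

def max_peak_spl(measured_db, headroom_db=HEADROOM_DB):
    """Highest peak SPL the system can be asked for at this volume setting."""
    return measured_db + headroom_db

def offset_from_reference(measured_db, reference_db=REFERENCE_SPL_DB):
    """How far the listening level sits above (+) or below (-) reference."""
    return measured_db - reference_db

if __name__ == "__main__":
    reading = 75.0  # hypothetical dBC slow reading with the test file
    print(f"max peak: {max_peak_spl(reading):.0f} dBC")
    print(f"offset from reference: {offset_from_reference(reading):+.0f} dB")
```

So someone who measures 75dBC with the test file is listening 10dB below reference, with peaks topping out around 95dBC.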