I wonder too. A few years ago I did try to find info on exactly how it is done, but didn't find any. I can speculate on how it might be done though.
A cd player delivers audio in sectors of 2,352 bytes (not bits): once all the redundancy and CIRC error-correction bits are removed, what's left is the actual pcm samples. The extra 4 bits would have to be hidden among those samples, because if you play or rip the cd, the result sounds like a regular cd. So maybe it spreads them over 4 samples, using the least significant bit of each one as an additional bit for a 20 bit sample. That might work because in the case where you don't have an hdcd decoder, the least significant bit being 0 or 1 won't make a difference in what you hear.
Using 4 bits for brevity (lsb set apart):
sample1: 111 1 (15)
sample2: 101 0 (10)
sample3: 110 0 (12)
sample4: 111 0 (14)
sample5: 1011 (11) - accumulate the lsb from the prior 4 samples and use them to form an 8 bit sample:
1000 1011 (139). If you didn't have an hdcd decoder and the lsb of each sample didn't really 'belong' to that sample, it wouldn't make a difference in the sound because changing the value of sample1 from 15 to 14 (change the lsb to zero instead of 1) won't be audible. [of course in my example, sample values that low are inaudible anyway.]
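To make the 4-bit example above concrete, here is a minimal Python sketch of the scheme I'm guessing at. To be clear, this is only my speculation and not how hdcd is actually documented to work; the function name `combine` and the bit ordering (sample1's lsb becomes the most significant extra bit) are my own assumptions.

```python
def combine(carriers, target, width=4):
    """Speculative scheme only, not the real hdcd method.

    Collect the lsb of each carrier sample (the first carrier supplies
    the most significant extra bit) and prepend those bits to `target`,
    which is `width` bits wide.
    """
    extra = 0
    for sample in carriers:
        extra = (extra << 1) | (sample & 1)
    return (extra << width) | target


carriers = [0b1111, 0b1010, 0b1100, 0b1110]  # sample1..sample4: 15, 10, 12, 14
target = 0b1011                              # sample5: 11

extended = combine(carriers, target)
print(extended, format(extended, "08b"))     # 139 10001011

# What a plain player would hear if the lsbs had been zero instead:
# each carrier changes by at most 1.
plain = [s & ~1 for s in carriers]
print(plain)                                 # [14, 10, 12, 14]
```

With real 16 bit samples you would call it with `width=16`, which gives a 20 bit result.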
It may instead use the subchannel bits to store the extra bits. Of the eight subchannels (p through w), p and q carry track and timing info, but r through w aren't generally used on audio cds.
Who knows, I'm just speculating. Since hdcd is patented, we may be able to get some info by going to http://www.uspto.gov and looking for hdcd patents.