Generally they don't measure in a box. Drivers are measured in an open baffle arrangement: the speaker is mounted on a fairly large baffle, large enough that the backwave doesn't arrive soon enough to affect the direct response, with the microphone 1 meter from the driver on its centreline, using a fast sweep tone in anechoic conditions.
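To put rough numbers on "large enough", here is a minimal sketch of the geometry. It assumes a point source at the centre of a circular baffle and a microphone on axis, which is a simplification of mine and not how any particular manufacturer specifies it. The function names and the 1/delay rule of thumb for the lowest usable frequency are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, roughly dry air at 20 degrees C (assumed)

def backwave_delay(baffle_radius_m, mic_distance_m=1.0):
    """Extra time the wave going around the baffle edge needs to reach the mic.

    Simplified model: the backwave travels from the driver to the baffle
    edge (one radius), then from the edge to the on-axis microphone.
    """
    direct = mic_distance_m
    around_edge = baffle_radius_m + math.hypot(baffle_radius_m, mic_distance_m)
    return (around_edge - direct) / SPEED_OF_SOUND

def lowest_resolved_frequency(delay_s):
    """Rule of thumb: a clean window of T seconds resolves down to about 1/T Hz."""
    return 1.0 / delay_s

# Hypothetical baffle radii, microphone at 1 meter
for radius in (0.25, 0.5, 1.0):
    delay = backwave_delay(radius)
    print(f"radius {radius} m: delay {delay * 1000:.2f} ms, "
          f"clean down to roughly {lowest_resolved_frequency(delay):.0f} Hz")
```

The bigger the baffle, the later the backwave shows up, and the lower in frequency the direct response stays clean.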
That's why you see such specifications as "free air resonance" and the like.
Also, mid and high frequencies are not measured the same way as low frequencies. At low frequencies the wavelength becomes large enough to require a different technique (typically a nearfield measurement instead of one taken 1 meter away), and the two graphs are then blended. Home speaker builders, not owning anechoic chambers, generally measure outdoors to keep room reflections from affecting the response.
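As a rough illustration of the blending step, the sketch below splices a nearfield curve onto a 1 meter curve at a chosen frequency. It assumes both curves are sampled at the same frequencies and simply level-matches the nearfield data at the splice point. Real measurement software does this more carefully, and the function name and sample numbers are made up for the example.

```python
def splice_responses(freqs_hz, near_db, far_db, splice_hz):
    """Use the nearfield curve below the splice point and the far curve above it.

    The nearfield curve is shifted by a constant so that both curves
    agree at the sample closest to the splice frequency.
    """
    # Index of the sample closest to the requested splice frequency
    idx = min(range(len(freqs_hz)), key=lambda i: abs(freqs_hz[i] - splice_hz))
    offset = far_db[idx] - near_db[idx]
    return [
        near_db[i] + offset if i < idx else far_db[i]
        for i in range(len(freqs_hz))
    ]

# Hypothetical data, not real measurements
freqs = [50, 100, 200, 400, 800]
near = [100, 104, 106, 107, 107]   # nearfield reads much hotter
far = [70, 80, 86, 88, 89]         # 1 m data, unreliable at the bottom end
print(splice_responses(freqs, near, far, splice_hz=200))
# → [80, 84, 86, 88, 89]
```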
What happens when you mount it in an enclosure is what those Thiele-Small parameters are for.
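For example, the standard sealed-box relations use three of those parameters (Fs, Qts, Vas) to predict how the resonance and Q shift once the driver is in a closed box of volume Vb. This is only a minimal sketch of the textbook closed-box formulas; the function name and driver values are hypothetical, and it ignores stuffing and losses.

```python
import math

def sealed_box(fs_hz, qts, vas_litres, vb_litres):
    """Predict in-box resonance (Fc) and total Q (Qtc) for a sealed enclosure.

    Textbook closed-box relations:
        Fc  = Fs  * sqrt(1 + Vas / Vb)
        Qtc = Qts * sqrt(1 + Vas / Vb)
    """
    factor = math.sqrt(1.0 + vas_litres / vb_litres)
    return fs_hz * factor, qts * factor

# Hypothetical driver: Fs = 40 Hz, Qts = 0.5, Vas = 30 L, in a 10 L box
fc, qtc = sealed_box(40.0, 0.5, 30.0, 10.0)
print(fc, qtc)
# → 80.0 1.0
```

A smaller box pushes both the resonance and the Q up, which is why the free air numbers alone don't tell you what the speaker does in an enclosure.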
If the speaker measured the same while holding it in your hand, with the microphone "some distance" away in a closed room, it would be a miracle.
There is always some unit-to-unit variation in cone drivers. Looking at your graphs, the speaker seems to be fine, as it tracks pretty closely in the areas where your technique is at least sort-of-correct. You should always break in cone drivers before measurement; they are stiff from the factory in every case.