
My protocol is as follows:
1) Connect the set as per the diagram above, with the artificial antenna between the set and the generator. For my tests I always use a 1N34A diode in the radio for consistency between tests.
2) I typically measure my set at a nominal 1100 kHz, so first set the signal generator to this frequency. I generally set the generator output at 2 V rms peak to peak.
3) With the generator running, I tune the radio while watching the DVM output and peak the output as carefully as possible. I optimize the tap and other settings where possible at this stage for the highest signal output.
4) This step was added after considerable frustration at finding asymmetrical results. After peaking the radio tuning, I return to the generator and adjust the frequency until the DVM shows the maximum output voltage. This is generally not the nominal 1100 kHz, but pretty close. This is the resonance frequency (f res). I was relieved to find that Lauter recommends the same protocol.
5) Record the frequency (f res), and radio output voltage (dc-Vout) from the DVM.
6) Multiply the dc-Vout by 0.707 (-3 dB), then adjust the frequency of the generator until the DVM reads this calculated Vout, once above f res (f high) and once below it (f low). Record these values. f high - f low gives the radio's bandwidth at that frequency. Radio Q = f res divided by the bandwidth.
7) For a nice graph I also take Vout readings with the generator set to frequencies 25 and 50 kHz above and below f res.
8) For set sensitivity, I take an expedient and non-standard approach. I divide the dc Vout at f res by the peak-peak rms V in from the generator (typically 2 V rms), expressed conveniently as a percent.
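The arithmetic in steps 5 to 8 can be sketched in a few lines of Python. This is only an illustration of the calculations described above, not the spreadsheet itself; the function name, argument names, and the numbers in the example call are made up for the sketch and are not measurements from any of my sets.

```python
def three_db_summary(f_res_khz, f_low_khz, f_high_khz, v_out_dc, v_in):
    """Summarize one set from the -3 dB readings (steps 5 to 8).

    All names here are hypothetical. Frequencies are in kHz; v_out_dc
    and v_in must be in the same voltage unit as each other.
    """
    if not (f_low_khz < f_res_khz < f_high_khz):
        raise ValueError("expected f low < f res < f high")
    if v_in <= 0:
        raise ValueError("generator voltage must be positive")

    # Step 6: the DVM reading to hunt for on either side of f res.
    v_target = 0.707 * v_out_dc

    # Step 6: bandwidth and loaded Q.
    bandwidth_khz = f_high_khz - f_low_khz
    q_loaded = f_res_khz / bandwidth_khz

    # Step 8: the expedient, non-standard sensitivity figure, in percent.
    sensitivity_pct = 100.0 * v_out_dc / v_in

    return {
        "v_target": v_target,
        "bandwidth_khz": bandwidth_khz,
        "q_loaded": q_loaded,
        "sensitivity_pct": sensitivity_pct,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not real readings.
    summary = three_db_summary(1100.0, 1075.0, 1125.0, v_out_dc=0.2, v_in=2.0)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Note that v_target is only needed at the bench, to know what DVM reading to look for; the bandwidth and Q come from the two frequencies recorded once that reading is found.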
So, how do the sets stack up? Following the protocol outlined above, I fill out a small spreadsheet for each set with the needed measurements and a clever graphic. I do this for the radio in one or more configurations, and I will admit that things are close but not always optimized as well as they could be. The following figure illustrates the output for my Homebrew set with the main coils spaced 18 cm apart.

The upper left of the spreadsheet tabulates the mV output at a set of frequencies, along with the calculated ratios and signal dB, which are plotted in the graph at the bottom of the sheet. The box on the upper right holds the summary data for the 3 dB technique: the bandwidth, QL, and Sensitivity. I summarize all my sets in terms of QL vs Sensitivity.
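For anyone who wants to build a similar table without a spreadsheet, here is a rough Python sketch of the ratio and dB columns. It assumes each ratio is the reading divided by the peak reading, and that dB is 20 times the log of that voltage ratio; that is the usual convention for voltages, but treat it as an assumption here. The function name and the sample readings are invented for illustration.

```python
import math


def response_table(readings_mv):
    """Build (frequency, mV, ratio, dB) rows from a {kHz: mV} mapping.

    Assumptions: ratio = reading / peak reading, and
    dB = 20 * log10(ratio), the usual voltage-ratio convention.
    """
    if not readings_mv:
        raise ValueError("no readings given")
    if any(mv <= 0 for mv in readings_mv.values()):
        raise ValueError("readings must be positive to take a logarithm")

    peak_mv = max(readings_mv.values())
    rows = []
    for freq_khz in sorted(readings_mv):
        mv = readings_mv[freq_khz]
        ratio = mv / peak_mv
        rows.append((freq_khz, mv, ratio, 20.0 * math.log10(ratio)))
    return rows


if __name__ == "__main__":
    # Invented readings at f res and 25 and 50 kHz either side of it.
    sample = {1050: 10.0, 1075: 50.0, 1100: 100.0, 1125: 50.0, 1150: 10.0}
    for freq_khz, mv, ratio, db in response_table(sample):
        print(f"{freq_khz:5d} kHz  {mv:7.1f} mV  {ratio:5.2f}  {db:6.1f} dB")
```

The peak reading always comes out at 0 dB, so the plotted curve shows how far down the set is at each offset from resonance.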

The above figure plots set loaded Q (QL) versus set sensitivity. It gives a first overall picture of the disconnect between sensitivity and selectivity: sets have one or the other, but never both! Kenneth Kuhn's Crystal Radio Engineering Web Book notes that crystal radios at 1 MHz will generally have loaded Q's between 100 (QLmax) and 20 (QLmin), with a QL of 50 to be looked for but not realistically to be had. The plot shows the majority of my sets at the low end, around QL = 20. Recall that most of my sets are single-tuned and/or single-solenoid sets, so this low Q is to be expected. These are clearly NOT high-performance DX rigs but rather typical home-fun listening sets. The higher QL's for my Homebrew set result from its double-tuned circuit.

Photo of the lab with my signal generator, artificial antenna, and Keithley bench meters.