I can't remember whether this has been highlighted or not:
One of their references is:
Guyatt GH, Osoba D, Wu AW, et al. Methods to explain the clinical
significance of health status measures. Mayo Clinic Proceedings
2002; 77: 371–83.
Free at:
http://www.mayoclinicproceedings.com/content/77/4/371.long
The proportion of patients achieving a particular benefit,
be it a small, moderate, or large difference, is therefore
much more relevant than a mean difference from the
clinician’s point of view and less likely to mislead. To
calculate the proportion who achieve a MID, one must
consider not only the difference between groups in those
who achieve that improvement but also the difference between
groups in those who deteriorate by the same amount.
(My comment: we weren't given that proportion, i.e. those who got worse, in the paper when they told us how many went up by 8 points on the SF-36 PF and/or by 2 points on the Chalder fatigue scale.)
One must therefore classify patients as improved, unchanged, or deteriorated. In a parallel group trial, the subsequent calculation is not altogether straightforward, and 1
approach involves assumptions about the joint distribution
of responses in the 2 groups [13]. Statisticians are developing
alternative approaches to this problem, several of which are
likely to prove reasonable [58]. What is not reasonable is
simply to present mean values without taking the second
step that is necessary for clinicians to interpret clinical trial
results effectively.
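
To make the classification step concrete, here is a minimal Python sketch (my own illustration, not anything from the paper). It labels each patient's change score as improved, unchanged, or deteriorated against a MID and reports the proportions in each arm. The change scores, the MID of 8, and the function names are all made up, and it assumes higher scores mean better health. It only tabulates the proportions; it does not attempt the joint-distribution calculation the authors mention.

```python
from collections import Counter

CATEGORIES = ("improved", "unchanged", "deteriorated")


def classify_changes(changes, mid):
    """Label each change score relative to the MID.

    Assumption: higher scores are better, so a change of +mid or more is
    'improved' and a change of -mid or less is 'deteriorated'.
    """
    labels = []
    for change in changes:
        if change >= mid:
            labels.append("improved")
        elif change <= -mid:
            labels.append("deteriorated")
        else:
            labels.append("unchanged")
    return labels


def proportions(changes, mid):
    """Return the share of patients in each of the three categories."""
    counts = Counter(classify_changes(changes, mid))
    n = len(changes)
    return {category: counts.get(category, 0) / n for category in CATEGORIES}


# Hypothetical change scores (follow-up minus baseline), illustration only.
treatment = [12, 9, 8, 3, 0, -2, -8, 15, 5, -10]
control = [8, 2, 0, -1, -9, 10, -3, 4, -8, 1]
MID = 8  # hypothetical minimal important difference

p_treatment = proportions(treatment, MID)
p_control = proportions(control, MID)

for category in CATEGORIES:
    difference = p_treatment[category] - p_control[category]
    print(f"{category}: treatment={p_treatment[category]:.2f}, "
          f"control={p_control[category]:.2f}, difference={difference:+.2f}")
```

With these made-up numbers, 40% of the treatment arm and 20% of the control arm improve by the MID, while 20% of each arm deteriorate by the same amount. That second comparison is the one the quote says must be reported alongside the first.
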
Distribution-based methods have, in general, 2 fundamental
limitations. First, estimates of variability will differ
from study to study. For instance, if one chooses the between-
patient standard deviation, one has to confront its
dependence on the heterogeneity of the population under
study. If a trial enrolls an extremely heterogeneous population,
an important effect may be small in terms of the
between-person standard deviation and thus judged trivial.
The same effect size, in a trial that enrolls an extremely
homogeneous population, may be large in terms of the
between-person standard deviation, and thus judged extremely
important. The true impact of the change remains
the same, but the interpretation differs radically.
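
A quick sketch of that point in Python, again my own illustration with made-up numbers: the same raw mean difference is divided by two different between-person standard deviations, one for a hypothetical heterogeneous trial population and one for a hypothetical homogeneous one. The function name and all values are assumptions for illustration.

```python
def standardized_effect(mean_difference, sd):
    """Express a mean difference in between-person standard deviation units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return mean_difference / sd


# Hypothetical values: identical raw effect, different population spread.
mean_difference = 5.0    # same raw improvement in both trials
sd_heterogeneous = 25.0  # wide spread of baseline scores
sd_homogeneous = 8.0     # narrow spread of baseline scores

es_heterogeneous = standardized_effect(mean_difference, sd_heterogeneous)
es_homogeneous = standardized_effect(mean_difference, sd_homogeneous)

print(f"heterogeneous trial: {es_heterogeneous:.3f} SD units")
print(f"homogeneous trial:   {es_homogeneous:.3f} SD units")
```

The raw 5-point difference is identical in both cases, yet it comes out as 0.2 SD units in the heterogeneous trial and 0.625 SD units in the homogeneous one.
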
There are at least 2 ways to deal with this problem. One
is to choose the variability from a particular population,
such as the standard deviation of a measure when applied to
the general population at a point in time, and always refer to that same measure of variability. The second is to choose
the standard error of measurement (which we will discuss
subsequently), which is theoretically sample independent.
[..]
BETWEEN-PERSON STANDARD DEVIATION UNITS
The most widely used distribution-based method to date is
the between-person standard deviation. The group from
which this is drawn is typically the control group of a
particular study at baseline or the pooled standard deviation
of the treatment and control groups at baseline. As we have
mentioned herein, an alternative is to choose the standard
deviation for a sample of the general population or some
particular population of special interest, rather than the
population of the particular treatment study under consideration.
An advantage of this approach is that it has been
applied widely in areas of investigation other than QOL.
In case people forget, they chose the baseline standard deviations rather than population standard deviations.
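
For anyone unsure what a pooled baseline standard deviation involves, here is a minimal Python sketch using the usual pooled-variance formula (each arm's sample variance weighted by its degrees of freedom). The baseline scores and the function name are hypothetical; I am not claiming this is exactly how any particular trial computed its figure.

```python
import math
import statistics


def pooled_sd(group_a, group_b):
    """Pooled standard deviation of two groups.

    Weights each group's sample variance by its degrees of freedom (n - 1).
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return math.sqrt(pooled_variance)


# Hypothetical baseline scores for the two arms, illustration only.
treatment_baseline = [30, 35, 40, 45, 50]
control_baseline = [32, 38, 40, 42, 48]

baseline_sd = pooled_sd(treatment_baseline, control_baseline)
print(f"pooled baseline SD: {baseline_sd:.2f}")
```

If trial participants are selected to fall within a narrow range of scores at entry, a baseline SD computed this way will be smaller than a general-population SD, which is the concern raised in the quoted passage about homogeneous populations.
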