Uncovering the real size of the Rituximab effect

Blog entry posted by oceanblue, Jan 15, 2012.

People who followed the Rituximab thread closely may have seen this already.

Reported results appear unimpressive
Despite all the hoopla from patients and authors around last autumn's Rituximab study, the published results were not that striking. The primary endpoint, fatigue score at 3 months, was negative, and even the highest mean fatigue score (at 6-8 months) of approximately 3.9 equates to only a 'slight improvement' (see fig 2B). The problem was that for most patients the effect was only transient, and the time to maximum effect varied from patient to patient: if some patients peaked at 3 months and others at 9 months, that brings down the average at any particular timepoint.

Finding the real size of the effect
What would give the clearest picture of the size of the Rituximab effect would be to see the average of the peak improvement for each patient. In fact, the study does give this information - which suggests startling improvements with Rituximab - but it's buried in Table 4 and needs a bit of decoding. Bear with me.

SF36 Physical Function is the key measure
Table 4 shows SF-36 baseline scores and maximum change (as a percentage) - and that's nearly the figure we really want. The SF-36 is a questionnaire that is widely used to measure different aspects of patient health, both physical and mental. The questionnaire uses a 0-100 scale, with 100 being the best and 0 being the worst. Physical Function (PF) is the most interesting subscale, as impaired function is a defining feature of ME/CFS, and PF has been used in several clinical trials, including the two largest ever conducted: PACE (which found roughly an 8 point gain for CBT/GET) and FINE (which found no gain at all).
So here are those physical function scores (as per Table 4):
Rituximab group: baseline = 34, max change = 39% (est.* max score = 47.3), gain = 13.3
Placebo group: baseline = 35, max change = 11% (est.* max score = 38.9), gain = 3.9

* read the small print box if you want to know the geeky details of the estimates

Now we are getting somewhere: the peak gains for Rituximab patients were 39% vs 11% for controls, a net gain of 28 percentage points, or an improvement of 9.4 points. Not bad, but also misleadingly low.
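The estimates above are plain percentage arithmetic. Here is a minimal Python sketch of it, using only the baseline and maximum-change figures quoted from Table 4; the function name is my own and not something from the paper:

```python
def estimated_peak(baseline, max_change_pct):
    """Estimate the peak score and the point gain from a baseline
    score and a maximum change expressed as a percentage."""
    gain = baseline * max_change_pct / 100
    return baseline + gain, gain

# Figures as quoted from Table 4 (norm-based scores)
ritux_peak, ritux_gain = estimated_peak(34, 39)
placebo_peak, placebo_gain = estimated_peak(35, 11)

print(f"Rituximab: est. peak {ritux_peak:.2f}, gain {ritux_gain:.2f}")
print(f"Placebo:   est. peak {placebo_peak:.2f}, gain {placebo_gain:.2f}")
print(f"Net gain over placebo: {ritux_gain - placebo_gain:.2f} points")
```

The figures in the post are these values rounded to one decimal place.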

Last bit of geekery, honest

The Rituximab paper uses something called 'norm-based scoring', which basically relates the actual 0-100 score to the average score in a population. For someone on the average population score of 86, their score would be 'norm-transformed' to 50. Huh? Confusing, isn't it? So for clarity I've transformed the scores back to the original 0-100 scale.
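For anyone who wants to see the mechanics, here is a rough Python sketch of a norm-based transformation and its inverse. Norm-based scores are scaled so that the population mean becomes 50 and one population standard deviation becomes 10 points. The population mean of 86 is the figure quoted above; the standard deviation of 25 is purely an illustrative assumption, so this sketch is not expected to reproduce the paper's numbers exactly:

```python
def to_norm_based(raw, pop_mean, pop_sd):
    """Map a raw 0-100 score onto the norm-based scale
    (population mean -> 50, one population SD -> 10 points)."""
    return 50 + 10 * (raw - pop_mean) / pop_sd

def to_raw(norm_score, pop_mean, pop_sd):
    """Invert the transformation, back to the 0-100 scale."""
    return pop_mean + pop_sd * (norm_score - 50) / 10

POP_MEAN = 86  # average population score quoted in the post
POP_SD = 25    # ASSUMPTION: illustrative value only, not from the paper

print(to_norm_based(86, POP_MEAN, POP_SD))  # population average maps to 50.0
```

Each norm-based point corresponds to `POP_SD / 10` points on the raw scale, which is why a change that looks modest in norm-based terms can be a much larger change on the 0-100 scale.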

Those crucial physical function scores (the proper ones, on the original 0-100 scale):
Rituximab: baseline 44.9, mean max score 76.5; gain = 31.6 (or 70%)
Placebo: baseline 47.3, mean max score 56.5; gain = 9.2 (or 19%)

Now we see that, relative to the placebo group, the Rituximab group made a peak gain of about 22 points (or roughly 50 percentage points). Woohoo! That is something to get excited about, and rather more interesting than the 'slight improvement' reported in fatigue scores. Remember, PACE only managed a gain of around 8 points on the same 0-100 scale, and was an unblinded study with no placebo control - unlike the Rituximab study - which means there is a danger of self-report bias. However, note that these Rituximab gains were only temporary.

To really push these figures, you could allow for the fact that 4 of the 13 Rituximab patients with SF-36 data were non-responders, so the responders must have improved by even more than the group average. Assuming non-responders improved by as much as placebo patients gives an SF-36 Physical Function score of 86.4 for responders, and 85 is probably an acceptable score for a healthy person (it's the PACE protocol recovery threshold).
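One way to reconstruct that responder estimate is sketched below in Python. It rests on my reading of the assumption above, namely that the 4 non-responders gained the same number of points as the placebo group, and it works from the rounded figures quoted in this post, so treat it as an illustration and not the exact original calculation:

```python
N_TOTAL = 13           # Rituximab patients with SF-36 data
N_NON_RESPONDERS = 4
N_RESPONDERS = N_TOTAL - N_NON_RESPONDERS

ritux_baseline = 44.9
ritux_mean_peak = 76.5          # mean peak score across all 13 patients
placebo_gain = 56.5 - 47.3      # placebo peak minus placebo baseline

# ASSUMPTION: non-responders gain the same number of points as placebo patients
non_responder_peak = ritux_baseline + placebo_gain

# The group mean is a weighted average of responders and non-responders,
# so solve that weighted average for the responders' mean peak.
responder_peak = (
    N_TOTAL * ritux_mean_peak - N_NON_RESPONDERS * non_responder_peak
) / N_RESPONDERS

print(f"Estimated responder peak: {responder_peak:.2f}")
```

With these rounded inputs the result comes out within a tenth of a point of the 86.4 quoted above; the small difference presumably comes from rounding in the inputs.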

Using peak scores for responders only is possibly cherry-picking in the extreme (particularly as some of these figures are estimates, calculated from the available data). Nonetheless, it does suggest a dramatic response to Rituximab in this study, albeit temporary. Now all we need is replication of the findings.

Credits: Thanks to Dolphin and Snow Leopard who unearthed crucial parts of this information
  1. oceanblue
    Ah, thanks very much for that Ocean. That explains the study for me at last!
    I haven't been following the Rituximab thread, but I had been wondering why the results of the study were so confusing.
    The results looked good when described in words, but the statistics didn't look impressive at all.
    So thanks again for explaining the reasons for this.
    Bob
    Thanks, Bob! Think you described the situation perfectly in the line I highlighted.
  2. Bob
    Ah, thanks very much for that Ocean. That explains the study for me at last!
    I haven't been following the Rituximab thread, but I had been wondering why the results of the study were so confusing.
    The results looked good when described in words, but the statistics didn't look impressive at all.
    So thanks again for explaining the reasons for this.
    Bob
  3. oceanblue
    The PACE gain was not norm-based but uses the traditional 0-100 scale, i.e. the same one that showed a 22 point gain for Rituximab versus placebo.

    There's an explanation of norm-based scoring here, but basically:
    In [Norm-based scoring], each scale is scored to have the same average (50) and the same standard deviation (10), meaning each point equals one-tenth of a standard deviation. Without referring to tables of norms, this method makes it clear that whenever an individual respondent's scale score is below 45, or a group mean scale score is below 47, health status is below the average range. As shown in Figure 7.3, with norm-based scoring, differences in scale scores much more clearly reflect the impact of the disease - in this example, the impact of asthma. Using NBS, clinicians can more quickly and appropriately interpret the effect of asthma on an SF-36v2 Health Survey profile.
    Norm-based scoring is helpful when you are looking at the results of several different SF-36 subscales, e.g. Physical Function, Pain and Social Function: you can quickly see which scores are normal (i.e. close to 50) and which ones are abnormal (well below 50). However, it's confusing when trying to understand changes within one subscale, e.g. the difference in Physical Function between a trial drug and placebo.

    Hope that helps. Or maybe it's now clear why I didn't try to explain it in the first place!
  4. Purple
    Thank you Oceanblue. I will have to study in more detail what 'norm-based' means to fully understand. Was the PACE trial 8 point gain for GET/CBT norm-based? (And if not, what would be the norm-based number?)
  5. Dolphin
    Credits: Thanks to Dolphin and Snow Leopard who unearthed crucial parts of this information
    Thanks oceanblue.

    Well done on your own work on this and for summarising it here.
    Your blog should turn out to be a useful resource!