'Recovery' from chronic fatigue syndrome after treatments given in the PACE trial

Discussion in 'Latest ME/CFS Research' started by Sam Carter, Jan 31, 2013.

  1. Esther12

    Esther12 Senior Member

    Messages:
    5,105
    Likes:
    4,905
    And the Rituximab was a much smaller, early study.

    A less pleasing comparison for Rituximab would be with early CBT trials which seemed to have serious flaws, but which their researchers claimed showed really big improvements. Hopefully the double blind nature of the Rituximab trial will mean that it's more likely to lead on to something worthwhile.

    That's increasingly what I find myself thinking with the biopsychosocial stuff. If they were cheeky scamps in a novel I'd be impressed and respectful of their daring. Unfortunately they're real people who are in important positions of influence over how I am treated, which makes it all rather less fun.

    Also - they've been making claims to patients about 'recovery' for a rather long time. But it's only after they saw how bad the results from PACE were that they decided they had to come up with a new definition for it? What did they mean by 'recovery' prior to this new paper? What did they think their patients thought they meant? I think it's fair to assume that they were using the definition of recovery which they laid out in their protocol, and that's one of the reasons why it is so important that this data is released.
  2. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
  3. biophile

    biophile Places I'd rather be.

    Messages:
    1,350
    Likes:
    3,993
    Additional assessments ...

    Bimodal scoring for CFQ (fatigue) makes recovery thresholds somewhat unreliable

    The CFQ has 11 questions. There are 4 answers: "less than usual", "no more than usual", "more than usual", and "much more than usual" (different wording for the memory question ie better/worse rather than less/more).

    Bimodal scoring means either "less/better than usual" and "no more/worse than usual" score 0, while either "more/worse than usual", and "much more/worse than usual" score 1. Hence the range is 0-11. Likert scoring means the answers are scored 0,1,2,3 respectively, hence the range of 0-33 with 11 as a neutral score.

    PACE required >=6/11 for trial entry and apparently therefore <6/11 for recovery from Oxford criteria CFS. However, it is possible to be at 6/11 or so points at entry, get worse in several questions but better in one or two questions, then be classed as recovered from CFS!
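
    To make the bimodal point concrete, here is a minimal sketch in Python. The answer patterns are made up for illustration (they are not PACE data), and the 0-3 coding of the four answers is the Likert coding described above.

```python
# Hypothetical answers to the 11 CFQ items, coded 0-3:
# 0 = "less than usual", 1 = "no more than usual",
# 2 = "more than usual", 3 = "much more than usual".

def bimodal_score(answers):
    """Bimodal scoring: answers 0 or 1 score 0, answers 2 or 3 score 1 (range 0-11)."""
    return sum(1 for a in answers if a >= 2)

def likert_score(answers):
    """Likert scoring: each answer keeps its 0-3 value (range 0-33)."""
    return sum(answers)

# Made-up participant at entry: six items "more than usual", five "no more than usual".
baseline = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]
# Made-up follow-up: five items worsen to "much more than usual",
# one item improves to "no more than usual".
followup = [3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1]

print(bimodal_score(baseline), likert_score(baseline))  # 6 17
print(bimodal_score(followup), likert_score(followup))  # 5 21
```

    In this made-up case the bimodal score drops from 6 to 5 (below the apparent <6/11 threshold) while the Likert total rises from 17 to 21, i.e. fatigue is worse overall.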

    Comparing the old with the new

    All these were compulsory. This was abandoned. In the post-hoc definition of recovery, only the "normal range" (which overlaps with entry criteria for "disabling fatigue" and even allowed a participant to be worse) is compulsory.

    The next (optional) criterion in the composite for recovery is no longer meeting Oxford criteria, which apparently (not absolutely clear) also includes scoring <=5 on the CFQ (bimodal scoring) for fatigue, and scoring >=70 in physical function. In other words, it is possible to be no longer meeting Oxford criteria because of these thresholds, not necessarily because the participant would not be arbitrarily diagnosed with Oxford criteria elsewhere if these thresholds were not bolted on. Furthermore, Oxford requires fatigue to be the principal symptom, so if other symptoms get worse than fatigue, you cannot meet Oxford criteria? Adding this on does almost nothing to the recovery rates despite presumably raising the threshold for good physical function by 10 points, which seems strange (unless they did not do this). Also note that scoring 4 or 5 in fatigue was regarded in the original protocol as "abnormal" or excessive fatigue.

    The next (optional) criterion in the composite for recovery is a clinical global impression (CGI) score of 2 "much better" or 1 "very much better". Adding this on does reduce the recovery rates somewhat, suggesting that some of the previously recovered (according to the earlier criteria) did not feel much better or any better at all. CGI would appear to be the strictest criterion, even though in the original protocol it was even stricter ("very much better")?

    Further requiring no longer meeting CDC criteria for CFS and/or no longer meeting London criteria v2 for ME does almost nothing to the recovery rates and seems superfluous now anyway, because Oxford is the broader definition, and those meeting CDC or London criteria but not Oxford at baseline were excluded. Also, they did not use the CDC criteria properly anyway in their definition for recovery (only required symptoms to be present or absent for 1 week instead of 6 months).

    In the original protocol, there was also a "positive outcome" of either <=3/11 absolute score or 50% relative reduction in fatigue, or either >=75 absolute score or 50% relative increase in physical function. These were the weakest thresholds of improvement for fatigue and physical function in their original protocol, and chosen to be a significant difference from the entry criteria. PACE originally expected 60% of the CBT group to reach this threshold. It is ironic that the highest thresholds for these measures in their post-hoc definition for complete recovery are lower than their lowest thresholds for these measures in their original protocol. They were confident enough back in 2007 to use much stricter figures. This was abandoned and even FOI requests have failed to secure this data. What happened?

    Why the composite criteria for recovery is not "conservative"

    Using a spectrum of increasingly strict composite criteria for recovery is fine, but their strictest criteria are not good enough and fall far short of the original protocol.

    Although some would have improved more, it was possible to be completely "recovered" by improving one increment in physical function (5/100 points), and (apparently) improving one increment in fatigue score (1/11, even though it may have worsened overall, as per the above explanation under "Bimodal scoring for CFQ ..."), as long as you also scored feeling "much better" on the CGI scale. Their post-hoc threshold for recovered fatigue can be regarded as abnormal or excessive fatigue in the original protocol. Data on employment, welfare, walking distance test, etc, suggest no significant objective improvements. Coincidentally, this data was ignored. PACE was non-blinded with different levels of encouragement and optimism between groups, and the NNT of 7 is within range of a small placebo response. APT was not the best representation of pacing, and the SMC group contained elements of pacing.

    The vast majority of healthy age-matched controls to middle-aged CFS patients would be scoring 90-100. PACE Trial participants were on average 39 years old at followup, and were ill for an average of 3 years at baseline. Almost all of them should be scoring 90-100 if recovered, not 60 or 70 or so. A score of 70 means you have anything between some limitations on most, and major limitations on three, of the ten questions asked about physical function.
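
    As a rough check on what a score of 70 can mean, here is a small sketch. It assumes the standard SF-36 physical function scoring (ten items, each worth 10 points for "not limited", 5 for "limited a little" and 0 for "limited a lot"); the function name is just for illustration.

```python
# Assumes standard SF-36 physical function scoring: ten items, each worth
# 10 points ("not limited"), 5 ("limited a little") or 0 ("limited a lot").

def ways_to_score(target, items=10):
    """Return every (limited_a_lot, limited_a_little, not_limited) split that totals `target`."""
    combos = []
    for a_lot in range(items + 1):
        for a_little in range(items + 1 - a_lot):
            not_limited = items - a_lot - a_little
            if 10 * not_limited + 5 * a_little == target:
                combos.append((a_lot, a_little, not_limited))
    return combos

for combo in ways_to_score(70):
    print(combo)
```

    Under that assumption, 70 ranges from six items "limited a little" with none "limited a lot", through to three items "limited a lot" with the other seven unaffected.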

    Why aren't you naughty recalcitrants ditching pacing and celebrating the great White et al hope of CBT/GET?
    Simon and Dolphin like this.
  4. biophile

    biophile Places I'd rather be.

    Messages:
    1,350
    Likes:
    3,993
    Repeat of the "normal range" stunt accompanied with repeat of the working age population "descriptive error"?

    Who remembers that White et al blundered into labelling a general population a working age population back in their 2011 paper in the Lancet, and later admitted to it in their authors' reply? It appears that they have made the same blunder again!

    White et al claim that >=85 points in physical function would exclude "approximately half" of a working age population, then proceed with giving data for a general population from Bowling et al which have a mean score close to 85. Why was a working age population even mentioned at all when they did not use one? The mean(SD) score of the working age population from Bowling et al is about 90(18). The histogram demonstrates that most people in the general population are scoring 95 or 100 points (100 being the ceiling or top box score). Furthermore, unpublished data (personal communication) suggests ~72% in this same general population are scoring >= 85. This percentage would be higher in a working age population, and much higher in a healthy age-matched population (eg 95%).

    So how can >=85 exclude "approximately half" of a working age population when it only excludes about 1/4 of a looser general population which included the elderly and those reporting illness? It seems that White et al were unaware that the dataset is heavily skewed away from the mean, and sloppily assumed that the mean (average score) and median (middle score) were the same. Why else would they be assuming that 85 is around the 50th percentile (when it's actually much closer to the first quartile, or 25th percentile, which is actually used in some studies as a threshold)?
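
    A minimal sketch of the mean-versus-median point. The scores below are made up for a hypothetical 20-person population with a ceiling at 100; they are NOT the Bowling et al data and only mimic the skewed shape described above.

```python
import statistics

# Made-up SF-36 physical function scores for a hypothetical 20-person population.
# These are NOT the Bowling et al data; they only mimic a ceiling-heavy, skewed shape.
scores = [100] * 8 + [95] * 4 + [90] * 2 + [85, 70, 60, 45, 30, 10]

mean = statistics.mean(scores)
median = statistics.median(scores)
share_at_or_above_85 = sum(1 for s in scores if s >= 85) / len(scores)

print(f"mean = {mean}, median = {median}")
print(f"share scoring >= 85: {share_at_or_above_85:.0%}")
```

    With a distribution shaped like this, the mean sits well below the median, so a threshold near the mean excludes far fewer than half of the people.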

    White has previously co-authored a paper on recovery in CFS after CBT, which was even cited in the PACE recovery paper. It suggests he should be aware that physical function scores have a non-normal distribution. Also in that paper, the same mean minus 1 SD rule was used to give a physical function threshold of >=80, not >=60! Ironically, the PACE recovery paper claims to be more conservative than the previous paper on this issue (which is clearly false).
  5. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    I like lots of this but a couple I thought I'd highlight.

    Yes, good spot.

    The question was:
    (Apologies if this point has been made before)

    The percentages who satisfied the recovery criteria incl. not having Oxford criteria before and after this stage are:

    APT: 15% --> 8% (so 47% of all of them lost)
    CBT: 28% --> 22% (so 21% of all of them lost)
    GET: 28% --> 22% (so 21% of all of them lost)
    SMC: 14% --> 7% (so 50% of all of them lost)

    This could be, although we can't be sure, another example of reporting bias: a CBT or GET participant might say "much better" (and hence be counted in the recovered group) when an APT or SMC-only participant might put "a little better" (and not be counted in the recovered group) for a similar change.

    In case people are confused by the fatigue bit, for bimodal scoring having a symptom "more than usual" and "much more than usual" are counted the same. So you could have a few symptoms going from "more than usual" to "much more than usual" but that wouldn't score any different. Have one symptom go from "more than usual" to "no more than usual" and one could improve by 1 point (of 11) even though in reality your fatigue symptoms are in total worse.
    Simon, Valentijn and biophile like this.
  6. biophile

    biophile Places I'd rather be.

    Messages:
    1,350
    Likes:
    3,993
    I deleted my posts which were commenting on the ones that Dolphin just scrapped.

    I will take another look and re-post later.
    Dolphin likes this.
  7. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    (This is probably restating other points to an extent)
    I can't imagine most funding bodies would approve funding to test whether GET, CBT, APT or SMC-alone brought about recovery based on the criteria used in this paper.

    Also, it's a pity the journal's peer reviewers didn't reflect on this.
    Purple likes this.
  8. biophile

    biophile Places I'd rather be.

    Messages:
    1,350
    Likes:
    3,993
    Individual criteria (Table 1a) ...

    Within normal range for CFQ fatigue (<=18/33 Likert) : APT 22%, CBT 41%, GET 33%, SMC 21%.

    Within normal range for SF-36 physical function (>=60/100) : APT 35%, CBT 52%, GET 53%, SMC 41%.

    Not meeting Oxford criteria for CFS : APT 43%, CBT 54%, GET 56%, SMC 41%.

    CGI score of 1 or 2 : APT 30%, CBT 40%, GET 40%, SMC 25%

    Composite criteria (Table 1b) ...

    Normal range in both CFQ and SF-36/PF : APT 16%, CBT 30%, GET 28%, SMC 15%

    And not meeting Oxford criteria for CFS : APT 15%, CBT 28%, GET 28%, SMC 14%.

    And CGI score of 1 or 2 : APT 8%, CBT 22%, GET 22%, SMC 7%.

    [edit: Oxford criteria requires both physical and mental fatigue, present >=50% of the time. Is there any PACE data on why patients no longer met Oxford criteria? FWIW, the basic inclusion criteria was:]
    That quote was from within the main section, "Domains, measures and criteria for defining recovery". This particular subsection on the Oxford criteria gives the impression that these thresholds were bolted ad hoc onto the Oxford criteria for the purposes of recovery, although it is not absolutely clear: at first I did not think so, but then I (tentatively) did. This would mean that no longer meeting Oxford criteria would also mean scoring better than the entry thresholds, i.e. <6/11 in fatigue and >65 in physical function. However, when looking at the figures in Table 1, it is difficult to know what happened.

    No longer meeting Oxford criteria is noticeably more common than being within normal range, and adding the former onto the latter barely changes the recovery rates at all, which oddly suggests that no longer meeting Oxford criteria is significantly less strict than being within normal range. Yet for physical function, >65 i.e. >=70 is definitely higher or more strict than >=60; although for fatigue, <6/11 i.e. <=5/11 is anywhere between 0-21 in Likert scoring (accurate translation is impossible without original answers), so is possibly higher or less strict than <=18/33.
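
    The 0-21 figure can be checked with a short sketch, assuming the scoring described earlier in the thread (items counted under bimodal scoring were answered 2 or 3 on the Likert coding, the rest 0 or 1). The function name is only illustrative.

```python
def likert_range_for_bimodal(bimodal, items=11):
    """Lowest and highest Likert totals (0-3 per item) compatible with a bimodal total."""
    # Items counted by bimodal scoring were answered 2 or 3; the rest 0 or 1.
    lowest = 2 * bimodal
    highest = 3 * bimodal + 1 * (items - bimodal)
    return lowest, highest

# Bimodal scores of 0-5, i.e. below the >=6/11 entry threshold.
ranges = [likert_range_for_bimodal(b) for b in range(6)]
overall = (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))
print(overall)  # (0, 21)
```

    So a bimodal score of <=5/11 can sit either side of the <=18/33 Likert threshold, which is why the two cannot be translated without the original answers.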

    Using GET as an example, 56% no longer met Oxford criteria for CFS ("no longer", since all did so at baseline, unlike for CDC and London) but only 28% or half of these were also within so-called normal range, suggesting that the remaining 28% or half of GET participants who no longer met Oxford criteria were still not within normal range.

    How can participants have clearly abnormal levels of fatigue and disability (even according to the questionably low thresholds used in PACE, not to mention the much stricter ones which should have been used) despite no longer meeting Oxford criteria for CFS even though they did 52 weeks ago?

    Dolphin later mentions that participants could have had OKish levels of one but not the other, explaining the differences. Table 1 suggests that "normal" fatigue only was less common than "normal" physical function only. He also previously raised the possibility that some participants may be failing to meet Oxford criteria at followup for reasons other than their levels of fatigue and physical function, for example fatigue no longer being their principal symptom (but may still be rather significant). This could mean that any thresholds bolted onto the Oxford criteria, if that indeed occurred, are negated from requirement anyway, but when added onto the normal range criteria, obviously such participants must have scored <=18/33 for fatigue and >=60/100 for physical function to remain classified as recovered. It is still not clear whether no longer meeting Oxford criteria also meant scoring <6/11 in fatigue and >65 in physical function.
    Simon likes this.
  9. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    The point is that some could have "good"/"lowish" fatigue or "good levels" of physical functioning/"lowish" levels of disability but not both and so wouldn't be in the 28% that are in the normal range for both. Those could make up most or all of the difference between 28% and 56% (if it was all the difference, parts (1) and (2) of the Oxford criteria are largely irrelevant at 12 months).
    Simon and biophile like this.
  10. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    Moving away from the points about specific therapies, I thought it would be interesting to put a number on the effect of the CGI criteria

    Before the CGI criteria (the last one to add on), 15 + 28 + 28 + 14 = 85 were being classed as in the recovered group.

    After the CGI criteria were used, 8 + 22 + 22 + 7 = 59 were being classed as in the recovered group.

    There was one missing score, although it's not clear if this person was in the 85 that were being classed as recovered or not.

    So 25/84 = 29.8% or 26/85 = 30.6% of those who were otherwise going to be classed as recovered missed out because they didn't put one of the top two responses above.
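
    The arithmetic can be reproduced with a short sketch. It uses the rounded percentages quoted in this thread and simply adds them across the four arms, which treats the arms as equal in size (an assumption, since head counts are not given here).

```python
# Percentages classed as recovered before and after adding the CGI criterion,
# as quoted in this thread (rounded figures, summed across arms as an approximation).
before = {"APT": 15, "CBT": 28, "GET": 28, "SMC": 14}
after = {"APT": 8, "CBT": 22, "GET": 22, "SMC": 7}

total_before = sum(before.values())
total_after = sum(after.values())
lost = total_before - total_after

# Upper figure: the one missing CGI score was NOT among those otherwise recovered.
print(f"{lost}/{total_before} = {lost / total_before:.1%}")
# Lower figure: the one missing CGI score WAS among those otherwise recovered.
print(f"{lost - 1}/{total_before - 1} = {(lost - 1) / (total_before - 1):.1%}")
```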

    As has been highlighted, this suggests that the other parts of the recovery criteria are too lax: if you're truly recovered, you shouldn't really be answering one of the bottom five responses to that question i.e. imagine you had a really tough "recovery criteria": all or virtually all of the people satisfying that would put "Very much better" or "much better".

    Also, one could easily imagine that if it was just restricted to a CGI response of 1, a lot of the remaining 59 would be knocked out of recovery.
    Valentijn likes this.
  11. Firestormm

    Firestormm Senior Member

    Messages:
    5,817
    Likes:
    5,919
    Cornwall England
    Morning,

    I think I would agree. Jeff's comment appears in keeping with what has been said on this thread about those aspects of the results and methodology.

    What did you think of Chris's reply:

    We are, like it or not, at a distinct disadvantage in the absence of any bio-markers or 'test' for this condition or even variants of it. So long as it can be demonstrated (and not necessarily objectively) that patients no longer 'meet' the entry criteria used to diagnose CFS/ME then it will continue to be utilised as a measure of success (or not).

    Of course the authors had to acknowledge that as CFS/ME is widely accepted as a 'fluctuating condition' even their assessment of 'recovery' could be time-sensitive. And this acknowledgement will be seen as perfectly reasonable. What this means is of course that a patient deemed 'recovered' one day can be diagnosed again the next.

    I think any response to the authors will at least need to address this 'failing' in diagnostic capability. And I don't think we can always argue that the specific criteria used are at fault. This can be taken advantage of or be seen as a disadvantage whether we are trying to assess Ampligen, Rituximab or indeed GET and CBT.

    It is a universal failing in my own mind and whilst we can rightly critique the recovery methodology, and lack of objectivity etc. we do need I think to pay fair attention to the other methods used. Otherwise any reply from the authors will include a similar comment to Chris's. If you see what I mean?
  12. user9876

    user9876 Senior Member

    Messages:
    684
    Likes:
    1,549
    They hide behind layers of abstraction within the paper where most readers do not go down a layer and try to understand what is actually going on. So they use scales that are not scales, they use parametric statistics when it is inappropriate and they use unreliable diagnostic criteria.

    Look at the Oxford criteria: a number of things suggest it is not a reliable test. Firstly, look at the number of people in all groups no longer meeting the criteria. It's above 40% for all groups, yet improvements in other measures are small. This should be a red flag to suggest that something might be wrong with the test. The Oxford test also ruled out a largish number of patients from entry - hence perhaps it just doesn't describe patients very well.

    Then you need to look at how the test was conducted. It was an unblinded test. The question then is how many people were borderline on the Oxford test to begin with. Have minor shifts caused movement over a threshold, with further bias being introduced due to not being blinded?

    The actual test doesn't just look at fatigue but says fatigue must be the most significant symptom. So if a patient describes pain as the most significant symptom (don't forget fibromyalgia patients were included) then they would no longer count. If patients are told you are recovered you are no longer fatigued, perhaps they describe their being ill in different terms. I'm not convinced that fatigue is a good description - I'm reminded of Toni Bernhard's article "I'm ill not tired". One of the big issues with the Oxford test would hence be a lack of being specific about what fatigue means and hence giving the ability to reinterpret symptoms under different names. For example a patient might say that they are fatigued for less than 50% of the time but that they seem to get a lot of bugs (not including this in their 50%).

    So the test is just crap. It's not well defined, it's not a good representation of the illness and it lacks accuracy. So it's not a good basis to judge recovery. However, if you read the paper at face value it just says no longer diagnosed!
    Svenja, Dolphin and Valentijn like this.
  13. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    It is very debatable that a change by a score that is less than the clinically useful difference (which was calculated to be 8 points for the SF-36 PF) should be sufficient to count as "recovery".

    Similarly scores such as 70 on the SF-36 PF shouldn't count as recovery.

    For this and lots of other reasons (I'm not going to list them all now), the results of the study should not be seen as showing recovery.

    It should be recognised for what it is, spin and hype.
  14. Graham

    Graham Senior Moment

    Messages:
    776
    Likes:
    1,831
    Sussex, UK
    Sorry to ask a very basic question, but they have removed my brain and replaced it with a bowl of porridge. When you say a score of 85 on the sf-36, I know it started off in the protocol as a score of 75 being a target (although they didn't specifically mention recovery as such as far as I can tell), and the entry criteria was set at being less than 75. Then there was a period of "negotiation" which ran along the lines of a target being suggested by the overseers that would be 10 points above the entry level, ending up with an agreed entry level of 60 or less, and a presumed target of 75 which never occurred because it was switched to 60. I think that's right.

    So, is it fair to quote 85 as a target, or is the original 75 more appropriate as it was their original bid? And where do they actually mention "recovery" rather than just targets?
  15. Enid

    Enid Senior Member

    Messages:
    3,309
    Likes:
    838
    UK
    PACE and recovery - someone must be joking here !!!
  16. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    I think you might be mixing up the positive outcome (which mentioned 75), and recovery (which mentioned 85) which were separate outcome measures in the published protocol.

    The PACE Trial Identifier http://bit.ly/14A0T3z (prepared years before the official protocol) didn't give a definition for recovery.
  17. In Vitro Infidelium

    In Vitro Infidelium Guest

    Messages:
    646
    Likes:
    280
    But is it not the point that Firestormm makes in quoting the BS thread, that otherwise sceptically inclined people are not willing to take the charge of 'spin and hype' at face value? I accept that is massively frustrating, but 'should' rarely informs what people actually do.

    My own sense about this (I haven't had the energy to deconstruct the 'recovery' paper stats but they seem extremely difficult to expose as bogus) is that rather than identifying the PACE 'process' as 'spin and hype' - setting it in a context of "£7 million of public money spent on selling a dubious 'franchise' to the NHS" would be a more accessible presentation (though not of course acceptable at BS). The battleground now (at least in England) is within the Clinical Commissioning Groups - it is there that concern about 'mis-selling' of the PACE franchise is likely to evince a negative response. The CCG managers will be very keen on the PACE franchise statistics - recovery and removal from treatment lists will allow targets to be met. If the intention of M.E/CFS advocacy is to forestall progressive rolling out of the PACE franchise, then doubts will need to be raised with the CCG memberships, otherwise the CBT/GET model will be enthusiastically adopted by the new commissioners on the basis of target meeting statistics. Simply saying it's all 'spin and hype' will not be enough - identification of the intention of the spin and hype (selling the franchise) will be needed to give the message traction.

    IVI
  18. Graham

    Graham Senior Moment

    Messages:
    776
    Likes:
    1,831
    Sussex, UK
    Thanks Dolphin. You are right: I am confusing the identifier with the protocol. Anyone want some porridge? I feel really dull and out of it at the moment.

    So instead of complaining about spin and hype, how about congratulating them on their success? That after a dozen or so sessions of CBT they managed to persuade 1 in 7 patients (out of a group in which half had psychological problems, mainly depression) to change how they recorded the severity of their symptoms. Of course, with objective data showing no change, this was of course a paper exercise, but that's what the whole thing was about - increased exercise.
    Svenja likes this.
  19. Dolphin

    Dolphin Senior Member

    Messages:
    6,472
    Likes:
    4,755
    My response to Firestormm could have been more nuanced and detailed - I wrote it quickly and should probably have put more time in to it. I accept the point that it will be necessary to prove that it is spin and hype. But I thought it was a good example of spin (and hence hype) that a change of less than the clinically useful difference could allow recovery. And I and others are already clearly trying to identify other examples and have already given some examples.

    It may be difficult to convince some, perhaps many people, that the recovery paper is "spin and hype" but I don't think that is any reason not to do it.
    Valentijn likes this.
  20. user9876

    user9876 Senior Member

    Messages:
    684
    Likes:
    1,549
    Are the PACE team going to franchise their statistical techniques so that commissioning managers can claim they are meeting all the targets they have been set by lowering them and suggesting they are more conservative than previous targets?
    biophile likes this.
