
PACE Trial and PACE Trial Protocol

Discussion in 'Latest ME/CFS Research' started by Dolphin, May 12, 2010.

  1. oceanblue

    oceanblue Senior Member

    Response bias? The graph

    I plotted SF36 scores against 6MWT scores to see if that would throw some light on response bias. If response bias occurs, you would expect to see it at the end of the trial (52 weeks), when factors like wanting to support the therapist/doctor come into play, but not at baseline, when participants don't even know which therapy they will get.

    The graph shows a pretty consistent relationship between SF36 scores and 6MWT scores at baseline, shown by the blue trend line (and red squares). At 52 weeks there isn't such a clear pattern (green triangles), but the SF36 scores are higher than you would predict given the relationship between SF36 and 6MWT at baseline. It's as if the SF36 scores have been boosted a little - which could be the response bias.

    - WSAS is data for the 4 quartiles given in the Cella paper, using PACE baseline data for the Work and Social Adjustment Scale. It's the same basic data as PACE baseline, but chopped up a different way. NB: all these data points are group means (each group is around 150 people), not data for individual participants.

    [Attached graph: respbias.jpg - SF36 vs 6MWT at baseline and 52 weeks]

    This is all speculation though and comments welcome.
    ps happy to supply excel file with data and graph if anyone wants it.
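    The trend-line construction described above (an ordinary least-squares line through group means) can be sketched in a few lines of Python. The (6MWT, SF-36 PF) pairs below are invented group means for illustration only, not the actual PACE data:

    ```python
    # Fit a baseline trend line of SF-36 PF against 6MWT distance, then
    # measure how far a later group mean sits above that line - the kind of
    # 'boost' the post attributes to possible response bias.
    # All data points here are made up for illustration.
    from statistics import linear_regression

    mwt_6 = [300.0, 330.0, 360.0, 390.0]   # 6-minute walk distances (metres)
    sf36_pf = [33.0, 40.0, 47.0, 55.0]     # SF-36 physical function scores

    slope, intercept = linear_regression(mwt_6, sf36_pf)

    def predicted_sf36(distance_m: float) -> float:
        """Predict SF-36 PF from 6MWT distance using the baseline line."""
        return slope * distance_m + intercept

    # A hypothetical 52-week group mean of 58 at 375m: the positive residual
    # is the excess over what the baseline relationship predicts.
    residual = 58.0 - predicted_sf36(375.0)
    ```

    With real data the interesting question is whether the 52-week residuals are systematically positive, which is what the graph appears to show.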
  2. Esther12

    Esther12 Senior Member

    Thanks for that graph OB. It really makes the point well.
  3. anciendaze

    anciendaze Senior Member

    I think oceanblue makes a good case for response bias, as a subjective effect. This doesn't seem as significant to me as the fact that at least one therapist made a determined attempt to keep a patient in the trial, to the point of violating ethics and the protocol, while those running the trial tolerated some 30% of those "completing" the trial doing so without providing any objective data on improvement. How can you count a patient as completing without getting data from them? To my mind these were dropouts hidden under another name. Treat them as such, and any claimed improvement on objective measures loses its validity.

    Apparently, proposals for objective measures were only used to get funding, and were later dropped as inconvenient.

    As always, I feel the distribution of activity scores in the general population is only "approximately Gaussian" in the sense that 85 = 95 (i.e. the mean would have to equal the mode). If I treat that distribution as a spike and a slab, I see a normal (Gaussian) spike of relatively healthy people with a mean of 95 and an SD of about 10, plus a slab of aging or ill people uniformly distributed across the range. The reason this tapers off at the low end is that the data are based on people visiting doctors, and those below some limit were not able to do so. These two parameters were the basis for the published criteria for recovery, as well as all measures of statistical significance. This emperor has no clothes whatsoever.
  4. Dolphin

    Dolphin Senior Member

    I think you're thinking of the Chalder Fatigue Questionnaire data rather than the SF-36 norms, which were based on postal questionnaires and the like.

    The problem with the Chalder Fatigue Questionnaire population data was more that the people who didn't attend a GP in the previous 12 months weren't included, and that would likely be a particularly healthy group covering 20-30 (?) % of the population.
  5. anciendaze

    anciendaze Senior Member

    At this point it is likely my poor damaged brain is fried. The reasoning in this study resembles that used in old Star Trek episodes to dispatch unwanted robots. Personally, I'm very sure I was not talking about fatigue scores. The references they cited don't support the values used for mean and SD, unless mean and mode are quite different. However, I'll drop it as not worth the effort to understand.

    What do you think about the idea of patients who did not provide objective data for judging improvement being counted as completing the trial? Imagine running a race in which competitors could have either starting or finishing times without necessarily having both. (Oh, did I mention that they started separately?) You could then interview them to decide who won, based on their own opinions of their performance. I could imagine this crew nodding heads, and saying, "that sounds eminently reasonable."
  6. Dolphin

    Dolphin Senior Member

    I know you are generally talking about SF-36 PF scores. But I specifically replied to the bit where you say there is a problem with the population data (because I presume that is what you are talking about) as it is obtained from those visiting doctors.

    I remain convinced my point is correct. The reason I make such points is that I think it is important that we are exact in what we are saying. Many of us are writing to medical journals.

    However, you say you are very sure you were not talking about fatigue scores; if so, I will be interested in seeing the evidence.

    I have been a big proponent that this trial needed objective outcome measures. Indeed I remember being told on a list in 2003 the trial would be using actometers; then at some later stage, discovering they were dropped.

    Also, even if they had used actometers, they probably would have used them as secondary measures while I would have wanted them as a primary outcome measure. Also as part of the definition of recovery.

    It is interesting that they don't have 6MWDs for some groups; I have seen analyses called sensitivity analyses performed before, where they look at various options, e.g. "worst case scenario for missing values", "best case scenario", etc. A slight problem we have here is that a lot of the people in the other groups didn't do it either. A lot of people have somewhat disorganised lives or for whatever reason don't do everything they should, e.g. completing all the tests when taking part in a research trial. It probably can't all be put down to them being made worse.
  7. oceanblue

    oceanblue Senior Member

    Chalder Fatigue Scale normative data is suspect

    PACE set a threshold of 'normal' fatigue as a CFQ score of 18 or less, based on a mean of 14.2 and SD of 4.6, which were taken from a 2010 study, Measuring fatigue in clinical and community settings. But these figures may well not be representative of the working age population.

    The way they selected the participants is complex, and the underlying data was collected for a study published in 1994: Population based study of fatigue and psychological distress (T Pawlikowska, T Chalder, S R Hirsch, P Wallace, D J M Wright, S C Wessely), and this is where the holes start appearing.

    The crucial bit is point 4 if you're pressed for time.

    1. This does not appear to be a representative sample
    They mailed registered patients at several different types of practice, but made no attempt to match them to the population (vs Bowling, who showed the SF36 data was based on a cohort well-matched with census data).
    - They restricted it to patients aged 18-45, but didn't explain why (and this restriction is likely to bias towards healthier individuals).

    2. Low response rates could lead to biased findings
    The response rate was only 48%. After investigating non-responders they found that many had moved (a known problem with GP practices, esp in urban areas) and estimated the response rate from those who received the questionnaire was 64%. The issue is, were people who were less well, or fatigued, more likely to respond to questionnaires about fatigue and health?

    By comparison, the Bowling SF36 data used face-to-face interviews, with a 78% response rate, and the SF36 questions were part of a much broader questionnaire including lifestyle and finance - so healthy people are less likely to ignore it as not applying to them. The Jenkinson SF36 figures had a 72% response rate, but this is of those mailed. Let's say 5% of the original list they mailed was incorrect/moved (quite a cautious assumption from my direct marketing experience), giving a net response rate of 76% - and again this was part of a larger survey including lifestyle, reducing the chance of healthy people opting out.

    ETA: however, this study suggests that ill people might not be more likely to respond, though it does relate to questions on "subjective well-being (overall life satisfaction and self-assessed health)" rather than just health.
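    The net response rate adjustment in point 2 is easy to reproduce. A minimal sketch, where the 5% bad-address figure is the assumption stated in the post, not a figure from the Jenkinson paper:

    ```python
    # Adjust a raw mail-survey response rate for undeliverable addresses:
    # among people who actually received the questionnaire, the effective
    # response rate is higher than the raw (responses / mailed) figure.

    def net_response_rate(raw_rate: float, bad_address_frac: float) -> float:
        """Response rate among people who actually received the survey."""
        return raw_rate / (1.0 - bad_address_frac)

    # Jenkinson: 72% raw, assuming 5% of addresses were incorrect/moved
    jenkinson_net = net_response_rate(0.72, 0.05)  # ~0.758, i.e. about 76%
    ```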

    3. Only participants who visited their GP were included
    To complicate things, Cella didn't use all the data from the original mailing. Instead, data was only used from respondents who subsequently visited their GP about a viral or other complaint and were selected as part of another study. So anyone who was very healthy and never visited their GP would not be included. Those who visited their GP more often would consequently have more chances to make this cohort than those who rarely visited their GP. All of this is likely to bias the sample against healthy individuals.

    Precise figures are not given for the original 1994 study, but from the figures they do give it looks like the mean is very close to 13.6, compared to the 14.2 quoted by Cella for his sub-group, suggesting at least some bias here.

    ETA I've found the fatigue case data for the Cella study (Postinfectious fatigue: prospective cohort study in primary care, p1335 under "stage 2 sample"): it gives 42.6% caseness, vs 38% for Pawlikowska, confirming the Cella cohort is more fatigued than the Pawlikowska one.

    4. Data from the original study indicate this is an unhealthy cohort
    According to Pawlikowska, 38% of patients had a score above the original Chalder bimodal cut-off of 3 (as used in the PACE protocol), and 18.3% of patients were substantially fatigued for 6 months or longer. Whoa, that looks unhealthy, esp as the paper quotes a 1990 paper that found only 10% of GP practice patients had fatigue for one month or more. I think there are some US studies indicating fatigue of over 6 months in the population is much less than 18%.

    So, I'm pretty fatigued now, and so are you if you've read this far. But it looks like PACE have been using highly unsuitable 'normative' data. Again.

    ETA: should mention that the Cella CFQ data is not normally distributed and therefore, like the SF36 data, is not suitable for use with parametric stats, such as the 'within 1 SD of the mean' formula used by PACE.
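    To illustrate the 'within 1 SD of the mean' problem: a short sketch that computes the PACE-style threshold from the quoted CFQ mean and SD, then shows on simulated skewed data (a lognormal stand-in, not real CFQ data) that the fraction of people the rule captures drifts away from the ~84% it covers for a genuinely Gaussian variable:

    ```python
    # PACE-style 'normal range' threshold: mean + 1 SD of the reference sample.
    # For a Gaussian variable, about 84.1% of people score at or below
    # mean + 1 SD; for skewed data the captured fraction can differ markedly.
    import random
    from statistics import mean, stdev

    cfq_mean, cfq_sd = 14.2, 4.6        # Cella/Chalder figures quoted above
    threshold = cfq_mean + cfq_sd       # 18.8, hence PACE's 'normal' cut of <= 18

    # Simulate a skewed variable (lognormal, as an arbitrary stand-in for
    # skewed CFQ-style scores) and see what fraction falls under mean + 1 SD.
    random.seed(42)
    skewed = [random.lognormvariate(0.0, 1.0) for _ in range(100_000)]
    cut = mean(skewed) + stdev(skewed)
    frac_below = sum(x <= cut for x in skewed) / len(skewed)
    # frac_below comes out noticeably above 84.1% here: the '1 SD' rule no
    # longer means what it means for normally distributed data.
    ```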
  8. Dolphin

    Dolphin Senior Member

    Thanks for doing the graph, oceanblue, it's great.

    It might be good if another graph was done where the line was extended. And if "error bars" for the response bias could be put in, that would be great, but it's not so important.
    The equation of the line is: SF-36 PF = (0.24616 * 6MWD) - 41.15385

    I think the response bias in the SF-36 PF scores for GET may be similar to the APT and SMC response biases (but smaller than the CBT one) because the response bias may be incorporated in the 6MWD; the GET participants may be more motivated to show how fit they are now and willing to push themselves further. They also would have gone for more frequent continuous walks of 6 minutes' duration, so would probably be better at pacing themselves (not going too fast or too slow) to get the optimum distance out of the 6 minutes.

    Move the 6MWD to the left a bit and one can get a similar distance above the line. Of course, there is no reason the response bias for CBT and GET should be exactly the same.
  9. anciendaze

    anciendaze Senior Member

    I will concede the point about the source of the data for SF-36.

    However, there is a major problem with the health of the comparison population, and I am puzzled by a statement you made earlier. According to those running the study, CFS sufferers have no organic disease; we are merely suffering from "false illness beliefs" and deconditioning. In that case, it is wrong to compare our recovery process to that of people with serious conditions like heart failure or COPD. Likewise, there is no reason for us to accept the performance limitations of people twice our age. You can't have it both ways: either we have a real, serious, organic illness, or we do not.

    My point here is that researchers counted them as completing the study, without objective data. This makes the relative importance of objective measures in the thinking of those running the test obvious.

    What can be seen, in studies that have included Actometer data, is that patients displace activity. If they are exhausted after GET, they are less likely to do a 6MWD. The reasoning here is that GET got higher marks, not by causing a higher percentage to decline the test, but by making those who declined the test more likely to perform poorly if they had taken it. We don't know, and nobody really wanted us to know.

    There was a selection effect: close to half of those approached declined to participate or officially dropped out. We then have another 30% of those remaining not providing useful objective data. The end result is that the objective part of the claims can only apply to about 35% of those they intended to treat. Throw in major diagnostic confusion, and there is no clear idea of who, other than therapists, actually benefits.

    The end result is a kind of "you can't prove nuthin', copper" defense, which tells you the objective scientific data isn't worth much at all. This is especially telling when only group measures are reported. If the goal was to muddy the waters, this study would serve admirably.

    I sense you have an opinion this was not all that unusual for studies of psychiatric treatment. You may well be right. This is not a vote of confidence in the field.

    I know of one instance in which a repairman was checking out equipment used for ECT at a psychiatric hospital when one of the staff complimented him on the new equipment not causing the violent contractions seen with older models. He immediately checked, and found a safety plug had never been removed when it was unpacked. The new equipment had been in use for a year.
  10. oceanblue

    oceanblue Senior Member

    Your suggestion that GET is affected by response bias on the 6MWD is certainly possible, but I'm not sure it's any more plausible than GET having a small but real effect on walking capability. And I would have predicted that SMC would have a smaller response bias than APT, but that doesn't appear to be the case. There were only 5 sessions of SMC, with relatively low expectations and satisfaction, compared with 13 sessions of APT (plus 3 of SMC), with a strong therapeutic alliance and high satisfaction. APT is set up for response bias.

    It appears from the data that something is going on, but I'm struggling to see a clear, compelling explanation for what it is. Help welcome.

    I might have a crack at a deluxe graph in due course.
  11. Dolphin

    Dolphin Senior Member

    There is no reason it can't have been both: a real increase from GET plus a bit of response bias/training effect.
    For example, if the GET group had scored 352.88, they would have increased on the 6MWT by 40.879m (which is more than the other groups), but their SF-36 PF value of 57.7 would have exactly the same inflation as CBT (12.215) over the predicted value.

    The SF-36 PF questionnaire asks about people's limitations. The APT group may recognise their limitations rather than have an inflated view of what they can do, which might be why their bias is not bigger than the SMC group's.
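    The 'inflation over the predicted value' arithmetic can be re-derived from the trend-line equation quoted earlier in the thread. A minimal sketch; the group figures are the ones discussed in the post, but the computed results are my own arithmetic from the quoted equation (it comes out near, though not exactly at, the 12.215 quoted):

    ```python
    # Trend-line equation quoted in the thread:
    #   SF-36 PF = (0.24616 * 6MWD) - 41.15385
    SLOPE, INTERCEPT = 0.24616, -41.15385

    def predicted_pf(walk_distance_m: float) -> float:
        """SF-36 PF predicted from 6MWD by the baseline trend line."""
        return SLOPE * walk_distance_m + INTERCEPT

    def inflation(observed_pf: float, walk_distance_m: float) -> float:
        """How far an observed SF-36 PF score sits above the baseline line."""
        return observed_pf - predicted_pf(walk_distance_m)

    # Hypothetical GET group mean of 57.7 at 352.88 m, as discussed above:
    get_inflation = inflation(57.7, 352.88)   # roughly 12 points above the line
    ```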
  12. oceanblue

    oceanblue Senior Member

    It's possible, but I'm not convinced that the SMC group should have an inflated view of what they can do either - and the APT group still had all those extra sessions and a strong relationship with the therapist. The problem with all of this is that there are so many 'ifs' and 'buts' that while a plausible case can be made, so can a plausible counter-case. I'd love to use this info in a wider context, e.g. a letter to a journal, but I don't think the evidence we have is clear enough as things stand.
  13. Dolphin

    Dolphin Senior Member

    It would be interesting to get/watch out for baseline data from other studies and see whether the same equation holds, although I don't think the 6-minute walking test has been used in many ME/CFS trials. Perhaps it was in the Jason et al. (2007) study.
  14. oceanblue

    oceanblue Senior Member

    Wessely talk featuring PACE trial

    "Health in mind and body: bridging the gap"
    http://www.foundation.org.uk/events/audios/audiopdf.htm?e=440&s=1200

    20 minute audio and slides, PACE bit starts about 10 mins in.

    Some bits I noted:
    - He said CFS wasn't down to personality
    - but perpetuation was down to behavioural and psychological factors

    - described PACE as the 'final definitive trial' and as 'one of the most beautiful behavioural medicine trials we have ever seen'. He showed the timeline fatigue graph (fig 2A in the paper) and called it a 'good result: we have improved - haven't cured - the physical health, the psychological health, the functioning etc of a large number of people'. I'm not sure that quite squares with what Michael Sharpe said in describing the results: that they needed to treat 7 people with CBT or GET for one to improve by a moderate amount.

    Enjoy.
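    The 'treat 7 people for one to improve' figure attributed to Michael Sharpe is a number needed to treat (NNT): the reciprocal of the absolute difference in improvement rates. A minimal sketch with illustrative placeholder rates, not the actual PACE percentages:

    ```python
    # Number needed to treat: how many patients must receive the treatment
    # (rather than the comparator) for one extra patient to improve.

    def nnt(treated_rate: float, control_rate: float) -> float:
        """NNT = 1 / absolute difference in improvement rates."""
        return 1.0 / (treated_rate - control_rate)

    # Illustrative only: ~60% improving with CBT/GET vs ~45% with SMC alone
    example_nnt = nnt(0.60, 0.45)   # ~6.7, i.e. about 7 patients per extra improver
    ```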
  15. Esther12

    Esther12 Senior Member

    Thanks OB. Sometimes I get the impression that Wessely realises he's close to being a quack... not that time. If I was a naive medical student, I'd have been convinced by him.
  16. Sean

    Sean Senior Member

    Beauty is in the eye of the beholder.
  17. Angela Kennedy

    Angela Kennedy

    Yes, it is disturbing to see Wessely resorting to the word 'beautiful' as a rhetorical device to defend PACE!

    PS - actually, what am I saying? It's ultimately disturbing to see him describe this as the 'final definitive trial'. THAT is wholly irresponsible.
  18. Bob

    Bob

    I was actually surprised to see that Wessely is promoting this trial as if he is proud of it.
    It shouldn't have surprised me, of course.
    But I thought that the authors were actually quite taken aback by how horrendously bad the results were, such that they had to fiddle with the protocol and only selectively report the results.
    I suppose that any £5m government-funded study into psychological 'treatments' is beautiful in his eyes, especially when he can claim that it was a successful study.
    Now I guess it's the job of our community to make sure that everyone else knows that this trial was not successful, and that the results are not relevant to the ME community. Not an easy task, but a crucial one, in my opinion.
  19. Angela Kennedy

    Angela Kennedy

    I think you're right Bob.
  20. anciendaze

    anciendaze Senior Member

    I think that tells you not to hold your breath for any objective results in follow up. He clearly feels that showing you can influence opinions by talking to people for a year is the ne plus ultra of scientific research.
