'Recovery' from chronic fatigue syndrome after treatments given in the PACE trial

Graham · Nov 18, 2013

I have been playing around with the data from Bowling on the SF-36 scores, and it is hard to bring the targets down as far as they did without using the scores of people aged 85+. It will be interesting to see how they wriggle out of it.

Esther12 · Nov 18, 2013

Graham said:
I have been playing around with the data from Bowling on the SF-36 scores, and it is hard to bring the targets down as far as they did without using the scores of people aged 85+. It will be interesting to see how they wriggle out of it.

They did use the scores of people aged 85+.

biophile · Nov 18, 2013

Re normative SF-36 data, not only did PACE use people who were elderly (including 85+) and those impaired with chronic illness, but they even claimed that the sample they used was "demographically representative".

Bob · Nov 18, 2013

biophile said:
Re normative SF-36 data, not only did PACE use people who were elderly (including 85+) and those impaired with chronic illness, but they even claimed that the sample they used was "demographically representative".

Are we certain that they did not extract demographically representative data from the dataset?

Esther12 · Nov 18, 2013

Bob said:
Are we certain that they did not extract demographically representative data from the dataset?

No, but if they did so they ended up with the exact same figures one would get from just using everyone, which would be a bit of a coincidence!

biophile · Nov 18, 2013

Bob said:
Are we certain that they did not extract demographically representative data from the dataset?

Yes, because they later admitted to it: "we calculated the proportions within the general population normal ranges...", instead of a "working age population" as initially claimed. A general population by definition includes the elderly and ill, there was certainly no selection or extraction of demographically representative data from that, and as Esther12 said, the figures are (entirely and undeniably) consistent with including the elderly and ill in their sample.

user9876 · Nov 19, 2013

biophile said:
Re normative SF-36 data, not only did PACE use people who were elderly (including 85+) and those impaired with chronic illness, but they even claimed that the sample they used was "demographically representative".

I assume the population they used in the ONS study was demographically representative of the overall population but the trial population was not. I don't think they ever claimed to have done age matching.

I seem to remember there are a few 'bad' data points in the ONS data which bring the results down for the overall healthy population. For example there was one person who could easily climb several flights of stairs but struggled with a single flight.

Graham · Nov 19, 2013

I have been playing around with the data from Bowling on the SF-36 scores, and it is hard to bring the targets down as far as they did without using the scores of people aged 85+.

What I didn't say was that it isn't clear how they combined the different proportions of people in the different age groups and whether they accounted for the 77:23 female:male bias. I have not yet been able to match PACE's figures. There were only a few aged 85+ in the sample; did they boost this proportion up to match population parameters? If so that was really cheeky, because if they were matching the results to population parameters, they should have known that they ought to have matched them to the PACE profile.

biophile · Nov 19, 2013

This is from the recovery paper : "The mean (S.D.) scores for a demographically representative English adult population were 86.3 (22.5) for males and 81.8 (25.7) for females (Bowling et al. 1999). We derived a mean (S.D.) score of 84 (24) for the whole sample, giving a normal range of 60 or above for physical function."

These exact same figures for males and females can be found in Table 3 of the Bowling et al. paper, and they are derived from the whole sample regardless of age or health status. Perhaps PACE had access to the ONS data too, but even just pooling the mean (SD) scores from both groups, using the standard formula for combining group means and standard deviations, results in a rounded mean (SD) of 84 (24).
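For anyone who wants to check the pooling arithmetic, here is a minimal sketch. It assumes males and females are weighted equally, which is an assumption on my part (the actual sample sizes are not quoted here), and the function name `pool_two_groups` is made up for illustration.

```python
import math

def pool_two_groups(mean_a, sd_a, mean_b, sd_b, weight_a=0.5):
    """Combine two groups' mean/SD into one overall mean/SD.

    weight_a is the ASSUMED share of group A in the combined sample.
    """
    weight_b = 1.0 - weight_a
    mean = weight_a * mean_a + weight_b * mean_b
    # Total variance = weighted within-group variance + between-group variance.
    within = weight_a * sd_a ** 2 + weight_b * sd_b ** 2
    between = weight_a * (mean_a - mean) ** 2 + weight_b * (mean_b - mean) ** 2
    return mean, math.sqrt(within + between)

# Figures quoted in the recovery paper: males 86.3 (22.5), females 81.8 (25.7).
mean, sd = pool_two_groups(86.3, 22.5, 81.8, 25.7)
print(round(mean), round(sd), round(mean) - round(sd))  # 84 24 60
```

With equal weights the unrounded values come out at about 84.05 and 24.26, so the rounded 84 (24) and the threshold of 60 follow without any age or health adjustment.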

In their authors' reply in the Lancet, PACE also described it as a general population and admitted that they did not use a working age population as they had previously claimed in the 2011 Lancet paper. It seems clear that they just used a general population which included the elderly and sick, without any further adjustments or matching.

Graham · Nov 19, 2013

Sorry Biophile. I'm being stupid as usual. I'm afraid I am going through a woolly-headed phase (or could it be a lifetime?). The mean of 84 and the standard deviation of 24 comes from simply combining the results of men and women in equal proportions, using the distribution of ages as follows: 16-24 9.5%, 25-34 19.3%, 35-44 14.9%, 45-54 13.8%, 55-64 13.8%, 65-74 13.1%, 75-84 13.8%, 85+ 1.7%. It didn't occur to me that with all their fancy statistics they would just add up the two totals! I should have known better.

Have you any idea how they got the figures of a mean of 85 and a s.d. of 15 that they used in the protocol? I can't get the mean and s.d. that low. (I've got the feeling that this has been covered lots of times before, but my mind has gone blank).

If it is any consolation, I am reading some old Terry Pratchett books that I read many years ago, and I can't remember ever reading them!

Esther12 · Nov 19, 2013

Have you any idea how they got the figures of a mean of 85 and a s.d. of 15 that they used in the protocol? I can't get the mean and s.d. that low. (I've got the feeling that this has been covered lots of times before, but my mind has gone blank).

We don't know where those figures came from, and it looked to me as if they were not exact.

Maybe they used a working age population for those figures, and rounded them up/down too?

Graham · Nov 19, 2013

I have tried using the Oxford data from Bowling, which is working age. I have tried using the overall data only selecting the working age, and I have tried using equal proportions for each age group in the working age range along with balancing the 77% female bias. None of it really matches up. In fact it feels as though the mean of 85 and the s.d. of 15 come from two separate considerations, unless, as is very likely, I am missing something.

Esther12 · Nov 19, 2013

Graham said:
I have tried using the Oxford data from Bowling, which is working age. I have tried using the overall data only selecting the working age, and I have tried using equal proportions for each age group in the working age range along with balancing the 77% female bias. None of it really matches up. In fact it feels as though the mean of 85 and the s.d. of 15 come from two separate considerations, unless, as is very likely, I am missing something.

Are there other data sets they could have used?

What about the means and SDs for the working age population in table 3? (I know that this is not the right way of working out the SD, but I wouldn't expect this to affect PACE researchers).

biophile · Nov 19, 2013

No worries Graham! Has happened to me often, especially in real life. I would be useless without note-taking.

This is what I think happened re 85(15)=70. It is just a generalization and an artifact arising from changes to the protocol's wording and possibly rounding off to the nearest 5 points.

The earlier trial identifier states that:

"We will count a score of 75 (out of a maximum of 100) or more as indicating normal function, this score being one standard deviation below the mean score (90) for the UK working age population.[29]"

This is citing Jenkinson et al (1993), normative data from a working age population only.

The 2007 protocol states that:

"A score of 70 is about one standard deviation below the mean score (about 85, depending on the study) for the UK adult population [51,52]."

Here they cite both Jenkinson et al (1993) and Bowling et al (1999), the latter being a general population.

The mean(SD) of working age populations in these studies is roughly 88/90(18) respectively, which does give a threshold of about 70 points. The mean of the general population from Bowling et al i.e. 84 is close to 85 but subtracting the standard deviation of 24 gives 60 not 70.
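The threshold arithmetic above can be sketched as follows. The helper name `normal_range_threshold` and the optional rounding to a 5-point step are only illustrative; the rounding is my guess at what happened, not something the authors have stated.

```python
def normal_range_threshold(mean, sd, step=None):
    """Return mean minus one SD, optionally rounded to the nearest `step`."""
    raw = mean - sd
    if step is None:
        return raw
    return step * round(raw / step)

# Working age figures of roughly 88 (18) and 90 (18):
print(normal_range_threshold(88, 18))          # 70
print(normal_range_threshold(90, 18, step=5))  # 70 (72 rounded to the nearest 5)

# General population figure from Bowling et al., 84 (24):
print(normal_range_threshold(84, 24))          # 60
```

So the working age figures land on about 70, while the general population figures give 60.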

At some stage between the beginning and end they went from just citing a working age population to both working and general to just a general population, but then apparently forgot about it when claiming in the 2011 Lancet paper they were still using a working age population.

Graham · Nov 19, 2013

Thanks Biophile. I hadn't thought about them rounding the results off to the nearest 5, which makes sense when you think of the scoring system. I saw that the Oxford data in table 4 gave an overall mean of 88.4, s.d. 17.9, and when I adjusted it for the 77:23 ratio of women to men, it was 87, 17.6, which of course still gives the target of 70.

So, if they thought it appropriate to round these off, why did they quote the change needed for clinical improvement on the SF-36 as 8, an impossible change? Silly me, I'm expecting consistency. But then, I guess the report did have a certain consistency: the consistency of a dense pea-souper.

Gijs · Nov 19, 2013

"The mean (S.D.) scores for a demographically representative English adult population were 86.3 (22.5) for males and 81.8 (25.7) for females (Bowling et al. 1999). We derived a mean (S.D.) score of 84 (24) for the whole sample, giving a normal range of 60 or above for physical function."

This score is not comparable with the study of White et al., as I stated earlier, and this study lacks a control group with a natural course. If White et al. want to compare their findings with Bowling, they need to adjust their data. At this point this is very bad science.

biophile · Nov 23, 2013

@Graham. Yes, the whole thing has been ridiculous.
There has been little more than spin, bizarre mistakes, and incompetence.

A working age population which included those with chronic illness was already a questionable source for healthy norms. Then they moved to a worse comparison population which also included the elderly. They applied mean minus SD to a heavily skewed dataset, which they claimed was a working age population, and later had to correct this so-called "descriptive error" while apparently maintaining that there was nothing wrong with what they had done.

Despite overlapping with entry criteria for disabling fatigue and/or significant disability, they framed it as "getting back to normal lives" at the Lancet press conference, then acted as if they shared no responsibility for journalists routinely confusing normal for recovery. They approved the Lancet editorial, which made false claims about the normal threshold being a "strict" criterion for recovery based on a "healthy" population (wrong), clearly contradicted the authors' own previous claims made about 60-65/100 points being reflective of major impairments, and was later ruled by the PCC to be misleading.

The only thing "demographically representative" about the normative dataset used was the fact it is from the UK. Demographics can also involve other measures such as age and illness, which they purposely ignored. Then they claimed they abandoned the original threshold because <insert schoolboy error of confusing a mean score for a median score, how on Earth could a professional statistician let that slide when the Bowling et al paper even warns us of heavy skew???>

They also falsely claimed that their threshold was more conservative than previous work, when in fact it was less conservative, and gave a false explanation for this despite White being a co-author of both papers. Although they previously admitted the normal range in physical function was post-hoc, this was omitted from the recovery paper, and now it even appears that this major change to the protocol, as well as the definition of recovery which is based on it, was not even approved or scrutinized.

The Lancet published the error and flaw despite "endless rounds of peer review". Psychological Medicine also failed to spot the multiple errors it published, and would not correct them even when they were pointed out, e.g. it actively refused to publish a letter outlining a major error in the reasoning for changing the threshold in the paper.

Meanwhile, PACE supporters insisted that the results were unquestionable, uber "rigorous", "robust", and "highest grade". Patients were derided for supposedly being too stupid to understand the research and for supposedly having strong conflicts of interest for being ill and opinionated, while blundering researchers with careers and reputations riding on the trial outcome were praised as being "utterly impartial" scientists. FOI requests for data promised by the authors themselves years ago or to help shed light on the changes are commonly dismissed or regarded as "vexatious". [edit to add Dolphin's observation: Letter writers to the Lancet were falsely accused of writing biased unscientific personal attacks against PACE, despite the letters making easily verifiable points and not discussing the authors on any personal level whatsoever.]

How long is the scandal going to continue before we are given some sort of fair concession? I'm just so sick and tired of the repeated blundering, the spin doctoring, the hole digging, the face saving, and the politics. Why can't those involved just admit mistakes have been made in the pursuit of this research, accept that CBT/GET has been hyped, and then help to make the most of the data that was collected? Our health is at stake here. A little frank honesty would have gone some way in healing the rift of mistrust that has developed between these researchers and a significant proportion of the ME/CFS community.

clusterfuck (plural clusterfucks)

(slang, chiefly military) A chaotic situation where everything seems to go wrong. It is often caused by incompetence, communication failure, or a complex environment.

http://en.wiktionary.org/wiki/clusterfuck

Snow Leopard · Nov 23, 2013

Yes, basically they picked post-hoc thresholds that would allow them to make claims that make their results look more impressive than they are. I'm not at all impressed that they continued with the lie despite the published letters in the Lancet showing the threshold was not valid.

Dolphin · Nov 23, 2013

Snow Leopard said:
Yes, basically they picked post-hoc thresholds that would allow them to make claims that make their results look more impressive than they are. I'm not at all impressed that they continued with the lie despite the published letters in the Lancet showing the threshold was not valid.

And the attitude of the Lancet editor could be said to have facilitated it all too, i.e. by criticising letter-writers. Similarly, the piece by Lloyd and Van der Meer did likewise.

kaffiend · Nov 23, 2013

biophile said:
"We will count a score of 75 (out of a maximum of 100) or more as indicating normal function, this score being one standard deviation below the mean score (90) for the UK working age population.[29]"

The SF-36 scores don't seem like something that would be normally distributed. Has anyone ever plotted the probability density function or done a test for normality? If they're not, parametric measures to define recovery such as standard deviation would be inappropriate without some transformation of the data.
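As a rough sketch of the kind of check I mean, the snippet below computes the mean, median, SD and skewness for a set of scores. The counts are invented purely for illustration (they are NOT Bowling's data); they just mimic a scale where most people sit at or near the 100-point ceiling with a long tail of low scores.

```python
import statistics

# Hypothetical score counts, invented purely for illustration (NOT real data):
# most people at or near the ceiling, with a long tail of low scores.
counts = {100: 40, 95: 20, 90: 12, 85: 8, 75: 6, 60: 5, 45: 4, 30: 3, 15: 2}
scores = [score for score, n in counts.items() for _ in range(n)]

mean = statistics.fmean(scores)
median = statistics.median(scores)
sd = statistics.pstdev(scores)

# Moment-based skewness: negative means a long tail toward low scores.
skewness = sum((x - mean) ** 3 for x in scores) / len(scores) / sd ** 3

print(f"mean={mean:.1f} median={median:.1f} sd={sd:.1f} skewness={skewness:.2f}")
print(f"mean - SD threshold = {mean - sd:.1f}")
```

In a distribution like this the mean sits below the median and the skewness is negative, so a "mean minus one SD" cut-off does not mean what it would for a normal distribution. With the real data, a formal test such as Shapiro-Wilk or simply a histogram would answer the question directly.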
