
PACE Trial statistical analysis plan

Discussion in 'Latest ME/CFS Research' started by biophile, Nov 16, 2013.

  1. biophile

    biophile Places I'd rather be.

    Messages:
    1,387
    Likes:
    4,585
    A randomised trial of adaptive pacing therapy, cognitive behaviour therapy, graded exercise, and specialist medical care for chronic fatigue syndrome (PACE): statistical analysis plan.

    Walwyn R, Potts L, McCrone P, Johnson AL, Decesare JC, Baber H, Goldsmith K, Sharpe M, Chalder T, White PD.

    Trials. 2013 Nov 13;14(1):386. [Epub ahead of print]

    doi:10.1186/1745-6215-14-386

    Abstract (provisional)

    BACKGROUND: The publication of protocols by medical journals is increasingly becoming an accepted means for promoting good quality research and maximising transparency. Recently, Finfer and Bellomo have suggested the publication of statistical analysis plans (SAPs). The aim of this paper is to make public and to report in detail the planned analyses that were approved by the Trial Steering Committee in May 2010 for the principal papers of the PACE (Pacing, graded Activity, and Cognitive behaviour therapy: a randomised Evaluation) trial, a treatment trial for chronic fatigue syndrome. It illustrates planned analyses of a complex intervention trial that allows for the impact of clustering by care providers, where multiple care-providers are present for each patient in some but not all arms of the trial.

    RESULTS: The trial design, objectives and data collection are reported. Considerations relating to blinding, samples, adherence to the protocol, stratification, centre and other clustering effects, missing data, multiplicity and compliance are described. Descriptive, interim and final analyses of the primary and secondary outcomes are then outlined.

    CONCLUSIONS: This SAP maximises transparency, providing a record of all planned analyses, and it may be a resource for those who are developing SAPs, acting as an illustrative example for teaching and methodological research. It is not the sum of the statistical analysis sections of the principal papers, being completed well before individual papers were drafted. Trial registration: ISRCTN54285094 assigned 22 May 2003; First participant was randomised on 18 March 2005.

    PMID: 24225069 ( http://www.ncbi.nlm.nih.gov/pubmed/24225069 )

    http://www.trialsjournal.com/content/14/1/386/abstract

    (full text) http://www.trialsjournal.com/content/pdf/1745-6215-14-386.pdf
    Last edited: Nov 16, 2013
    Esther12, Valentijn and Snow Leopard like this.
  2. biophile

    biophile Places I'd rather be.

    Messages:
    1,387
    Likes:
    4,585
    This is 45 pages long, so I will probably not have the opportunity to review it fully anytime soon and I am in no urgent rush to do so, but I did have a very brief search to see if they explain themselves further about some of the most controversial changes. Despite talk of "maximising transparency" and this following statement in the 2011 Lancet paper ...

    "The statistical analysis plan was finalised, including changes to the original protocol, and was approved by the trial steering committee and the data monitoring and ethics committee before outcome data were examined."

    I could not find any details about the normal range in fatigue and physical function in the statistical analysis plan, nor on the revised recovery criteria. Wasn't the statistical analysis plan supposed to explain everything? Can anyone else find it? Maybe it really was post-hoc in the sense that it was tacked on after even the 'final' changes were made?

    As many of you already know, the normal range, in physical function in particular, was derived using questionable methods on inappropriate dataset(s), and the justification for the change was based on a schoolboy error in interpreting statistical data (i.e. confusing the mean score for the median score, which is inexcusable when considering the amount of professional scrutiny that PACE supposedly received before approval and publication).
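
A rough illustration of why the mean/median mix-up matters on a scale with a ceiling. This is only a sketch: the scores below are invented for the example (they are not PACE or population data) and simply mimic a 0-100 scale in 5-point steps where most people score near the top and a minority score very low.

```python
from statistics import mean, median, pstdev

# Hypothetical, invented scores on a 0-100 scale in 5-point steps.
# Most respondents sit at or near the ceiling, with a tail of low scores.
scores = [100] * 50 + [95] * 20 + [90] * 10 + [70] * 8 + [40] * 7 + [10] * 5

avg = mean(scores)        # pulled down by the low-scoring tail
mid = median(scores)      # the score that splits the sample in half
sd = pstdev(scores)       # population standard deviation
threshold = avg - sd      # a "mean minus 1 SD" style cut-off

# Share of the sample scoring at or below the mean
share_at_or_below_mean = sum(1 for s in scores if s <= avg) / len(scores)

print(f"mean={avg:.1f} median={mid} sd={sd:.1f} mean-1SD={threshold:.1f}")
print(f"share at or below the mean: {share_at_or_below_mean:.0%}")
```

In skewed data like this the median sits well above the mean, so a statement that is true of the median (half the sample scores at or below it) is not true of the mean.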

    I was hoping to see further details in the statistical analysis plan about this, but I guess now we will have to wait until their blunder is publicly exposed enough that PACE are forced to explain themselves in writing.

    For those of you who have criticized the rather small thresholds for clinical improvement, which were lowered dramatically from the original protocol, I did find this gem though:

    Based on PACE data too, the MCID for fatigue would be 1 single point out of 33! (3.8*0.3=1.14). [Edit: Perhaps it would be rounded up to 2, since it is not possible to score 1.14, and 1 would not technically meet MCID]. Also note that when the range is 0-100, the SF-36/PF scale is in 5 point increments, so 5 points for MCID is effectively a single point out of 20, the smallest change possible, just as 1 point is for the CFQ Likert scoring (0-33), a far cry from the original goalposts. Excluding the severely affected but having strict cut off points had the effect of lowering the baseline SD.
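
The arithmetic above can be sketched in a few lines of Python. The figures 3.8 and 0.3 are the ones quoted in this post; the function name and the rounding rule (round up to the next score that can actually be obtained) are assumptions made for illustration, not something taken from the statistical analysis plan.

```python
import math

def smallest_attainable_change(sd, fraction, step):
    """Return the raw threshold (fraction * sd) and the smallest change,
    in whole scale steps, that meets or exceeds it.
    Hypothetical helper for illustration only."""
    raw = sd * fraction
    steps_needed = math.ceil(raw / step)
    return raw, steps_needed * step

# CFQ Likert scoring (0-33) moves in 1-point steps.
raw, attainable = smallest_attainable_change(sd=3.8, fraction=0.3, step=1)
print(f"raw threshold = {raw:.2f}, smallest attainable change = {attainable}")

# SF-36 physical function runs 0-100 in 5-point steps,
# so a 5-point change is one step out of this many:
sf36_steps = (100 - 0) // 5
print(f"SF-36/PF steps across the scale: {sf36_steps}")
```

Rounding up rather than down reflects the point in the edit: a change of 1 is below 1.14, so the first score change that meets the threshold is 2.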

    And this for those who are interested in the distribution of the scores:

    Last edited: Nov 16, 2013
  3. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    This journal allows online comments. The threshold for online comments is a lot lower than published letters so hopefully some people will discuss some of the issues as comments.
    Valentijn likes this.
  4. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    Thanks biophile

    Small point: I'm not an expert on MCIDs and the like, but there is a chance one would not round down, i.e. the MCID might be any change >= 1.14, which in practice means 2. Not sure on this, as I say.
    biophile likes this.
  5. Sea

    Sea Senior Member

    Messages:
    723
    Likes:
    834
    NSW Australia
    So an improvement of 8 points on the SF-36 is clinically useful, but a serious deterioration is defined as a decline of at least 20 points at two consecutive assessments:

    "Safety outcomes are:

    1. Serious deterioration (primary) defined as one or more of the following up to 52 weeks:

    1. SF-36 physical function score diminishing by 20 or more points between baseline and

      any two consecutive assessment interviews."
    Last edited: Nov 16, 2013
    Esther12, Valentijn, Dolphin and 2 others like this.
  6. anciendaze

    anciendaze Senior Member

    Messages:
    871
    Likes:
    937
    Simple observation: moving the threshold by 5 points is not a significant change because the threshold is an arbitrary number; crossing the threshold by 5 points still constitutes recovery, an important goal of this research. Of course using original recovery criteria on those who entered the trial at the altered threshold would be wrong. Requests for data allowing anyone else to check that this did not happen are "vexatious".

    Anyone see an internal contradiction in this argument?

    I think part of the criticism of the actual numbers is irrelevant. The important point is the interpretation casual professional readers are likely to place on PACE data. If the population distribution were normal, being 1 SD from the mean would not be a serious problem. This could also be applied to patients with heart failure or COPD who fall in this same range. The only difference here is that "we know" these people have real medical problems, while "we know" ME/CFS patients have somatoform mental disorders.

    Likewise, ignoring age in comparing sample and population leads to an implication I have not seen mentioned. If we only go by medical judgment of physical condition there is no reason a 70 year-old in good physical condition should be receiving a pension. His/her performance on objective measures of physical function like that 6 minute walk could well be twice the score of many ME/CFS patients.

    The important confusion was in the study from the outset, ignoring the difference between the healthy and the ill, the young and the aged. Numbers are far less important here than perceptions. The extent to which those responsible for the trial have defended methods, maintained claims, and even extended them to define recovery, without clarifying important points tends to imply that deception was deliberate.

    To the extent that the trial was rigorous it was virtually meaningless. Moving goalposts by 5 points also implies that simply crossing such an arbitrary boundary does not constitute meaningful recovery. No stronger criteria have been suggested, nor has evidence been presented which might show the trial met these. Public relations and professional perceptions seem to have been the main target. Scientific methods were used to add a patina of objectivity to a fundamentally subjective exercise in which those running the trial tried hard to transfer their own biases to patients, largely failing in the process.

    While organizers may be prevented from disclosing private medical information about patients, the patients themselves remain free to report their participation in the trial, if they so choose. With hundreds involved we might expect to hear testimonials about "how PACE returned me to normal life".

    Am I missing something?
  7. user9876

    user9876 Senior Member

    Messages:
    759
    Likes:
    1,825
    I'm pretty sure that this hasn't been done. They also say they expect them to be normal, which I think has an interesting implication: that they expect all patients to act in the same way. I would have thought that having a treatment-effective group and a treatment-ineffective group would lead to a bimodal distribution.
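
A quick way to see the bimodal point is to simulate it. This is a minimal sketch with invented parameters (group sizes, means and spreads are assumptions, not trial data): one group whose change scores centre on 0 and another whose change scores centre on 6.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical change scores: non-responders centred on 0, responders on 6.
non_responders = [random.gauss(0, 1) for _ in range(1000)]
responders = [random.gauss(6, 1) for _ in range(1000)]
changes = non_responders + responders

def count_near(values, centre, half_width=0.5):
    """Count values within half_width of centre."""
    return sum(1 for v in values if abs(v - centre) <= half_width)

low_peak = count_near(changes, 0)   # around the non-responder mean
valley = count_near(changes, 3)     # midway between the two groups
high_peak = count_near(changes, 6)  # around the responder mean

print(low_peak, valley, high_peak)
```

With groups this far apart, far fewer scores fall midway between the two means than near either mean, which is the two-humped shape a single normal distribution would not show. If the groups overlapped more, the dip would be harder to see.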

    I've only skimmed the document but they don't seem to justify the use of the 'likert' scoring on the CFQ. The paper they quote tried to validate the bimodal scoring, which suggests to me that there has been no attempt to validate the scale using the 'likert' scoring. It's also worth noting that in the paper they quote they say something along the lines of the need to quote mental and physical fatigue separately due to the variance being in 2 major principal components.
  8. Esther12

    Esther12 Senior Member

    Messages:
    5,268
    Likes:
    5,478
    That stood out to me too.

    No data on 'serious' improvement, or 'clinically important' deterioration either.
  9. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
  10. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    • This file (11.68MB) has the wording of the various questionnaires used: http://evaluatingpace.phoenixrising.me/PACE_Protocol.pdf . It has a lot of useful information to better get to grips with the PACE trial and to find information e.g. figure on pages 52-54 shows the schedule of questionnaires used.
    Valentijn likes this.
  11. Simon

    Simon

    Messages:
    1,356
    Likes:
    4,292
    Monmouth, UK
    They said they would present more than that, and for secondary as well as primary outcomes:
    However, this is a stats analysis plan, and as the first step of any stats analysis is to plot out the data they may simply mean 'present' as in produce the plots for internal use as part of the analysis, as opposed to 'publish'. If so, I would have thought that was just the kind of information that would be swiftly released in response to an FOI request, if anyone was interested in doing so.
    Esther12, Dolphin and Valentijn like this.
  12. Valentijn

    Valentijn Activity Level: 3

    Messages:
    6,335
    Likes:
    9,097
    Amersfoort, Netherlands
    One criticism the instructor of the Coursera statistics course makes is of studies that don't include histograms. A lot of false trends get outed just by looking at the distribution of the pretty dots.
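
For anyone who wants to eyeball a distribution without plotting software, here is a minimal text-histogram sketch. The function name and the scores are made up for illustration; nothing here is PACE data.

```python
from collections import Counter

def text_histogram(scores, bin_width=10):
    """Return one line per bin, including empty bins between
    the lowest and highest occupied bin."""
    bins = Counter((s // bin_width) * bin_width for s in scores)
    lines = []
    for start in range(min(bins), max(bins) + bin_width, bin_width):
        label = f"{start:>3}-{start + bin_width - 1:<3}"
        lines.append(f"{label} | {'#' * bins[start]}")
    return lines

# Invented scores on a 0-100 scale
scores = [35, 40, 45, 45, 50, 55, 55, 60, 60, 60, 65, 95, 100]
for line in text_histogram(scores):
    print(line)
```

Printing the empty bins as well is deliberate: gaps and isolated clusters are exactly the kind of thing a summary mean and SD would hide.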
  13. Simon

    Simon

    Messages:
    1,356
    Likes:
    4,292
    Monmouth, UK
    Basic commentary on the Stats Analysis Plan

    Mmm, this was a good read. Thought I'd post up a few of my notes on this.

    The good
    First off, they published the plan, which is rare in clinical trials. Second, they did some very sophisticated analysis to take account of variables that might affect results. So when looking at the differences between the different groups they adjusted for other factors including treatment centre, criteria (Oxford, CDC and 'London') as well as adjusting for the baseline score (i.e. taking into account how disabled/fatigued someone was at the start of the trial):
    I won't go into the detail (because I don't understand it all :)) but here's an example of the kind of fancy stuff they did:
    You get the idea.

    Transparency issues
    The paper makes many excellent points about the value of transparency:
    They neglect to mention that most of these noble ideals were not met in this case as the plan was published 3.5 years after it was agreed and 2.5 years after the main paper it relates to came out.

    'Recovery' statistical analysis not included
    Unfortunately, the highly-controversial PACE redefinition of Recovery was not covered by this plan, and instead seems to fall under 'exploratory' analysis: defining Recovery should never be exploratory; it really isn't that difficult and you shouldn't have to look at your results before you can know what 'recovery' is:
    Think I'll take a break at this stage.
    biophile, Esther12 and Dolphin like this.
  14. Simon

    Simon

    Messages:
    1,356
    Likes:
    4,292
    Monmouth, UK
    Each post even more exciting than the last! Brace yourselves...

    Another good thing laid out in the stats plan was the intention to check that the changes seen in the primary outcomes of fatigue and self-rated function were matched by changes in secondary outcomes such as depression and the 6-minute walking test.

    The plan says all of these secondary outcomes will be run through the main analysis used for the primary outcomes (tweaked as necessary due to different types of data - continuous or categorical).
    Which is very thorough. Except they don't seem to have done it.

    Going back to the objective above:
    3. Are the differences across interventions in the primary outcomes associated with similar differences in secondary outcomes?
    We know that for the sole objective outcome reported, the 6-minute Walking Test, the differences were not the same: CBT showed a 'moderate' gain in fatigue and function but no gain in walking distance relative to the control group. And GET showed a moderate gain in fatigue/function but only a small gain in walking distance.

    Given that it was a stated objective to look at whether or not each of the secondary outcomes backed up the primary outcomes, I'm surprised it doesn't seem to have been reported in any of the papers. And they are not just talking about a quick look at the basic figures (as I did above) but detailed statistical models. Where are they? I can only hope someone will show me what I've missed.
  15. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    What a lot of people in their published and unpublished letters (and comments in other fora) have done is contrast some of the secondary results with the primary outcome measures (and figures for recovery, etc.). And we have been met with quite a bit of scorn for doing something they seem to have planned to do themselves.

    It is frustrating they didn't publish the statistical plan on time. As they say themselves (as Simon highlighted):
  16. Simon

    Simon

    Messages:
    1,356
    Likes:
    4,292
    Monmouth, UK
    Could the disappointing overall results have been known, at least by statisticians, before unblinding?
    The PACE Trial protocol assumed large effects for CBT & GET on outcomes, and even results coded to maintain blinding would clearly show that NO groups had done very well and that the trial wasn't hitting its original targets. So if these reports for the DMC were prepared before the stats plan was finalised, as this implies - and if the reports included outcomes (which I assume they did) then descriptive stats, which include means, would show the trial was struggling. That knowledge could potentially have influenced the decision to change how primary outcomes were reported.

    This really hangs on a) preparation of blinded reports before the stats plan was finalised and b) outcomes being included in the blinded reports (blinded reports simply don't show which group is which).
    Sea, biophile and Dolphin like this.
  17. Esther12

    Esther12 Senior Member

    Messages:
    5,268
    Likes:
    5,478
    This may have had an impact upon the way in which new outcome measures were developed, but it was non-blinded treatment - anyone with any contact with patients would have known that CBT/GET were not resulting in the improvements claimed for them, even if they did not have access to the unblinded data.
    biophile likes this.
  18. biophile

    biophile Places I'd rather be.

    Messages:
    1,387
    Likes:
    4,585
    Good work Simon! All their talk of timely transparency and scrutiny and comparing outcome measures was amusing. Were the TSC at the local pub when PACE came up with the 'normal range' and 'recovery'? Their justification for changing the threshold of normal/recovered has already been exposed as dodgy, and so have some of their other claims, but now it also appears that their statistical analysis plan contradicts what they have done and claimed in later papers too.

    Indeed. I will bet everything I own that the so-called "recovered" participants' scores in physical function have a clearly different distribution compared to healthy age matched controls.
    Last edited: Nov 22, 2013
    Sea, Valentijn and Dolphin like this.
  19. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    Given the trial set out to be a definitive trial, all the more reason to pore over it, and not "move on" as Sean Lynch calls us to.
    Valentijn, biophile, Sean and 3 others like this.
  20. Dolphin

    Dolphin Senior Member

    Messages:
    6,718
    Likes:
    5,565
    Looking back quickly at McCrone et al. (2012) http://www.plosone.org/article/info:doi/10.1371/journal.pone.0040808 , I think virtually none of this was done.

    There doesn't seem to be much/anything reported for 24 weeks.

    Nor do I see anything on patient characteristics etc. (i.e. 3, 5 & 6).
    Esther12 likes this.
