
New paper on FINE in PLOS: "Therapist Effects and the Impact of Early Therapeutic Alliance"

joshualevy

Senior Member
Messages
158
A further problem with the Chalder fatigue questionnaire is illustrated by the observation that the bimodal score and Likert score of 10 participants moved in opposite directions at consecutive assessments, i.e. one scoring system showed improvement whilst the other showed deterioration.

Now that's a red flag.

It's not a red flag, it's exactly what would be expected from a measurement change that did not impact results.

Obviously, any change of measurement technique will cause the measurements to change. The meaningful questions are: do they change a lot (i.e. enough to change the outcome of the study), and do they change enough to alter the certainty of the study (i.e. enough to change the p-values reported in the study)?

What you want when you change measurements is small changes up and small changes down, with no systematic change overall. So having some data points move up at one point and down at another (all changes small) is a good thing, not a bad thing, at least statistically speaking, for a change in measurement.

Think about it this way: when you change measurements, some data points will change so little you cannot even measure it. Those are said to not change. Other data points will move up (hopefully only a little), and others will move down. As a thought experiment, let's assume that 40% stay the same, 30% go up slightly, and 30% go down slightly. If you compare two data points, there is an 18% chance (2 × 30% × 30%) that one will go up and the other down. In his post, Sam says that about 10/149 patients had this up-down behavior, which is about 6.7%. My guess is they had even better stability: perhaps around 60% staying the same and 20% moving in each direction, which would predict about 2 × 20% × 20% = 8%.
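To make that concrete, here is a quick simulation sketch of the thought experiment (the 40/30/30 split and the assumption that the two scoring systems move independently are mine for illustration, not FINE data):

```python
# Simulate the thought experiment: each patient's score on each of two
# scoring systems independently stays the same (40%), moves up slightly
# (30%) or moves down slightly (30%). Count how often the two systems
# move in opposite directions for the same patient.
import random

random.seed(0)
TRIALS = 100_000  # simulated patients

def direction(p_same=0.40, p_up=0.30):
    """Draw -1 (down), 0 (no change) or +1 (up) for one scoring system."""
    r = random.random()
    if r < p_same:
        return 0
    return 1 if r < p_same + p_up else -1

opposite = sum(1 for _ in range(TRIALS)
               if direction() * direction() == -1)
print(opposite / TRIALS)  # ~0.18, i.e. 2 * 0.30 * 0.30
```

(In reality the bimodal and Likert scores are derived from the same answers, so they are positively dependent rather than independent, which would push the rate below the independent-case 18%.)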

If changing to the Likert measurement changed the study, then you would see one of two things: a statistically significant change in one of the reported outcomes of the study, or a change in the p-values reported in the study. Neither of these things was seen.
 

Esther12

Senior Member
Messages
13,774
It's not a red flag, it's exactly what would be expected from a measurement change that did not impact results.

But QMUL have claimed that the change from bimodal to Likert is like changing from inches to centimeters. This is another illustration of how misleading that claim is.

Obviously, any change of measurement technique will cause the measurements to change. The meaningful questions are: do they change a lot (i.e. enough to change the outcome of the study), and do they change enough to alter the certainty of the study (i.e. enough to change the p-values reported in the study)?

For FINE, changing to Likert allowed them to claim a significant improvement in fatigue (although the recent Cochrane review seems to question this, and I'm not sure who is right about that). For PACE, we do not know if things would change enough to alter the outcome of the study.
 

joshualevy

Senior Member
Messages
158
For FINE, changing to Likert allowed them to claim a significant improvement in fatigue

What data supports this? Are you claiming that the (unpublished) bimodal measurement did not show significant improvement, while the (published) Likert one did? If so, where are the numbers (for both the treated and untreated groups, so we can see the difference)?
 

Esther12

Senior Member
Messages
13,774
What data supports this? Are you claiming that the (unpublished) bimodal measurement did not show significant improvement, while the (published) Likert one did? If so, where are the numbers (for both the treated and untreated groups, so we can see the difference)?

Bimodal scoring was their prespecified primary outcome, and was reported:

At one year after finishing treatment (70 weeks), there were no statistically significant differences in fatigue or physical functioning between patients allocated to pragmatic rehabilitation and those on treatment as usual (-1.00, 95% CI -2.10 to +0.11; P=0.076 and +2.57, 95% CI -3.90 to +9.03; P=0.435).

http://www.bmj.com/content/340/bmj.c1777?rss=1

They've since been trying to use Likert scoring to claim a positive result, but a recent Cochrane publication claimed that there was no significant positive effect with Likert scoring either (I've not looked into the details of this discrepancy).

Here's the RR where they first released Likert scores: http://www.bmj.com/rapid-response/2011/11/02/fatigue-scale-0

The recent Cochrane CFS exercise therapy update includes a response to Tom Kindlon's comments, which claims that even using Likert scoring there was no positive result for FINE.
 

user9876

Senior Member
Messages
4,556
It's not a red flag, it's exactly what would be expected from a measurement change that did not impact results.

Obviously, any change of measurement technique will cause the measurements to change. The meaningful questions are: do they change a lot (i.e. enough to change the outcome of the study), and do they change enough to alter the certainty of the study (i.e. enough to change the p-values reported in the study)?

What you want when you change measurements is small changes up and small changes down, with no systematic change overall. So having some data points move up at one point and down at another (all changes small) is a good thing, not a bad thing, at least statistically speaking, for a change in measurement.

Think about it this way: when you change measurements, some data points will change so little you cannot even measure it. Those are said to not change. Other data points will move up (hopefully only a little), and others will move down. As a thought experiment, let's assume that 40% stay the same, 30% go up slightly, and 30% go down slightly. If you compare two data points, there is an 18% chance (2 × 30% × 30%) that one will go up and the other down. In his post, Sam says that about 10/149 patients had this up-down behavior, which is about 6.7%. My guess is they had even better stability: perhaps around 60% staying the same and 20% moving in each direction, which would predict about 2 × 20% × 20% = 8%.

If changing to the Likert measurement changed the study, then you would see one of two things: a statistically significant change in one of the reported outcomes of the study, or a change in the p-values reported in the study. Neither of these things was seen.

It should be a red flag.

The way the statistics are reported, using the mean and SD in both cases, suggests that both versions are believed to be interval scales, as that is the only situation in which a mean and SD should be used. This means they are assuming that the sum of answers across the set of questions is a linear proxy for the 'concept of fatigue'. It would also mean they are expecting the Likert measurements to represent additional degrees of accuracy (which is what PACE claimed as they changed their protocol), a bit like measuring in millimeters rather than centimeters. However, the data suggests that this is not the case - although we don't need the data to say so, as it is obvious from a structural analysis of the two scoring mechanisms. What the data says is that the two scales are different.

The consequence of this is that the two scores cannot both be interval-scale proxies that measure 'fatigue'. So it should be a red flag over the measurement systems used in the trial and the way results are quoted. Since the two marking schemes don't preserve ordering between them, it suggests that one or the other is simply a nominal scale for which only the mode should be quoted (but which one?)
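As a concrete sketch of the ordering point: on the 11-item Chalder scale, Likert scoring sums each answer as 0-3, while bimodal scoring recodes answers as 0,0,1,1 before summing. The two response patterns below are invented for illustration, but they show that the two schemes can rank the same pair of patients in opposite orders:

```python
# Two invented response patterns on the 11-item Chalder fatigue scale.
# Likert scoring sums the raw 0-3 answers; bimodal scoring recodes
# answers 0,1 -> 0 and 2,3 -> 1 before summing.

def likert(answers):
    return sum(answers)

def bimodal(answers):
    return sum(1 for a in answers if a >= 2)

patient_a = [3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0]  # four items rated 3
patient_b = [2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0]  # five items rated 2

print(likert(patient_a), bimodal(patient_a))  # 12 4
print(likert(patient_b), bimodal(patient_b))  # 10 5
# A scores higher than B on Likert (12 > 10) but lower on bimodal (4 < 5),
# so the two schemes do not preserve ordering between patients.
```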

Actually, given the different weightings of 'mental fatigue' and 'physical fatigue' that seem to exist in the structure of the questionnaire (i.e. as represented by a PCA of the data), there is an implied utility in the scale whereby physical fatigue is considered more important than mental fatigue.

In terms of p-values, it is important that those doing significance tests understand the consequences of the form of the data and how well it matches the assumptions made in deriving the significance tests.
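A small sketch of that last point, with made-up scores rather than trial data: a t-test treats the scores as interval data, while a rank-based test only assumes ordering:

```python
# Made-up ordinal fatigue scores for two groups (illustration only).
from scipy import stats

treated = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7]
control = [4, 5, 5, 6, 6, 6, 7, 7, 8, 9]

# The t-test assumes the scores are interval data (means are meaningful).
t_stat, p_t = stats.ttest_ind(treated, control)

# Mann-Whitney U uses only the ordering of the scores.
u_stat, p_u = stats.mannwhitneyu(treated, control, alternative="two-sided")

print(f"t-test p = {p_t:.3f}, Mann-Whitney p = {p_u:.3f}")
# If the scale is only ordinal, the rank-based test is the one whose
# assumptions actually hold.
```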
 

Esther12

Senior Member
Messages
13,774
I can't remember. They've referred to FINE as showing a positive result in presentations, but I think that most of their papers are vague/evasive on this.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
The discrepancy between the Cochrane and the BMJ rapid response outcomes may be because the BMJ comment was an informal publication and not audited?

This is what the two publications have for Likert at 70 weeks, for pragmatic rehabilitation vs treatment as usual.

Likert at 70 weeks

Cochrane. [Analysis 1.2]
A non-significant effect.
Mean difference:
-2.12 [-4.49, 0.25]

BMJ Rapid Response. [ http://www.bmj.com/rapid-response/2011/11/02/fatigue-scale-0 ]
A significant effect.
Effect estimate:
-2.55 [-4.99,-0.11]
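
As a back-of-envelope check on those two intervals (a sketch assuming the CIs are normal-approximation intervals, i.e. estimate ± 1.96 × SE):

```python
# Recover an approximate two-sided p-value from an estimate and its 95% CI,
# assuming the interval was computed as estimate +/- 1.96 * SE.
from math import erf, sqrt

def p_from_ci(estimate, lo, hi):
    se = (hi - lo) / (2 * 1.96)                          # back out the SE
    z = estimate / se                                    # z statistic
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # two-sided p

print(p_from_ci(-2.12, -4.49, 0.25))   # Cochrane: ~0.08 (not significant)
print(p_from_ci(-2.55, -4.99, -0.11))  # BMJ RR:   ~0.04 (significant)
```

So the two reports really do sit on opposite sides of the 0.05 line; the intervals are not just rounding variants of each other.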
 

Dolphin

Senior Member
Messages
17,567
New thread: FINE Trial team remove raw data file: pressure from PACE Trial team?
http://forums.phoenixrising.me/inde...ssure-from-pace-trial-team.44705/#post-726394


http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0157199

Correction: Therapist Effects and the Impact of Early Therapeutic Alliance on Symptomatic Outcome in Chronic Fatigue Syndrome
  • Lucy P. Goldsmith,
  • Graham Dunn,
  • Richard P. Bentall,
  • Shôn W. Lewis,
  • Alison J. Wearden

The dataset originally included as S1 Dataset was removed in consideration of possible restrictions for the public availability of the data related to the wording of the original consent form for the trial. Upon consultation with the authors’ university it has been established that the file may be publicly shared as it reports de-identified data. Please view S1 Dataset here.

The Data Availability statement for the article is revised to read: The authors have prepared a dataset that fulfills requirements in terms of anonymity and confidentiality of trial participants, and which contains only those variables which are relevant to the present study. Data are available as Supporting Information.

S1 Dataset. De-identified trial data.
doi:10.1371/journal.pone.0157199.s001

(DTA)

References
  1. Goldsmith LP, Dunn G, Bentall RP, Lewis SW, Wearden AJ (2015) Therapist Effects and the Impact of Early Therapeutic Alliance on Symptomatic Outcome in Chronic Fatigue Syndrome. PLoS ONE 10(12): e0144623. doi:10.1371/journal.pone.0144623. pmid:26657793
  2. Goldsmith LP, Dunn G, Bentall RP, Lewis SW, Wearden AJ (2016) Correction: Therapist Effects and the Impact of Early Therapeutic Alliance on Symptomatic Outcome in Chronic Fatigue Syndrome. PLoS ONE 11(5): e0156120. doi:10.1371/journal.pone.0156120. pmid:27191956
Citation: Goldsmith LP, Dunn G, Bentall RP, Lewis SW, Wearden AJ (2016) Correction: Therapist Effects and the Impact of Early Therapeutic Alliance on Symptomatic Outcome in Chronic Fatigue Syndrome. PLoS ONE 11(6): e0157199. doi:10.1371/journal.pone.0157199