
BMJ comments on new PACE trial data analysis

Sidereal
Senior Member
They're all ME/CFS "specialists", so they were working at CBT/GET clinics. The care from all of them would be pretty similar, in that it's a typical lack of care, with the possible exception of pain or sleep meds.

Specialism doesn't matter at all in this context, and is actually often just used to mislead patients. "See, it's real treatment being offered - it's from an immunologist!" But they don't mention that the immunologist, or neurologist, or other type of physician whole-heartedly supports the psychological theories and treatments for ME/CFS.

I'm not saying that the care from infectious disease people would have been any more biological than that from the psychiatrists or GPs, only that a hodgepodge of people with such different backgrounds are bound to differ in how they view the illness and in their attitudes towards patients. If we're assuming that CBT/GET worked via a placebo effect, then the contemptuous SMC received from "specialists" would have enhanced the difference between the groups.
 

user9876
Senior Member
Sorry, I think this is a very rambling set of comments.

It is all to do with making sure people do not cherry-pick data when the effect is around the level of the noise. If ME symptoms fluctuate in time and also shift from one thing to another, then a real benefit could be masked by chance shifts in symptoms. There is no 'guessing the right measure', really. You might be guessing which measure shows up because, by luck, there is less noise for that one, but it could just as easily have been a different measure. All this really comes out in the wash if the effect is big enough to stick out above the noise. Then it does not really matter which measure you pick - they should all show an effect.
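Something like the following toy simulation (all the numbers are invented, purely to illustrate the point) shows what I mean: with a small effect relative to the noise, only some measures happen to cross the significance line, so cherry-picking is possible; with a big effect, essentially every measure shows it.

# Toy sketch: several noisy outcome measures, same true effect on each.
# Small effect -> measures disagree by chance; big effect -> all agree.
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_measures = 60, 8

def significant_measures(effect_size, noise_sd=1.0, n_sims=2000):
    """Fraction of simulations in which each measure crosses p < 0.05 (one-sided z)."""
    hits = np.zeros(n_measures)
    for _ in range(n_sims):
        control = rng.normal(0.0, noise_sd, size=(n_patients, n_measures))
        treated = rng.normal(effect_size, noise_sd, size=(n_patients, n_measures))
        diff = treated.mean(axis=0) - control.mean(axis=0)
        se = np.sqrt(treated.var(axis=0, ddof=1) / n_patients
                     + control.var(axis=0, ddof=1) / n_patients)
        hits += (diff / se > 1.645)      # roughly alpha = 0.05, one-sided
    return hits / n_sims

print(significant_measures(0.15))   # small effect: measures disagree, cherry-picking risk
print(significant_measures(0.8))    # big effect: essentially all measures show it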

Cherry-picking does seem to be a problem, but to me a better solution than having a single end point is to enforce the publication of all data and to require justifications for measures that contradict. Being paranoid, I would actually insist that data are lodged with a third-party service as they are collected, along with provenance information, and then published when the end point of the trial is reached - perhaps along with any predefined analyses and QA testing results. If there are privacy concerns, there are differential privacy techniques for perturbing data while keeping the distributions, and hence certain analyses, valid.
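As a rough sketch of the kind of perturbation I mean (the Laplace mechanism is one standard differential privacy technique; the scale, epsilon, and score range below are purely illustrative, not from any real trial):

# Toy Laplace-mechanism sketch: mask individual lodged values while keeping
# aggregate statistics approximately usable. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)

def laplace_perturb(values, sensitivity, epsilon):
    """Add Laplace(sensitivity / epsilon) noise to each value before lodging it."""
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=values.shape)

fatigue_scores = rng.integers(0, 34, size=200).astype(float)   # invented 0-33 scale
lodged = laplace_perturb(fatigue_scores, sensitivity=33.0, epsilon=2.0)

print(fatigue_scores.mean(), lodged.mean())   # group means stay close; single records are masked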

I tend to think that it's not just that different measures should show an effect, but that the effects should correlate within each patient.
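As a toy illustration (simulated data, invented numbers): if patients' improvements on different measures share a common cause, the per-patient changes should correlate.

# Toy sketch: two measures driven by one latent per-patient response should
# show correlated per-patient changes if the improvement is real and shared.
import numpy as np

rng = np.random.default_rng(2)
n = 50
true_improvement = rng.normal(0.5, 0.5, size=n)            # latent per-patient response
fatigue_change = true_improvement + rng.normal(0, 0.3, size=n)
walking_change = true_improvement + rng.normal(0, 0.3, size=n)

r = np.corrcoef(fatigue_change, walking_change)[0, 1]
print(f"correlation of per-patient changes: {r:.2f}")       # high if responses are shared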


The other difficulty is that, although it might seem there should be one hypothesis beforehand, in practice there will be many. Fluge and Mella were testing the hypothesis that rituximab helped ME symptoms. They discovered that it might have been better to ask whether it helped ME symptoms on the time course typical of rituximab - which is a bit unexpected, because the response is delayed.

This is where I think carefully pre-stating possible hypotheses becomes important. For example, you could have one hypothesis that rituximab works by removing EBV, implying certain measures made at certain times, and a second hypothesis that rituximab works by removing antibodies, implying additional measures made at certain times. There may be other sub-hypotheses looking at different systems that could be affected by antibodies.
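Something like the following (all names, measures, and timepoints are invented, just to show the shape of it) is what I mean by pre-stating sub-hypotheses together with the measures and timings they predict:

# Hypothetical pre-registration sketch: each sub-hypothesis names the measures
# it predicts will move and when. Nothing here is taken from any real trial.
pre_registered_hypotheses = {
    "rituximab_clears_EBV": {
        "predicted_measures": ["EBV_viral_load", "fatigue_score"],
        "assessment_months": [1, 3, 6],
    },
    "rituximab_removes_autoantibodies": {
        "predicted_measures": ["autoantibody_titre", "fatigue_score", "step_count"],
        "assessment_months": [3, 6, 12],   # delayed, matching antibody turnover
    },
}

for name, spec in pre_registered_hypotheses.items():
    print(name, "->", spec["predicted_measures"], "at months", spec["assessment_months"])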

Shouldn't the measurement strategy come from the set of hypotheses for which we want supporting or non-supporting evidence? Of course, it could be that patients get better but there is no supporting evidence for any hypothesis, which would presumably demonstrate a glaring gap in our knowledge (or in the experimental design).

One thing that surprises me about the publication of protocols, and of the results papers afterwards, is the lack of others analyzing and criticizing the approach and results. I'm sure it happens within good research groups, but I don't see much of it published. I have worked in a computer security field where academics get great kudos for breaking or finding fault in protocols, and actually the more interesting work involves developing new mathematical techniques for the analysis - but that is a very simple situation by comparison. It seems to me that having others comment on, validate, and apply analytic tools to the initial setting out of the theory, and to how that ties to the measurement strategy, would help in refining experiments and trials. Such things only happen if funding and credit are given to the academics doing the analysis, and these days I guess that means work that counts in the university research assessment exercise.

I am also thinking there is a difference between, say, a phase 3 trial, where the effects and mechanism have already been explored, and more exploratory trials.

I quite see your puzzlement, and the reality is that your Bayesian approach can be, and is, applied post hoc to influence how seriously people take a result. But if you cannot pick a best endpoint and get a statistically significant result on it, all things being equal, it is a fair guide to the result being inconclusive at best - and needing repeating if taken seriously at all.

I guess this is my real problem. I don't really understand significance testing - it seems counter-intuitive to me when looking through the equations. I also worry about the various assumptions being made about the distributions of the data and of the noise. Too often noise is assumed to be Gaussian and independent, which often doesn't seem a good assumption. I worry that, particularly with fluctuating conditions, significance testing may not be sound. As I understand it, the theory assumes a random variable with a given but potentially unknown distribution; but does it still work if there is a dynamic process behind the random variable, and hence serial correlation, where the dynamic process is a function of both treatment and other environmental effects (which perhaps we could model as a series of stochastic variables)? Perhaps I should construct a series of simulations to test my concerns.
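A sketch of the kind of simulation I have in mind (a toy AR(1) model for within-patient dynamics, not a reanalysis of any real trial data): generate serially correlated symptom trajectories with no true treatment effect, take a trial-style endpoint, and check whether a naive test still holds its nominal 5% level under that model.

# Toy simulation: serially correlated symptom trajectories, no true effect.
# Endpoint = mean of the last 4 "assessments"; how often does a naive z test
# declare a difference at the 5% level?
import numpy as np

rng = np.random.default_rng(3)

def ar1_series(n_steps, phi=0.9, sd=1.0):
    """Symptom trajectory with serial correlation: x_t = phi * x_{t-1} + noise."""
    x = np.zeros(n_steps)
    for t in range(1, n_steps):
        x[t] = phi * x[t - 1] + rng.normal(0, sd)
    return x

def fake_trial(n_per_arm=30, n_steps=52):
    """Two arms, identical dynamics, no treatment effect; naive two-sided z test."""
    arm = lambda: np.array([ar1_series(n_steps)[-4:].mean() for _ in range(n_per_arm)])
    a, b = arm(), arm()
    se = np.sqrt(a.var(ddof=1) / n_per_arm + b.var(ddof=1) / n_per_arm)
    return abs(a.mean() - b.mean()) / se > 1.96

false_positive_rate = np.mean([fake_trial() for _ in range(1000)])
# Close to 0.05 in this particular toy model, because patients are still
# independent of each other; the worrying cases would be models where
# environmental effects are shared across patients or arms.
print("false positive rate:", false_positive_rate)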

I guess in terms of end points, my worry with PACE is that they have found that there exists an end point that shows a significant difference, but they have not shown that all the end points measured show a significant difference. As you say, if you cannot pick an end point and show a significant difference, then the result is inconclusive. However, I think I would argue that if you can pick a measure where the result is not significant but your hypothesis suggests it should be, then the results should also be considered inconclusive. If you don't have an adequate hypothesis to make predictions (i.e. just a single black-box yes/no test), then it seems very hard to interpret the existence of a null result, and so I would again say that the results are inconclusive.
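Just as a back-of-envelope illustration of the "there exists an end point" worry (assuming, unrealistically, independent endpoints, so this is only an upper-bound sketch):

# If k independent endpoints are each tested at the 5% level with no true
# effect, the chance that at least one comes out "significant" is 1 - 0.95**k.
# Real trial endpoints are correlated, so treat this as a rough upper bound.
for k in (1, 3, 5, 10):
    print(k, "endpoints -> chance of >=1 spurious hit:", round(1 - 0.95 ** k, 3))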