• Welcome to Phoenix Rising!

    Created in 2008, Phoenix Rising is the largest and oldest forum dedicated to furthering the understanding of, and finding treatments for, complex chronic illnesses such as chronic fatigue syndrome (ME/CFS), fibromyalgia, long COVID, postural orthostatic tachycardia syndrome (POTS), mast cell activation syndrome (MCAS), and allied diseases.

Measures of outcome for trials and other studies

eafw

Senior Member
Messages
936
Location
UK
I think that there may be important logical reasons why the 'threshold' maths of ACR grades is more what is needed. What I am not sure of is why this should be the right maths.

Possibly because of the nature of the illness, that we have several systems gone wrong at once, and interplaying, so clearing more than one threshold is necessary to restore functioning in practice.

For example, if a person has muscle weakness and sensory problems and issues with balance then they will have difficulty walking down the street. So even if you get a big improvement in one of those (say the muscles) then low scores on the other two aspects are still sufficient to affect functioning and cause disability. That is, a 10% improvement on each will give a better outcome in practical terms than a 30% increase on just one and this will likely map better to the patient saying they feel better and actually being able to do more.
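To make the arithmetic concrete, here is a minimal sketch of that reasoning. It assumes a "weakest link" model in which walking capacity is capped by the lowest of the three component scores. That model, the function name and the 0-100 scores are all made up for illustration and are not a validated measure.

```python
def walking_capacity(muscle, sensation, balance):
    """Weakest-link assumption: capacity is capped by the worst component."""
    return min(muscle, sensation, balance)


# Made-up baseline scores on a hypothetical 0-100 scale.
baseline = {"muscle": 40.0, "sensation": 40.0, "balance": 40.0}

# Scenario A: a 10% improvement on each component.
spread = {name: score * 1.10 for name, score in baseline.items()}

# Scenario B: a 30% improvement on muscle strength only.
focused = dict(baseline, muscle=baseline["muscle"] * 1.30)

capacity_baseline = walking_capacity(**baseline)
capacity_spread = walking_capacity(**spread)
capacity_focused = walking_capacity(**focused)

print("baseline:", capacity_baseline)
print("10% on each:", capacity_spread)
print("30% on one:", capacity_focused)
```

Under this assumption the single large gain changes nothing, because the other two components still cap the result, while the three small gains lift the cap itself.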
 

eafw

Senior Member
Messages
936
Location
UK
Couple of other thoughts, ideally any test would be no more demanding than a person's usual activity levels so as not to worsen the situation and tests need to be done at both a "bad" and "good" time of day.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Thanks for the comments, @Bob. I have been thinking along the lines of some sort of threshold based composite score and I still think a 3 component score makes sense. So taking your points into account I am leaning towards something like this at the moment.

1. Does the person actually think they are better, all things considered. In RA all we do here is ask for a global self-rating in relation to a mark on a 10cm line at the beginning. If you start at 2, are you now around 5.5? There are no issues about whether the emphasis is wrong for the individual - they emphasise as they like. Beyond that there are no questionnaires, maybe because they run into all the problems discussed.

2. Is there physiological evidence to support the person's claim that they are better? (To reduce the chance they are just saying they are better to please you, and other stuff.) Tilt table seems to have its pros and cons. Maybe thinking tests should go in here (I hate 'cognitive').

3. Is there a change in real life ongoing activities of daily living that supports the claim of being better? Actometry/actigraphy seems to have a lot going for it here although it may need to be based on an iWatch7 app not yet invented to be sensitive and flexible enough.

If there is 50% improvement on all three then I think we begin to have an indication that a treatment can be seriously useful. And not all treatments need be judged in this way since things like analgesics do not aim to alter the overall process. But a 'real treatment for CFS' to quote a recent writer on the subject ought I think to hit the mark at least now and again.
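A rough sketch of how the "50% on all three" rule might be scored, for anyone who wants to play with it. Everything here is an assumption: the component names, the made-up values, and above all the definition of improvement as relative change from baseline on scales where higher is better. Other definitions are possible and would give different answers.

```python
def fractional_improvement(baseline, followup):
    """Relative change from baseline, for scores where higher means better."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a relative change")
    return (followup - baseline) / baseline


def is_responder(baseline, followup, threshold=0.5):
    """True only if every component improves by at least `threshold`."""
    return all(
        fractional_improvement(baseline[name], followup[name]) >= threshold
        for name in baseline
    )


# Hypothetical component names and made-up values.
before = {"global_self_rating_cm": 2.0, "physiological": 40.0, "daily_activity": 3000.0}
after_all = {"global_self_rating_cm": 5.5, "physiological": 62.0, "daily_activity": 4800.0}
after_two = {"global_self_rating_cm": 5.5, "physiological": 62.0, "daily_activity": 3300.0}

print(is_responder(before, after_all))  # all three components clear the threshold
print(is_responder(before, after_two))  # daily activity falls short
```

The point of the `all(...)` is that a big gain on one component cannot buy back a miss on another.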
I was involved, as a subject, in ongoing research dating back to 1993 that used exactly this kind of thing. It was never published in full, only pieces of some theoretical ideas were published, with some preliminary observations, at least to my knowledge. Not every paper on the topic was available for me to read. It ended due to official opposition, lack of funding, and lack of interest.

The objective measures were vascular and metabolic. Peripheral blood flow was measured, as was metabolic rate. There was some evidence that blood flow could be directly observed in the eyes, and this was beginning to be investigated. Autonomic changes to blood flow in the eyes may be visually identified by an experienced doctor.

Had numerous attempts at funding not been denied this might have got somewhere from 22 years ago.

If anyone is not familiar with my history, this was the work of Dr Andriya Martinovic, Brisbane, Queensland, Australia. I was one of his patients and a test subject. I also provided some input on heuristic methods (hill climbing approaches in treatment) and omega 3 metabolism. Dr Martinovic spent a lot of time and his own money working on this.
 

WillowJ

คภภเє ɠรค๓թєl
Messages
4,940
Location
WA, USA
There is an important false argument here, however, shown by the anti-IL-6 case. If we know treatment X tends to increase Y (some cellular or physiological measure) then measuring Y is no help as evidence that X is causing improvement via Y. You need to measure some Z that you think is on the way between Y and feeling better. So measuring NK cell function after ampligen is not helpful if we know that ampligen increases NK function. To increase confidence that patients are better because of this one would need to measure something that we think links NK function to feeling well in the context of ME but not in someone without ME. The main advantage of measuring NK function would in fact be to show that it might not correlate and that ampligen might be working some other way.

Ok, I see. Like cholesterol medicine? All cholesterol medicine reduces blood cholesterol levels, but not all of it reduces risk of cardiovascular events (the desired effect of using the drug).

That's a good point to keep in mind here. :)
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Adding to my previous comment, Dr Martinovic created a detailed symptom checklist which in retrospect seems remarkably like a checklist of autonomic, metabolic, vascular and neurologic issues in clusters. Patients rated their response both in absolute terms and in relative terms, or in other words about current status and changes in status.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Ok, I see. Like cholesterol medicine? All cholesterol medicine reduces blood cholesterol levels, but not all of it reduces risk of cardiovascular events (the desired effect of using the drug).

That's a good point to keep in mind here. :)
This is a huge problem in medicine. Objective endpoint markers were chosen and used in place of final outcomes in so many studies. Overall mortality and morbidity are the final outcomes that matter, not serum cholesterol, or indeed any other one marker. Substitute markers for the big outcomes are in danger of putting undue emphasis on the wrong things. That is why it was missed for so long that sugar is as dangerous in the diet as it is, and why cholesterol was demonized, why we think folic acid is wonderful, and why we missed that saturated fat protects the heart. Large scale epidemiology is just as important as focused studies. These issues are not limited to medicine though. For a subject area that really stuffs up on this, look at economics.
 

Snow Leopard

Hibernating
Messages
5,902
Location
South Australia
Since a Bayesian approach was mentioned, I am wondering about what this means in terms of prior probabilities of improvement and personalisation of measures. E.g. in terms of improvement, someone with a well below average activity level would be expected to return to the group/population mean if the treatment was highly successful. But if another measure, say some measure of OI, or heart rate variability, or sleep issues etc, was more or less normal before the treatment, then requiring some degree of improvement on this measure to achieve some sort of improvement grade would not make sense.

I am starting to see the usefulness of the composite measure, that doesn't intend to measure improvement on each and every measure and claim an improvement = the treatment works. The key being achieving a certain grade across all measures relative to a norm perhaps? - so zero improvement might be necessary on a particular measure, if the result was already acceptable on that measure. But a very large improvement on another measure might be necessary to achieve a particular grade, since the result on that measure was well below average.

The only problem with this approach is avoiding the statistical noise already discussed, due to day to day variation. Sampling over time is possible, but obviously places additional burdens on the participants.
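One way the norm-relative grading could look in practice, as a sketch only. The assumptions are mine: each measure is scored so that higher is better, "norm" is a single acceptable cut-off per measure, a measure that starts at or above the norm only has to stay there, and a measure below the norm has to close a set fraction of its deficit. The measure names and numbers are invented.

```python
def deficit_closed(baseline, followup, norm):
    """Fraction of the gap between baseline and the norm closed at follow-up."""
    return (followup - baseline) / (norm - baseline)


def meets_grade(measures, required_fraction=0.5):
    """measures maps a name to (baseline, followup, norm); higher means better."""
    for baseline, followup, norm in measures.values():
        if baseline >= norm:
            # Already acceptable at baseline: no improvement is required,
            # but (an added assumption) it must not drop below the norm.
            if followup < norm:
                return False
            continue
        if deficit_closed(baseline, followup, norm) < required_fraction:
            return False
    return True


# Made-up example: activity starts far below the norm, HRV starts normal.
good_response = {
    "daily_steps": (2000.0, 5000.0, 7000.0),
    "hrv_score": (50.0, 50.0, 45.0),
}
weak_response = {
    "daily_steps": (2000.0, 3000.0, 7000.0),
    "hrv_score": (50.0, 50.0, 45.0),
}

print(meets_grade(good_response))
print(meets_grade(weak_response))
```

This does nothing about day to day variation; each value would presumably need to be an average over several days.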

Maybe the key thing is to move focus away from complex questionnaires.

Which actually brings up a key point. "Questionnaire fatigue". I don't mean in an illness sense, but in a psychological sense where people get overwhelmed with excessively long lists of questions and answer less accurately as a result.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
I think it is possible to have subjective and quantitative measures which can be very useful - even without being blinded. Particularly if you are looking at improvement per individual.
I think this relates to what you are trying to do. Markers for biomedical diagnostic or pathophysiological studies may have a different focus than markers for rehabilitative therapeutic studies. Outcomes for interventions may have a different focus. Drug therapy outcome measures are not the same as coping strategy measures. Measures used in basic research might be different from measures used for assessing disability.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
I would like to add a comment about timing of tests, to add to what has already been said.

I have a difficult time with doctors who think that tests need to be fasting and early morning, for example. They cannot comprehend I have no morning, or that sometimes I do not sleep at all. Timing of tests is critical for some tests. Yet what does this mean if you are supposed to have an overnight fast, but that overnight is the only time you are awake? And you are diabetic? I argue with new doctors on these issues far too often.

Standardized early morning or post prandial or whatever kind of test you want is highly problematic in a disease with multiple shifts in circadian patterns, both in terms of chemistry (e.g. cortisol) and behaviour (e.g. sleep). My early morning tests might be the middle of my sleep time, or the middle of the day, or in the middle of several days without any sleep at all, or even far too much sleep.
 

Snow Leopard

Hibernating
Messages
5,902
Location
South Australia
A side issue, the debate about 'successful' blinding in double blinded studies.

I don't believe that a simple yes/no/unsure question about whether the patient believes they had the active treatment at the end of the trial is sufficient.

The fact is that patients who improve are naturally going to think they received the active treatment. If the treatment is successful in most patients, then most patients are going to indicate that they received the active treatment and we have no idea about the directionality, or whether the blinding was successful.

I was thinking more along the lines of:

A yes/no (no option for 'unsure'), along with an indication of confidence in this answer.

It would be expected that given this question at the start of the trial, due to the human bias towards optimism, more patients would expect to be in the active treatment arm.

As a follow-up, an indication of why the patient believes they are in a particular arm, e.g. whether it was due to significant side effects, improvement, gut feeling or another indicator.
 

WillowJ

คภภเє ɠรค๓թєl
Messages
4,940
Location
WA, USA
I read that Apple were going to add a heart monitor to their smart watch but didn't in the end because it didn't meet medical standards. This made me wonder about the accuracy of small devices but that may not matter for the type of readings necessary for this type of monitoring.

I read that lots of them were not accurate, but it seemed like some were, so seems possible.

Do we need to differentiate remission from recovery?
Yes, I think it would be better to acknowledge that remission is something that happens. Recovery could either not really be used (as it's not much in other diseases where remission is known) or held to a very high standard.

I think the Bell scale needs operationalising in terms of activities. Hard to say what 30% of normal is, when one person's normal might have been running marathons and another's being a couch potato. And is it 30% of energy, 30% of hours awake being able to do moderate stuff... I don't know what it means.
Also the Bell scale lumps unlike things together. For me, physical and mental ability don't always travel together tightly.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Also the Bell scale lumps unlike things together. For me, physical and mental ability don't always travel together tightly.
My cognitive and physical ups and downs do not match. There can be lock step changes, usually from PEM, but they can move separately. In the mid 90s most of my big issues were physical, but the cognitive aspects were rising. A decade later my physical aspects were in decline, but a slow attrition of cognitive capacity was underway.

It's also the case that physical and cognitive capacities are multidimensional.
 

Kyla

ᴀɴɴɪᴇ ɢꜱᴀᴍᴩᴇʟ
Messages
721
Location
Canada
Some interesting ideas here:
http://solvecfs.org/deciphering-post-exertional-malaise/


"Current clinical descriptions and case definition criteria of PEM are not currently quantified well-enough to be used in the laboratory setting. As a result, most studies have had to operationally define PEM. This has led to PEM being described and defined in a host of different ways including:

  • Increased symptoms of pain & fatigue
  • Reduced physical activity levels
  • Abnormal exercise responses
  • Changes in cognitive function
  • Circadian rhythm/period changes
  • Sleep disruption
  • Impaired pain regulation (central nervous system sensitivity)
  • Various biological markers (e.g. complement C4a, cytokines, natural killer cells, NF-κB, oxidative stress, gene expression, etc.)
Too many studies that have defined PEM based on biological outcomes have failed to measure symptoms. This is problematic because unless you demonstrate that biology is related to illness severity, you cannot attribute any changes in biology to the phenomenon of interest (in this case PEM)."


 

beaker

ME/cfs 1986
Messages
773
Location
USA
Here is one of the confounds on cognitive testing. I have been writing a ton of light stuff lately, not requiring deep research and thinking. My activity on that is high, despite averaging one to three hours sleep and variable short naps. Yesterday I had occasion to try to add three to a number. Three. I failed. So I tried counting from the first number in three steps. I failed. So I tried doing it using my fingers. Success! I managed to function at a 4-5 year old level!

It's very hard to capture things like that in standard cognitive batteries.
If it makes you feel any better, the only way I can do math since this plague is using my fingers. Even the most simple. I used to be quite good.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
If it makes you feel any better, the only way I can do math since this plague is using my fingers. Even the most simple. I used to be quite good.
One ME friend of mine uses a calculator for even basic math. These problems are not uncommon; many patients report them. Indeed I first ran into this math problem about 1992 or so. Then it was occasional. Now it's nearly all the time. I just had a real world moment the other day when I needed to figure out something. It's a useful anecdote.
 

Valentijn

Senior Member
Messages
15,786
A simple example, for POTS. Forget the tilt table test (it's unnecessarily unpleasant and demanding for a start). Instead have 24hr ecg followed up by a wrist strap or even a pulse oximeter for the patient to monitor and record day to day.
I agree ... a TTT can trigger a crash just as much as a CPET can. Unless the treatment is specifically aimed at OI treatment, I wouldn't see it as a good routine measurement to use.

I like the idea of using a pulse oximeter, but it would have to be matched up with data from an actometer. Then you can see if the heart rate is elevated due to activity, or if it's elevated even when resting, due to a recent crash. It's one of my favorite ways to monitor my capabilities, but definitely needs additional context to understand what's happening over time.
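For what it's worth, the matching-up step could be sketched like this. It assumes the pulse oximeter and actometer readings have already been paired by timestamp, and both cut-offs (`rest_activity_max` and `resting_hr_limit`) are placeholder numbers I picked for illustration, not clinical thresholds.

```python
def resting_elevated_fraction(samples, rest_activity_max=10, resting_hr_limit=100):
    """Share of resting samples whose heart rate exceeds the limit.

    samples: iterable of (heart_rate_bpm, activity_count) pairs recorded together.
    Returns None when there are no resting samples to judge.
    """
    resting = [hr for hr, activity in samples if activity <= rest_activity_max]
    if not resting:
        return None
    elevated = sum(1 for hr in resting if hr > resting_hr_limit)
    return elevated / len(resting)


# Made-up paired readings: (heart rate in bpm, activity count).
samples = [(72, 2), (118, 85), (109, 4), (76, 0), (125, 120), (104, 6)]

print(resting_elevated_fraction(samples))
```

High readings taken during activity (118 and 125 here) are simply ignored, so what is left is the share of resting time with a raised heart rate, which can then be tracked from day to day.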
 

Jonathan Edwards

"Gibberish"
Messages
5,256
Since a Bayesian approach was mentioned, I am wondering about what this means in terms of prior probabilities of improvement and personalisation of measures. E.g. in terms of improvement, someone with a well below average activity level would be expected to return to the group/population mean if the treatment was highly successful. But if another measure, say some measure of OI, or heart rate variability, or sleep issues etc, was more or less normal before the treatment, then requiring some degree of improvement on this measure to achieve some sort of improvement grade would not make sense.

Yes, I think that has to be factored in. In the ACR grading this is handled by having some measures as alternative options. So one measure is improvement of fatigue on visual analogue scale. Some patients do not think of themselves as having fatigue so they would opt out of that one and be scored on something else. There is still a pretty narrow range of options and some cases do fall foul of this - they may get a major benefit that cannot be scored. My thought would be that for ME the range of options and which things were optional and which compulsory would need to be handled a bit differently - but within the same general structural principles.

One thing that struck me is that one might see how a proposed composite measure lined up with the SEID or CCC criteria. I am not saying that it should, because the purposes are different, but it might be interesting just to ask why there are differences in range of features if there are. Should there be a visual analogue scale for how unrefreshing sleep is? Maybe not because it would be hard to quantify? Should PEM be excluded on the basis that people try to avoid it, or should some sort of PEM challenge be a compulsory feature? If both of those are tricky for SEID we get left with fatigue related symptoms as 1, major impact on daily activity as 2 and OI and thinking problems as options to test for 3?
 

user9876

Senior Member
Messages
4,556
The only problem with this approach is avoiding the statistical noise already discussed, due to day to day variation. Sampling over time is possible, but obviously places additional burdens on the participants.



Which actually brings up a key point. "Questionnaire fatigue". I don't mean in an illness sense, but in a psychological sense where people get overwhelmed with excessively long lists of questions and answer less accurately as a result.

I think rather than seeing day by day variation as statistical noise we should view day to day function as a time series. The problem of seeing it as noise is that it is likely to not be independent and a lot of stats techniques assume errors are independent often with Gaussian distributions.
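A quick way to check whether daily scores look independent is the lag-1 autocorrelation, i.e. how strongly each day's value tracks the day before. The sketch below uses the standard sample estimator in plain Python; the two example series are made up.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of daily scores."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    numerator = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return numerator / denominator


# Made-up daily function scores.
slow_drift = [3, 3, 4, 5, 5, 6, 6, 5, 4, 3]   # good and bad spells lasting days
alternating = [3, 6, 3, 6, 3, 6, 3, 6]        # push one day, crash the next

print(lag1_autocorrelation(slow_drift))
print(lag1_autocorrelation(alternating))
```

A value near zero is what independent noise would give. Clearly positive or negative values mean the days are linked, so techniques that assume independent errors will misjudge the uncertainty.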

Error on question answers is an important source of errors and it's worth just asking a few questions with line type scales rather than lots. Likert's work suggested asking basically the same question a number of times, hence it is valid to add up the results, and this was intended to reduce errors on one answer. However, there is a danger that people just click down the same set of buttons on any survey or at the end of a survey - again this is problematic as errors are correlated.

If sensors are used they should have some understood error characteristics - small overall changes shouldn't be considered significant if they are within these likely error bounds.
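As a minimal sketch of that check: if each reading can be off by up to plus or minus some stated error, the difference between two readings can be off by up to twice that in the worst case. The function name and the example error of 3 bpm are assumptions for illustration, not the specification of any real device.

```python
def change_exceeds_error(before, after, sensor_error):
    """True if the change is larger than the worst-case combined error.

    Each reading may be off by up to +/- sensor_error, so their difference
    may be off by up to 2 * sensor_error.
    """
    if sensor_error < 0:
        raise ValueError("sensor_error must be non-negative")
    return abs(after - before) > 2 * sensor_error


# Made-up resting heart rates (bpm) with an assumed +/- 3 bpm device error.
print(change_exceeds_error(88, 85, 3))  # 3 bpm change, within the error bounds
print(change_exceeds_error(88, 78, 3))  # 10 bpm change, outside them
```

A worst-case bound is deliberately conservative. Averaging repeated readings would narrow it, but only if the errors are not correlated, which is the same caveat as above.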
 

Jonathan Edwards

"Gibberish"
Messages
5,256
I like the idea of using a pulse oximeter, but it would have to be matched up with data from an actometer. Then you can see if the heart rate is elevated due to activity, or if it's elevated even when resting, due to a recent crash. It's one of my favorite ways to monitor my capabilities, but definitely needs additional context to understand what's happening over time.

That is a nice way to resolve a similar worry I had. Inappropriately high heart rate would seem a reasonable physiological measure for the autonomic/vasoregulatory side. You might call it deconditioning but that would still be a real physiological change in the right causal path to confirm that a treatment was shifting the underlying mechanism I think. (And it seems that the deconditioning enthusiasts have mostly decided there isn't deconditioning anyway!)