• Welcome to Phoenix Rising!

    Created in 2008, Phoenix Rising is the largest and oldest forum dedicated to furthering the understanding of, and finding treatments for, complex chronic illnesses such as chronic fatigue syndrome (ME/CFS), fibromyalgia, long COVID, postural orthostatic tachycardia syndrome (POTS), mast cell activation syndrome (MCAS), and allied diseases.

Measures of outcome for trials and other studies

Jonathan Edwards

"Gibberish"
Messages
5,256
Excuse me for being very thick here but can I just clarify by talking about my own case. For me beta-blockers and naltrexone are good symptomatic treatments; each controls one of my ME symptoms (POTS and sleep respectively) quite well.

So using this composite measure in a trial of naltrexone as an overall ME treatment I would (1) rate my overall improvement as say 10% because it achieves good control of one of my top five symptoms, (2) my TTT shows no change, and (3) my actometer data shows I'm a little bit more active because I am less groggy in the mornings but still have to pace carefully to avoid PEM.

So naltrexone comes out as an ineffective treatment for my ME although it is good for single-symptom control. Which would be a true reflection in my case.

But if you tested beta blockers as an ME treatment I would (1) rate my improvement as 20-50% because controlling OI allows me to be much more active, up to my PEM limit (2) my TTT would be normalish (3) my actometer would confirm (1)

So beta blockers are a moderately effective treatment for my ME which is also true. So for the subgroup that has OI this test would be good.

Hypothetically a really good treatment (like rituximab) affects a number of my symptoms, including PEM, OI and sleep, which would lead to a higher overall improvement rating, still supported by (2) TTT and (3) actometer.

Is that what we are saying? Sorry again to be so thick

Not thick at all, this is very relevant. There is no one answer to what measures to use and that needs to be kept in mind because people forget it. A trial of paracetamol (Tylenol) for pain in RA should not use ACR grading, just a pain scale. In fact the swollen and tender joint score part of ACR was invented for the ibuprofen trial; ibuprofen reduces acute inflammation but not the chronic disease process. It may not be entirely suitable for testing more modern drugs like TNF inhibitors or rituximab. The tenderness is probably not very helpful but it was maybe the thing that changed most for ibuprofen.

I think part of the idea of mixing different sorts of measure is to gain confidence that 'the treatment is having an underlying effect of the sort we hypothesised'. So we want evidence that the person says they feel better because their physiology has changed in a way we might have predicted.

There is an important false argument here, however, shown by the anti-IL-6 case. If we know treatment X tends to increase Y (some cellular or physiological measure) then measuring Y is no help as evidence that X is causing improvement via Y. You need to measure some Z that you think is on the way between Y and feeling better. So measuring NK cell function after ampligen is not helpful if we know that ampligen increases NK function. To increase confidence that patients are better because of this one would need to measure something that we think links NK function to feeling well in the context of ME but not in someone without ME. The main advantage of measuring NK function would in fact be to show that it might not correlate and that ampligen might be working some other way.

It all gets a bit complicated here!
 

Sidereal

Senior Member
Messages
4,856
With regard to neuropsychological testing, it's frequently used in psych studies of all sorts of conditions. I don't personally believe it's very useful because patients do not like doing the tests so you often end up with a lot of missing data which then makes meaningful interpretation of results difficult or impossible. Performance is heavily influenced by anxiety, motivation, fatigue, mood etc.

In ME/CFS performance is reasonably ok on standard neuropsychological testing and the general consensus in the psychiatric "CFS" circles is that objective measures do not corroborate patients' self-reported cognitive deficits. I would argue this is because of inappropriate study design. These tests were originally invented before brain imaging technology to localise brain lesions and such. In recent decades the tests have been (inappropriately and uncritically, in my view) applied to neuropsychiatric conditions where problems are much more subtle but still life-altering. There is already some evidence available that cognitive performance in ME/CFS degrades if you subject them to orthostatic stress while being tested. I would imagine the same thing would happen if the day before they underwent CPET or something like that.

Much more useful would be something along the lines of what @Marco is suggesting (ERPs).
 

Jonathan Edwards

"Gibberish"
Messages
5,256
Regarding this issue of measuring a change in symptoms, rather than measuring symptoms in an absolute sense, I will just copy a post I made in another thread:



In summary: because of the brain fog and memory problems inherent in ME/CFS, patients' own subjective, introspective judgment of their current health and symptom level may be unreliable. But asking patients to gauge their current health using natural objective measures in their daily life will probably be more dependable.

The only difficulty with this is that these natural objective measures will differ from one patient to the next, as the measures will depend on their lifestyle.

I actually think there is a deep reason for this being right. With my philosophy hat on I could say that there is actually no meaning to an absolute measure of a symptom of illness (this gets to Wittgenstein in the end). There has been a recent trend to believe that all these things can be measured as quantities like pounds of tomatoes but I think that is wrong. Fortunately, in practice it is change that we are interested in.

The problem of different measures for different people is manageable. I developed an ultimate solution in a paper I wrote on a trial we did in lupus in the 1980s - where each person had their own measures defined pre-trial. It causes no statistical problems of note and saves far more unnecessary work than it generates. But it does have some impact on interpretation. I still think that is manageable.
 

Jonathan Edwards

"Gibberish"
Messages
5,256
Quick note on neuro cog tests: I was in a study where many tests were done in this area. But if you don't do the tests in the same order every time you can't compare. Rough example: say math is first the first time, with reading comprehension and memory following, and the next time reading is first, then memory, then math. Math will be worse because it now gets the last of the energy when before it got the first. I pointed this out to the neuropsych who was doing this and they changed the study to keep the order fixed. Also consider time of day. Some people function at different levels based on time of day. Make sure the same tests (no matter which ones) are done at the same time of day.

I have a feeling that psychologists love inventing complicated tests with complicated scoring protocols. I suspect this is to hide the fact that nobody is quite sure what they want to measure. I would prefer to find a single task that seemed to typify a particular problem for a particular person and make that their measure and keep it very stereotyped, as you say.
 

Sasha

Fine, thank you
Messages
17,863
Location
UK
From my perspective, most arguments against the 2 day CPET also apply to spinal taps. They are also dangerous. Does that mean we should not have spinal fluid studies? No process is perfect.

We really really need more work on outcome markers.

What I was talking about was not having dangerous tests (2-day CPET, spinal taps) as part of a standard battery of tests in all ME/CFS clinical trials.

If researchers want to do studies specifically on VO2 max or on spinal fluid, that's another story, but as a standard battery of tests, I think they're to be avoided.

(I don't mean to shout with that bolding but gentle italics don't show up well!).
 

Jonathan Edwards

"Gibberish"
Messages
5,256
But can we be confident that even the first day of testing is harmless?

I tend to agree that things like CPET are going to be off the menu as standard components of a scoring system even if there might be specialised uses or they might be used for individuals judged to be low risk as one of several options (even that I doubt really). I think we may well often want assessments at multiple time points to gain kinetic/longitudinal data as part of understanding mechanisms and, apart from anything else, I think a CPET study is probably too time consuming and disruptive to be done, say, monthly for a year.

Edit: you got there first Sasha.
 

lansbergen

Senior Member
Messages
2,512
I think assessment of Alzheimer's is mostly objective because being able to add thirteen and twenty one or remember your address is not something likely to be biased by the way you feel.

I lost those skills when I got worse and worse but only during flares. Between flares I was good enough to make my medical trained contacts say I was not dementing.
 

fds66

Senior Member
Messages
231
I have not been able to read all the replies so forgive me if I just repeat what other people have said.

From my own personal experience:

I can reduce the intensity of my symptoms by being really careful about pacing myself but that is not really an improvement in the underlying condition. If I do a little more the symptoms just come back. So just measuring pain, nausea, exhaustion etc would not be an indication of underlying health for me.

I use a 3d pedometer every day and record how many steps I take. I have done this for years and I find the data helpful to fine tune my management. I find that it is one measure of how well I am doing but it does vary from day to day. An average over several days is sometimes a more reliable measure of my level of activity and I find is more solid in seeing the effects of changes in the level of activity. I can increase those steps temporarily but it will come to bite me eventually. I often use it to look back over the last few weeks to see if my activity has increased which often explains why I am feeling worse. So any pedometer type reading would have to be over a period of time to account for day to day variation and the delayed effects of consistent slight increases in activity which can take longer to hit me than a massive overdoing on one day. Sometimes the full impact will only be seen weeks or months later for very slight increases.
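
A minimal sketch of the averaging idea above, assuming the pedometer readings are available as a plain list of daily step counts. The function name, the window length and the numbers are all made up for illustration; nothing here comes from an actual trial protocol.

```python
from statistics import mean


def rolling_mean(daily_steps, window=7):
    """Trailing mean over `window` days, for each day that has a full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        mean(daily_steps[i - window + 1 : i + 1])
        for i in range(window - 1, len(daily_steps))
    ]


# Hypothetical step counts for ten days (illustrative numbers only).
steps = [1200, 900, 1500, 800, 1100, 2400, 700, 1000, 1300, 950]

# A 3-day window smooths out a single "overdoing it" day such as day 6.
smoothed = rolling_mean(steps, window=3)
print(smoothed)
```

A longer window hides day-to-day noise but also blurs the slow drift described above, so the window length would itself need choosing with the delayed effects in mind.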

A pedometer can only track physical activity. I have yet to find a good, reliable and easy to implement way to track mental or emotional exertion. These can just as easily cause deterioration in my symptoms.

These are just observations from my attempts at monitoring and managing my condition.
 

user9876

Senior Member
Messages
4,556
One issue I can see is the timing of any testing. If you happen to test someone on a good day that could skew results. If you give people a set of questionnaires they may wait for a good day to do them. If tests are in a clinical environment people might rest up prior to going since they know it will be difficult but they may have difficulty getting there which could produce PEM.

So I was wondering about a strategy of fairly continuous but low level testing and recording. For example:
Giving people a cognitive test problem once or twice a day
Using accelerometers to monitor activity continuously
Getting people to fill out a simple line scale of things like fatigue, pain, their perceived recent activity levels
Some simple daily living task check boxes for what people have managed.

The idea would be to keep it quite light but to have say a smart phone app which people could use to do these things. There may be options to say 'forgot' or 'too ill to record' for missing data. I wonder if this would be relatively easy to do and give useful trend information rather than just a small number of one off tests. You may still want to do the additional one off testing.
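
One way such a daily record might be laid out, as a rough sketch only. The field names, the 0-100 scale and the `Missing` categories are assumptions for illustration, taken loosely from the 'forgot' and 'too ill to record' options mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional, Tuple


class Missing(Enum):
    """Why a day has no self-reported data (hypothetical categories)."""
    FORGOT = "forgot"
    TOO_ILL = "too ill to record"


@dataclass
class DailyEntry:
    day: date
    fatigue: Optional[int] = None          # assumed 0-100 line scale; None if not recorded
    pain: Optional[int] = None             # assumed 0-100 line scale; None if not recorded
    tasks_done: Tuple[str, ...] = ()       # ticked daily-living check boxes
    missing_reason: Optional[Missing] = None


def too_ill_fraction(entries):
    """Share of days explicitly marked 'too ill to record'."""
    if not entries:
        return 0.0
    too_ill = sum(e.missing_reason is Missing.TOO_ILL for e in entries)
    return too_ill / len(entries)
```

Keeping the reason for a gap separate from the gap itself means days marked 'too ill to record' can be counted rather than silently dropped.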

Data analysis can then be performed across all the time series including looking for correlations and delayed correlations.
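
A small sketch of what a 'delayed correlation' could look like, assuming two equal-length daily series with no gaps (activity and a symptom score). The function names and the data are invented for illustration, and a real analysis would need to handle missing days and autocorrelation.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


def lagged_correlation(activity, symptom, lag):
    """Correlate activity on day t with the symptom score on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    paired_activity = activity[: len(activity) - lag]
    paired_symptom = symptom[lag:]
    return pearson(paired_activity, paired_symptom)


# Made-up data in which symptoms echo activity two days later.
activity = [1, 5, 2, 8, 3, 9, 2, 4]
symptom = [0, 0, 1, 5, 2, 8, 3, 9]

# Scan a few lags and report the one with the strongest correlation.
best_lag = max(range(4), key=lambda k: lagged_correlation(activity, symptom, k))
print(best_lag)
```

Scanning over lags like this is only exploratory; with many lags and many measures some correlations will turn up by chance.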

In terms of getting such an app written for a trial you could try to ask in a comp sci department if they could get students to develop them as part of their undergrad project work. But it might be too late for this year.
 

Jonathan Edwards

"Gibberish"
Messages
5,256
JE: I think assessment of Alzheimer's is mostly objective because being able to add thirteen and twenty one or remember your address is not something likely to be biased by the way you feel.

Lansbergen: I lost those skills when I got worse and worse but only during flares. Between flares I was good enough to make my medical trained contacts say I was not dementing.

JE: Fair enough. I guess I should have said biased by what you believe or by the way you feel about being in a treatment trial.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
Here is one of the confounds on cognitive testing. I have been writing a ton of light stuff lately, not requiring deep research and thinking. My activity on that is high, despite averaging one to three hours sleep and variable short naps. Yesterday I had occasion to try to add three to a number. Three. I failed. So I tried counting from the first number in three steps. I failed. So I tried doing it using my fingers. Success! I managed to function at a 4-5 year old level!

It's very hard to capture things like that in standard cognitive batteries.
 

alex3619

Senior Member
Messages
13,810
Location
Logan, Queensland, Australia
The use of a two day CPET goes way beyond studying CPET. Just as you would not use spinal taps to cross validate all studies, nor would you use a 2 day CPET. Yet if you wanted to validate a standard, or definition, then there should at least be a substudy using it in my view. This is entirely separate from researching the technique.

As I have said a bunch of times now, going back months, you don't even need the exercise component to show problems. Metabolic rate is measurable using gas analysis. You would not measure PEM or similar, but you would measure current energy production.
 

Jonathan Edwards

"Gibberish"
Messages
5,256
One issue I can see is the timing of any testing. If you happen to test someone on a good day that could skew results. If you give people a set of questionnaires they may wait for a good day to do them. If tests are in a clinical environment people might rest up prior to going since they know it will be difficult but they may have difficulty getting there which could produce PEM.

So I was wondering about a strategy of fairly continuous but low level testing and recording. For example:
Giving people a cognitive test problem once or twice a day
Using accelerometers to monitor activity continuously
Getting people to fill out a simple line scale of things like fatigue, pain, their perceived recent activity levels
Some simple daily living task check boxes for what people have managed.

The idea would be to keep it quite light but to have say a smart phone app which people could use to do these things. There may be options to say 'forgot' or 'too ill to record' for missing data. I wonder if this would be relatively easy to do and give useful trend information rather than just a small number of one off tests. You may still want to do the additional one off testing.

Data analysis can then be performed across all the time series including looking for correlations and delayed correlations.

In terms of getting such an app written for a trial you could try to ask in a comp sci department if they could get students to develop them as part of their undergrad project work. But it might be too late for this year.

Yes I like that sort of idea. Maybe a daily sudoku? But only for sudokuphiles maybe. Cognitive tests on a daily basis might induce a significant learning effect, but then I never seem to get any quicker on the 'challenging' sudokus.

An ACR grading assessment takes about five minutes (apart from the time the lab take to process the ESR sample). I find it hard to believe that you cannot judge where someone is with ME in five minutes (apart from the time it takes for them to do the sudoku). The underlying physiological indices (like tilt table or bloodwork) might take longer but being measures of underlying change one would expect only to need those monthly at the very most.
 

user9876

Senior Member
Messages
4,556
Yes I like that sort of idea. Maybe a daily sudoku? But only for sudokuphiles maybe. Cognitive tests on a daily basis might induce a significant learning effect, but then I never seem to get any quicker on the 'challenging' sudokus.

An ACR grading assessment takes about five minutes (apart from the time the lab take to process the ESR sample). I find it hard to believe that you cannot judge where someone is with ME in five minutes (apart from the time it takes for them to do the sudoku). The underlying physiological indices (like tilt table or bloodwork) might take longer but being measures of underlying change one would expect only to need those monthly at the very most.

I was thinking that there would be a learning effect but that could be measured in people who are well. I was also thinking of a range of tests. I remember the Nintendo brain training game which had various ones, such as testing, say, speed of adding up, memory, word problems etc. I was thinking they could be randomly chosen each day. It is the trend that is important but it may be that different types of tests prove better and if the data is tagged appropriately they can be pulled out separately.
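
To make that a bit more concrete, here is a rough sketch assuming each result is stored as a (test type, session number, score) tuple. The tag names, the record layout and the idea of subtracting the healthy-control mean at the same session are illustrative assumptions, not an established method.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical tags for the kinds of test mentioned above.
TEST_TYPES = ("arithmetic_speed", "memory", "word_problem")


def pick_daily_test(rng):
    """Choose today's test type at random from the tagged set."""
    return rng.choice(TEST_TYPES)


def control_baseline(control_records):
    """Mean healthy-control score per (test_type, session_number)."""
    grouped = defaultdict(list)
    for test_type, session, score in control_records:
        grouped[(test_type, session)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def practice_adjusted(patient_records, baseline):
    """Patient score minus the control mean at the same test type and session.

    Records with no matching control data are skipped rather than guessed.
    """
    return [
        (test_type, session, score - baseline[(test_type, session)])
        for test_type, session, score in patient_records
        if (test_type, session) in baseline
    ]


rng = random.Random(0)  # seeded so the daily schedule is reproducible
todays_test = pick_daily_test(rng)
```

Because every record carries its test-type tag, the adjusted scores for any one kind of test can be filtered out and looked at on their own.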

The problem with judging someone with ME is around how much the condition fluctuates. I think doctors often get a bad impression since they only see patients when they are well enough to attend a clinic.
 

Valentijn

Senior Member
Messages
15,786
I was thinking that there would be a learning effect but that could be measured in people who are well. I was also thinking of a range of tests. I remember the Nintendo brain training game which had various ones, such as testing, say, speed of adding up, memory, word problems etc. I was thinking they could be randomly chosen each day.
I was thinking of that game too ... I have "Brain Age 2" for the Nintendo DS. I haven't played it in years because it's probably too depressing now :p
 

MeSci

ME/CFS since 1995; activity level 6?
Messages
8,231
Location
Cornwall, UK
Outcomes for studies on Alzheimer's for example would be totally based on subjective measures. The problem is that ME/CFS is held to a higher standard based on scepticism about the actual disease and muddiness (some manufactured) around definitions and diagnosis.
So I think a salient question is what outcome measure is the most bullet-proof to criticism?

Ask the biggest ME sceptic you can find what measure they would accept as a positive result in a drug trial?

Slightly O-T, but results of drug trials in Alzheimer's are not necessarily just subjective. I have studied this, and measures include brain scans to see whether there is any neuronal regrowth or a slowing of loss, and also assessment of whether beta-amyloid plaques have been reduced in size.
 

duncan

Senior Member
Messages
2,240
@Jonathan Edwards

The thing is - at least as far as objective markers in neuro-cognitive tests - questions like what does 13 plus 21 equal, or what is your address, can be swayed by how you feel.

I have days where clarity is better than others, where memory is worse than others, IQ performance, etc - all these factors can be influenced based on how well I feel.
 

Sasha

Fine, thank you
Messages
17,863
Location
UK
So I was wondering about a strategy of fairly continuous but low level testing and recording. For example:
Giving people a cognitive test problem once or twice a day
Using accelerometers to monitor activity continuously
Getting people to fill out a simple line scale of things like fatigue, pain, their perceived recent activity levels
Some simple daily living task check boxes for what people have managed.

The idea would be to keep it quite light but to have say a smart phone app which people could use to do these things. There may be options to say 'forgot' or 'too ill to record' for missing data. I wonder if this would be relatively easy to do and give useful trend information rather than just a small number of one off tests. You may still want to do the additional one off testing.

Data analysis can then be performed across all the time series including looking for correlations and delayed correlations.

In terms of getting such an app written for a trial you could try to ask in a comp sci department if they could get students to develop them as part of their undergrad project work. But it might be too late for this year.

Just wanted to give a bigger thumbs up for this!

Patients duck and dive to pace themselves the whole time, both within a day and over days, and we rest up in advance and after big activities. Activity on any given day does indeed mean not much. But we can't cheat our bodies in the long run.
 

Gijs

Senior Member
Messages
691
I think a PET scan would be the best objective measurement if replicated: Neuroinflammation in Patients with Chronic Fatigue Syndrome/Myalgic Encephalomyelitis: An 11C-(R)-PK11195 PET Study. Maybe you can put some patients in the rituximab trial through this test before giving this drug.
 

Jonathan Edwards

"Gibberish"
Messages
5,256
@Jonathan Edwards

The thing is - at least as far as objective markers in neuro-cognitive tests - questions like what does 13 plus 21 equal, or what is your address, can be swayed by how you feel.

I have days where clarity is better than others, where memory is worse than others, IQ performance, etc - all these factors can be influenced based on how well I feel.

Yes, I hoped that I had clarified that. Subjectivity is a problem if the way you feel about the context of being in a treatment study (or, for an assessor, about being an assessor in a treatment study) can bias your response. If you feel too ill to add up that is still an objective measure of illness in this sense. Or at least if you do get the right answer that excludes being too ill to add up in whatever sense. Behavioural responses do not need to be considered subjective in contrast to imaging studies. If someone can add that is an objective indicator that they can add. There are all sorts of layers to this but the real problem we need to overcome is the bias that can come from being in a trial and believing certain things as a result.