Measures of outcome for trials and other studies

Jonathan Edwards

"Gibberish"
Messages
5,256
Good. So the issue isn't with the use of subjective measures but with blinding.

Actually the issue is with the combination. Blinding plus subjectivity is fine. Unblinding plus objectivity is fine. The function of blinding is to eliminate the bias from subjectivity so it is subjectivity without blinding that is not fine.

I am not saying that this was not already clear to you, but if it was not you would have been in eminent company it seems!
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Actometry/ actigraphy seems to have a lot going for it here although it may need to be based on an iWatch7 app not yet invented to be sensitive and flexible enough.
Were you aware that there are various fitness monitors available on the market with associated apps, that can measure total daily activity, and various other variables? e.g. they can count steps made (walking/running), estimate distance walked, and measure time slept. I don't know if any of them have been tested in terms of quality and reliability for research purposes. Some products were tested for accuracy on the BBC, and there was quite a degree of variability between brands, and they weren't entirely accurate at counting steps. Their presenter ran 300 steps to test various products. The FitBit measured 262 steps instead of 300. Another product (Garmin) did better and counted 293 steps. I suppose what's most important for research is that each individual product has a limited and reliable margin of error.

Here's the BBC trial in case anyone is interested (see video):
http://www.bbc.co.uk/news/technology-31113602
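
To put the BBC figures quoted above on a common scale, the counts can be turned into a signed percentage error. This is only a small illustrative sketch: the function name `percent_error` is made up for the example, and the only data are the numbers from the post (300 actual steps, 262 and 293 counted).

```python
def percent_error(measured: int, actual: int) -> float:
    """Signed error of a step count as a percentage of the true count."""
    return 100.0 * (measured - actual) / actual


ACTUAL_STEPS = 300  # steps the presenter ran in the BBC test
readings = {"FitBit": 262, "Garmin": 293}  # counts quoted in the post

errors = {name: percent_error(count, ACTUAL_STEPS) for name, count in readings.items()}

for name, err in errors.items():
    # A negative value means the device under-counted.
    print(f"{name}: {err:+.1f}%")
```

On these numbers the FitBit under-counts by roughly 13% and the Garmin by roughly 2%. A single 300-step run says nothing about how repeatable each error is, which is the property that matters for research.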
 

daisybell

Senior Member
Messages
1,613
Location
New Zealand
I have a Fitbit and I know it over-reads... It picks up any vigorous arm movements and counts these as steps. So I don't interpret the count as steps taken but more as a measure of general activity. If I've been grating veg, or stirring a pot then the Fitbit counts... But I do get an overall measure of activity, which varies little from day to day and does reflect how good I am feeling.

The important thing I think in any measure of actimetry is that everyone uses exactly the same device in exactly the same way. The absolute numbers may represent all sorts of movement, but the change for each person should be meaningful.
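
A minimal sketch of why the absolute numbers matter less than the change, assuming the device's error is a constant proportional over-read (that assumption, the 15% figure and the step counts are all hypothetical):

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from baseline, e.g. 0.5 means +50%."""
    return (after - before) / before


# Hypothetical true daily activity for one person, before and after treatment.
true_before, true_after = 2000.0, 3000.0

# Hypothetical device that over-reads by a constant 15% (arm movements etc.).
bias = 1.15
read_before, read_after = bias * true_before, bias * true_after

true_change = relative_change(true_before, true_after)
read_change = relative_change(read_before, read_after)

# The bias multiplies both readings, so it cancels in the ratio.
print(f"true change: {true_change:.2f}, measured change: {read_change:.2f}")
```

The cancellation only holds if the over-read really is the same proportion on both occasions, which is the reason for everyone using the same device in the same way. An error that is a fixed number of extra counts per day would cancel in the absolute difference rather than in the ratio.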
 

Jonathan Edwards

"Gibberish"
Messages
5,256
I haven't done any research into this but I suspect that the technology is close to what would be needed to provide a reasonably robust picture of daily activities. Commercial apps are probably designed to extract stereotypic analyses like 'steps taken' but presumably the raw data could be used in more sophisticated ways if a research group looking at ME teamed up with the software designers. I suppose that camera-based data would be pretty informative although it might seem intrusive. I haven't been thinking about detailed practicalities.
 

MeSci

ME/CFS since 1995; activity level 6?
Messages
8,235
Location
Cornwall, UK
Commercial apps are probably designed to extract stereotypic analyses like 'steps taken' but presumably the raw data could be used in more sophisticated ways if a research group looking at ME teamed up with the software designers.

If we were going to measure steps taken, we would also need to measure stride length. For me this varies - on a good day my stride is longer than on a bad day. I think that Fitbits etc. can measure this, but I hope they would be easier to figure out than the monitor I have, where I have just about figured out how to get a heart rate reading!
 

Marco

Grrrrrrr!
Messages
2,386
Location
Near Cognac, France
OK so we are looking for a compound set of measures (cross validating and should correlate) that capture the key features of ME/CFS; are robust enough to be used in blinded and unblinded trials; would ideally include an element of personalisation and would also ideally be comparable to previous studies.

Physical function (subjective)

SF36 PF does seem to be reasonably robust at capturing limitations in physical function and is well known, understood and comparable to previous studies/norms. But as a subjective measure it is subject to bias.

Somewhat contradicting my earlier statement that functionality is more important to patients than symptoms, anecdotally I spent 7 years working full time when the severity of my symptoms was such that I really shouldn't have been working at all. You can only carry on like that so long but it does suggest that there can be an apparent disconnect between the symptoms experienced and apparent functionality. I'm sure this is a familiar scenario to many PWME.

So a measure of symptoms would be desirable - but which symptoms from the myriad possibilities?

Symptoms (subjective)

Both the IOM and Lenny Jason agree that a short list of symptoms is sufficient for diagnostic purposes.

Jason suggests that just three core symptom domains (post-exertional malaise, sleep and cognitive functioning) are necessary to discriminate between ME/CFS and other disorders.

The SEID criteria require:

A substantial reduction or impairment in the ability to engage in pre-illness levels of occupational, educational, social, or personal activities, that persists for more than 6 months and is accompanied by fatigue; post-exertional malaise, and unrefreshing sleep. At least one of the two following manifestations is also required: cognitive impairment OR orthostatic intolerance.

Both the IOM and Jason recommend that frequency and severity are measured.

Some fatigue scales suffer from ceiling effects and it could be argued that 'fatigue' is already adequately measured by the reduction in physical function measured by SF36 PF (if it covers occupational, educational, social, personal activities etc).

That leaves just PEM, unrefreshing sleep and either cognitive impairment or orthostatic intolerance.

Tailored measures

One frequent criticism of the SEID criteria was that 'it doesn't describe my illness'. Many common and often disabling symptoms don't feature as key symptoms. I can sympathise to an extent. Temperature regulation is a major issue for me and for much of the year is the major limitation on my activities/symptoms. So in addition to the 'core' symptoms individuals could nominate another symptom which they feel has a major impact on their health.

Objective measures

The absence of physiological abnormalities is the issue. Objective performance measures can be and have been used, including the 6MWT, CPET, 2 day CPET etc. Ethical/safety issues aside, these are snapshots and also cumbersome and unsuitable for continuous measurement. Actimeter/fitbit type measures of activity could be used for extended periods. As discussed, their accuracy varies, but as long as it's a constant error that shouldn't preclude measuring relative change (everyone would need to use the same model I suspect).

The SEID criteria propose orthostatic intolerance as an optional symptom and there's quite a collection of studies suggesting impaired autonomic function. I'm not convinced there's an easy way to measure this outside of the lab so any objective measures would again have to be periodic.

Cognitive impairment is included as a subjective symptom but should also be measurable in 'real world' challenges that involve processing speed and multi-tasking (that increases 'cognitive loading').

An ME/CFS battery?

What we could end up with is a battery of subjective and objective tests measuring function/activity and symptoms periodically and continuously. Something like:

  • SF36-PF at start, middle and end points (subjective)

  • Cognitive measures at start, middle and end points (objective)

  • Autonomic function at start, middle and end points (objective)

  • Continuous (daily?) monitoring of a short list of key symptoms including one individual choice (subjective)

  • Continuous, or perhaps over discrete 1 week periods, activity monitoring (objective).


Of course some thought would have to be given to how these are scaled and scored to produce thresholds suitable for constructing a single composite outcome measure.
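
As a rough illustration of what 'scaled and scored' could mean, here is one possible sketch: each measure is rescaled onto 0 to 1 (1 being best) and the results are combined as a weighted average. Every range, weight, measure name and patient value below is a hypothetical placeholder, not a validated threshold, and this is only one of many ways a composite could be built.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    name: str
    low: float     # raw value mapped to 0 (worst)
    high: float    # raw value mapped to 1 (best)
    weight: float  # relative importance in the composite

    def scaled(self, value: float) -> float:
        """Map a raw value onto 0..1, clipping values outside the range."""
        fraction = (value - self.low) / (self.high - self.low)
        return min(1.0, max(0.0, fraction))


# All ranges and weights are illustrative assumptions only.
MEASURES = [
    Measure("sf36_pf", low=0, high=100, weight=1.0),
    Measure("daily_steps", low=0, high=8000, weight=1.0),
    # Lower symptom scores are better, so the range is reversed.
    Measure("symptom_severity", low=10, high=0, weight=1.0),
]


def composite(values: dict) -> float:
    """Weighted average of the scaled measures, between 0 and 1."""
    total_weight = sum(m.weight for m in MEASURES)
    return sum(m.weight * m.scaled(values[m.name]) for m in MEASURES) / total_weight


# Made-up values for one patient at the start and end of a trial.
baseline = composite({"sf36_pf": 30, "daily_steps": 2000, "symptom_severity": 7})
followup = composite({"sf36_pf": 50, "daily_steps": 4000, "symptom_severity": 4})
print(f"baseline {baseline:.2f}, follow-up {followup:.2f}")
```

Equal weights are used here purely for simplicity; how much the subjective and objective components should each count is exactly the kind of decision that would need thought.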
 

user9876

Senior Member
Messages
4,556
I haven't done any research into this but I suspect that the technology is close to what would be needed to provide a reasonably robust picture of daily activities. Commercial apps are probably designed to extract stereotypic analyses like 'steps taken' but presumably the raw data could be used in more sophisticated ways if a research group looking at ME teamed up with the software designers. I suppose that camera-based data would be pretty informative although it might seem intrusive. I haven't been thinking about detailed practicalities.

It may depend on the device whether raw data is available. Given the low power and storage constraints they may have to apply their algorithm to work out the number of steps immediately and not store raw data. Another question about errors is whether the device has a consistent error for the same person (i.e. does it always overestimate for one person, but underestimate for another).
 

user9876

Senior Member
Messages
4,556
Actually the issue is with the combination. Blinding plus subjectivity is fine. Unblinding plus objectivity is fine. The function of blinding is to eliminate the bias from subjectivity so it is subjectivity without blinding that is not fine.

I am not saying that this was not already clear to you, but if it was not you would have been in eminent company it seems!

I'm not sure it is that simple. There was a placebo study that seemed to get good patient assessments but that contradicted the physical measures and I think it was blinded.

http://harvardmagazine.com/2013/01/the-placebo-phenomenon

That paper (praised by scholars as one of the most carefully controlled and definitive placebo studies ever done) described a study of 40 asthma patients given four different interventions: active treatments with real albuterol inhalers; placebo treatments with fake inhalers that delivered no medication; sham acupuncture treatments; and intervals with no treatment at all. The patients returned for 12 sequential visits, receiving each type of treatment three times—a novel approach in placebo study that created a large amount of data (480 treatments in total) and turned subjects into their own controls (if patients are compared to themselves from one treatment to the next, researchers can eliminate subjects’ individual differences as a variable). The researchers had hoped to find improved lung function with both the real and sham treatments; what they found instead was that only the real treatment yielded results—the others showed no significant improvement. Yet when Kaptchuk’s team measured patients’ own assessments of improvement, the researchers found no difference reported between the real and sham treatments: the patients’ subjective responses directly contradicted their own objective physical measures.
 

Snow Leopard

Hibernating
Messages
5,902
Location
South Australia
Actually the issue is with the combination. Blinding plus subjectivity is fine. Unblinding plus objectivity is fine. The function of blinding is to eliminate the bias from subjectivity so it is subjectivity without blinding that is not fine.

Having said all this, if the overall measure is robust, then I don't see why it wouldn't also be useful for blinded trials.
 

eafw

Senior Member
Messages
936
Location
UK
Just another quick thought on an easy "DIY" way to measure recovery for a lot of us: alcohol tolerance. If whatever weirdness is going on with that gets fixed, then something significant has happened.
 

eafw

Senior Member
Messages
936
Location
UK
I suspect that the technology is close to what would be needed to provide a reasonably robust picture of daily activities

Again it depends what sort of accuracy you are after or if long term trends would be good enough.

Someone doing 2000 steps a day at the start of the trial and then six months later is doing 4000 and reporting they feel quite a bit better, not so many "bad" days and they can now pop to the Post Office or corner shop without too much payback - that is a meaningful improvement.

Even if the measurements are 10%, or at worst 20%, off, you will still see a trend over time that is useful. Or if they merely said they felt a bit better but steps were 2000 vs 2400 and they weren't going out any more than previously, then that is not really an improvement.
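
The two examples above can be checked with a deliberately pessimistic sketch. It assumes each reading could independently be off by up to the stated fraction in either direction, and asks whether the lowest plausible 'after' value still beats the highest plausible 'before' value. The function names are made up for the illustration, and treating the error as a fraction of the reading is a simplifying assumption.

```python
def plausible_range(reading: float, max_error: float) -> tuple:
    """Lowest and highest values consistent with a reading, for a fractional error."""
    return reading * (1 - max_error), reading * (1 + max_error)


def clearly_increased(before: float, after: float, max_error: float) -> bool:
    """True if even the worst-case readings still show an increase."""
    _, before_high = plausible_range(before, max_error)
    after_low, _ = plausible_range(after, max_error)
    return after_low > before_high


# The two scenarios from the post, at the 'at worst 20%' error level.
doubling = clearly_increased(2000, 4000, max_error=0.20)
small_rise = clearly_increased(2000, 2400, max_error=0.20)
print(doubling, small_rise)
```

Under this worst-case assumption the 2000 to 4000 change stands clear of the error while 2000 to 2400 does not. If the device's error is instead a steady bias in one direction, it largely cancels when comparing the same person over time, so the real situation may be kinder than this.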

Would this be good enough? Because it is very cheap and easy to measure this sort of thing, and quite a few of us are probably doing something similar already.

Add in another (perhaps slightly more rigorous?) physiological measure like cortisol or HRV or 24 hr ECG or muscle strength or balance score or temperature regulation or .... and there are several markers all pointing in the right direction.
 

eafw

Senior Member
Messages
936
Location
UK
Another thing to consider (back to the importance of subgroups) is that for those in the early stages of the illness there is likely to be a quite different set of physiological patterns and markers than for those in whom the illness has set in very long term.

The recent cytokine study used the first three years as a cut-off; maybe that is the window for, say, rituximab to work well. Much longer than that and it's likely you'd be looking at something else.
 

MeSci

ME/CFS since 1995; activity level 6?
Messages
8,235
Location
Cornwall, UK
I won't quote your whole long post, @Marco, but it looks to me like a good summary - clearly a lot of work has gone into that!

I expect this has been mentioned before - maybe by me (who am I again?) but I think a lot of us find that we vary significantly over the course of a day, both cognitively and physically, and to some extent we are commonly better/worse at some times of day than others. So measurement times/frequencies would have to take account of that.
 

Snow Leopard

Hibernating
Messages
5,902
Location
South Australia
Marco; for a start I find this abstract completely incomprehensible. I can get no feel for what any of the measures are telling us because there are no actual data given. That to me is always a sign of phoney science. But maybe the more important point is that there is no trial here. There is no source of bias to come under that wide umbrella we call 'placebo effect'. Patients were not thinking they might have had a good treatment or a dummy.

The whole "there was no trial here" point is the key factor.

Finding that the two measures were comparable in a case-control study outside of a clinical trial and finding that they are comparable during a clinical trial (with various biases imposed) are two separate findings.
 

eafw

Senior Member
Messages
936
Location
UK
I expect this has been mentioned before - maybe by me (who am I again?)

We could do with a summary of the state of play in the thread so far, I know I've pretty much lost the plot now in terms of what's what and what's needed.

Prof Edwards, as the OP - what's your take? Have your initial questions been properly addressed, and are any follow-up questions or clarifications wanted?
 

eafw

Senior Member
Messages
936
Location
UK
Other things to measure that may reflect health and activity levels: muscle bulk.

This is being discussed a little in another thread at the moment, but I am reminded that I "accidentally" had mine measured when having an ultrasound scan on a dodgy knee. I asked the technician if the band of fibres we could see was my muscle and it wasn't; the 3cm (on screen) band was my tendon, with about 1mm of muscle barely visible above it.

I looked it up and apparently it's a reliable (and cheap and easy) way to measure it:

http://www.ncbi.nlm.nih.gov/pubmed/11834114
 

voner

Senior Member
Messages
592
Marco, nice work! Thank you.

I want to clarify the term "Exertion".

Here is the IOM wording for "Post-exertional malaise".....

....an exacerbation of some or all of an individual’s ME/CFS symptoms after physical or cognitive exertion, or orthostatic stress that leads to a reduction in functional ability.....

So the exertion can be cognitive or physical (or an orthostatic stressor). Exertion does not necessarily mean activity. Exertion could be muscular exertion without any walking. The exertion could be from the orthostatic stress of standing for too long a period or from a cognitive task. An activity monitor would not pick up these exertions that cause PEM.

To me, the question is, "would the activity monitor detect the post-exertional malaise period?"

My guess is that it usually would, but is that sufficient for a clinical trial?
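
One hypothetical way a monitor's data might be screened for such a period is to compare each day against that person's own recent baseline and flag days that fall well below it. This is purely a sketch: the 7-day window, the 60% threshold, the function name and the step counts are all invented for illustration, and a low-activity day is not proof of PEM.

```python
from statistics import median


def flag_low_days(daily_steps, window=7, threshold=0.6):
    """Indices of days below `threshold` x the median of the previous `window` days."""
    flagged = []
    for day in range(window, len(daily_steps)):
        baseline = median(daily_steps[day - window:day])
        if daily_steps[day] < threshold * baseline:
            flagged.append(day)
    return flagged


# Made-up record: a steady week, one over-active day (index 7),
# two very quiet days, then a return towards the usual level.
steps = [2000, 2100, 1900, 2050, 1950, 2000, 2100, 3500, 900, 800, 1900]

low_days = flag_low_days(steps)
print(low_days)
```

In this invented record the two quiet days after the over-active day are flagged. The triggering exertion here happens to be a physical one that the monitor registers; for a cognitive or orthostatic trigger only the dip, not the cause, would be visible.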

Finally, for the most severely ill who are bed and housebound, an activity monitor's data becomes more problematic.

I have been thinking of this subject because right now it is the beginning of gardening season where I live and I am very aware of the fact that a little digging in my garden will produce PEM, but would not be measured on an activity monitor. I sit and stare at those garden beds wondering if the effort will be worth the malaise? A type of calculation I'm sure we all are constantly making.
 