
CFS: Prediction of complex human diseases from pathway-focused candidate markers...

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Prediction of complex human diseases from pathway-focused candidate markers by joint estimation of marker effects: case of chronic fatigue syndrome
Bhattacharjee M, Rajeevan MS, Sillanpää MJ
11 June 2015
Hum Genomics 9:8.
doi: 10.1186/s40246-015-0030-6.
http://www.ncbi.nlm.nih.gov/pubmed/26063326
http://www.humgenomics.com/content/9/1/8/abstract

Edited to add: Provisional PDF is available (not final version):
http://www.humgenomics.com/content/pdf/s40246-015-0030-6.pdf

Abstract
BACKGROUND:
The current practice of using only a few strongly associated genetic markers in regression models results in generally low power in prediction or accounting for heritability of complex human traits.

PURPOSE:
We illustrate here a Bayesian joint estimation of single nucleotide polymorphism (SNP) effects principle to improve prediction of phenotype status from pathway-focused sets of SNPs. Chronic fatigue syndrome (CFS), a complex disease of unknown etiology with no laboratory methods for diagnosis, was chosen to demonstrate the power of this Bayesian method. For CFS, such a genetic predictive model in combination with clinical evidence might lead to an earlier diagnosis than one based solely on clinical findings.

METHODS:
One of our goals is to model disease status using Bayesian statistics which perform variable selection and parameter estimation simultaneously and which can induce the sparseness and smoothness of the SNP effects. Smoothness of the SNP effects is obtained by explicit modeling of the covariance structure of the SNP effects.

RESULTS:
The Bayesian model achieved perfect goodness of fit when tested within the sampled data. Tenfold cross-validation resulted in 80 % accuracy, one of the best so far for CFS in comparison to previous prediction models. Model reduction aspects were investigated in a computationally feasible manner. Additionally, genetic variation estimates provided by the model identified specific genetic markers for their biological role in the disease pathophysiology.

CONCLUSIONS:
This proof-of-principle study provides a powerful approach combining Bayesian methods, SNPs representing multiple pathways and rigorous case ascertainment for accurate genetic risk prediction modeling of complex diseases like CFS and other chronic diseases.
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
The methodology is way over my head, and it's nice to see researchers taking a look at ME/CFS, but as @Tom Kindlon pointed out on Twitter, this study uses the empiric criteria (from the 2007? Pharmacogenomics series). So the findings aren't likely to mean much, sadly.

Tom on twitter said:
Study likely uses a CDC study on (overly broad i.e. 2.4% of popn.) "empiric" #CFS criteria (Reeves et al., 2005)
 

anciendaze

Senior Member
Messages
1,841
If it uses the Reeves "empiric" criteria, we have a different problem: most physiological abnormalities are exclusionary conditions.
 

Denise

Senior Member
Messages
1,095
Tom on twitter said:
Study likely uses a CDC study on (overly broad i.e. 2.4% of popn.) "empiric" #CFS criteria (Reeves et al., 2005)

"Empiric"? Oh good grief! Do other fields allow continued work using discredited definitions/cohorts?
 
Messages
15,786
The Bayesian model achieved perfect goodness of fit when tested within the sampled data. Tenfold cross-validation resulted in 80 % accuracy, one of the best so far for CFS in comparison to previous prediction models.
This is a pretty important bit. If I'm understanding it correctly, they used one bunch of patients to look for associations of SNPs with CFS (unknown definition), and then used a 2nd group of CFS patients to see if their results from the first group held up in the 2nd group. I think this can be a good way of making sure that the first set of results weren't due to a false-positive.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
This is a pretty important bit. If I'm understanding it correctly, they used one bunch of patients to look for associations of SNPs with CFS (unknown definition), and then used a 2nd group of CFS patients to see if their results from the first group held up in the 2nd group. I think this can be a good way of making sure that the first set of results weren't due to a false-positive.
If they've actually done that, and it's replicable, then it could be very interesting. I'm not sure why I'm so suspicious that they haven't achieved what the abstract suggests.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Simon said:
The methodology is way over my head, and it's nice to see researchers taking a look at ME/CFS, but as @Tom Kindlon pointed out on Twitter, this study uses the empiric criteria (from the 2007? Pharmacogenomics series). So the findings aren't likely to mean much, sadly.
I suppose it doesn't matter how broad the selection criteria are, if they are actively trying to determine genetic markers for a subset, or subsets. But there's no suggestion from the abstract that they are looking for subsets. BTW, are we sure they've used the empirical definition? It doesn't say so in the abstract, does it?
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Interesting that one of the coauthors (Rajeevan MS) is from the CDC:
"Division of High-Consequence Pathogens & Pathology, Centers for Disease Control and Prevention, Atlanta."
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Entry criteria. I think the following text might be saying that participants needed to fulfil Fukuda, and some additional criteria. But I'm not certain that's what they mean:
Bhattacharjee et al. said:
Briefly, 227 subjects were recruited from Wichita, KS, USA, as part of a 2-day in-hospital evaluation of unexplained fatigue. These subjects were identified from a surveillance cohort of 7162 fatigued and non-fatigued subjects who were originally screened from 56,146 adult residents, 18 to 69 years of age. During the 2-day hospital stay, symptoms and exclusionary medical and psychiatric conditions were reevaluated for all 227 subjects. Following the 2-day hospital study, all subjects were classified based on all aspects specified in 1994 CFS case definition [26] and Medical Outcomes Short-Form, Multidimensional Fatigue Inventory and Symptom Inventory cutoff scores to include measures on the functional impairment, fatigue, and accompanying symptom complex that characterize CFS. Following this classification, 124 subjects were excluded because of medical or psychiatric exclusionary conditions or insufficient criteria to classify as CFS. Of the 103 remaining subjects, 101 subjects with genotype data were classified as CFS (43 subjects) and non-fatigued (NF; 58 subjects) healthy controls.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
OK, I take it all back. It seems like it could be a very interesting study after all. Except that I understand almost none of it, so my insight into its potential is very limited.

This study included 43 CFS patients, and hopefully they'll be able to follow it up with a larger study, as I suspect that the usefulness of the results and methodology is potentially very limited with such a small cohort.
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
Bhattacharjee et al. said:
SNP selection, genotyping, and annotation

Because of the reported associations of CFS with perturbations in hypothalamic-pituitary-adrenal (HPA) axis and immune functions, we selected a total of 39 candidate genes implicated with the central nervous system (CNS) (30 genes) or immune and inflammation functions (nine genes) to determine the accuracy of CFS prediction based on combinations of SNPs (Additional file 1: Table S1). There were a total of 167 SNPs in all candidate genes (137 SNPs in genes implicated with CNS and 30 SNPs in genes implicated with immune and inflammation). There were a total of 23 SNPs in X-chromosomes, and these were in two genes of serotonergic neurotransmission (HTR2C and MAOA).
 

anciendaze

Senior Member
Messages
1,841
We still have a problem with the question of what they were studying. Remember that both the Oxford Criteria and the CCC are said to implement Fukuda.

I have to warn people that Bayesian techniques use a mixture of inflexible logic and statistical inference. It is extremely important to get the logic correct. If you don't, you can force bizarre conclusions on meaningless data. This does not detract from their usefulness in cases like codebreaking, where the logic is unambiguous, or in biochemical cascades, where you are very certain the chemicals in question pass through a known sequence of reactions. What is important to realize is that those facts are not under test; they must have independent support. These methods are power tools, and must be used with appropriate caution.

The interesting thing to me is that the SNPs they list are almost all in introns. The one exception is a change to a codon which is a synonym for the same amino acid. The result is that conventional genomics would say that these would have no effect on the structure of the proteins produced from these genes. If there is anything going on, it must be epigenetic.
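To make the "synonym" point concrete, here is a minimal sketch of a synonymous codon change. The four-codon table is a small subset of the standard genetic code (GAA and GAG both encode glutamate; GAT and GAC both encode aspartate); the function name and the example codons are illustrative assumptions, not codons taken from the paper.

```python
# Subset of the standard genetic code (DNA coding-strand codons).
# Only four codons are listed, purely for illustration.
CODON_TO_AMINO_ACID = {
    "GAA": "Glu",
    "GAG": "Glu",
    "GAT": "Asp",
    "GAC": "Asp",
}

def is_synonymous(reference_codon, variant_codon):
    """Return True if both codons encode the same amino acid."""
    return (
        CODON_TO_AMINO_ACID[reference_codon]
        == CODON_TO_AMINO_ACID[variant_codon]
    )

# A -> G in the third position: same amino acid, so the protein is unchanged.
print(is_synonymous("GAA", "GAG"))  # True
# A -> T in the third position: Glu becomes Asp, so the protein changes.
print(is_synonymous("GAA", "GAT"))  # False
```

A synonymous change like the first one leaves the protein sequence untouched, which is why any effect would have to come from somewhere other than protein structure.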

That could be very interesting, or it could be the result of "amplifying noise".
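As a rough illustration of the "amplifying noise" worry, the sketch below generates purely random genotypes for 101 subjects and 167 SNPs (the numbers quoted in this thread), assigns 43 "case" and 58 "control" labels at random, and compares in-sample accuracy with tenfold cross-validated accuracy. It uses a simple nearest-neighbour classifier as a stand-in, not the paper's Bayesian model, and all function names and the random data are assumptions made for the example.

```python
import random

def hamming(a, b):
    """Number of SNP positions at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def predict_1nn(train_x, train_y, query):
    """Label of the training subject whose genotypes are closest to the query."""
    best = min(range(len(train_x)), key=lambda i: hamming(train_x[i], query))
    return train_y[best]

def in_sample_accuracy(x, y):
    """Fit and test on the same subjects (no held-out data)."""
    hits = sum(predict_1nn(x, y, q) == label for q, label in zip(x, y))
    return hits / len(y)

def tenfold_accuracy(x, y, k=10, seed=0):
    """Each subject is predicted by a model that never saw that subject."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    hits = 0
    for fold in (order[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        hits += sum(predict_1nn(train_x, train_y, x[i]) == y[i] for i in fold)
    return hits / len(x)

# Random genotypes (0, 1 or 2 copies of the minor allele) and random labels:
# by construction there is nothing real to find in this data.
rng = random.Random(42)
genotypes = [[rng.randint(0, 2) for _ in range(167)] for _ in range(101)]
labels = [1] * 43 + [0] * 58
rng.shuffle(labels)

in_sample = in_sample_accuracy(genotypes, labels)
cross_validated = tenfold_accuracy(genotypes, labels)
print(f"in-sample accuracy:       {in_sample:.2f}")
print(f"cross-validated accuracy: {cross_validated:.2f}")
```

On noise like this, the in-sample fit is perfect while the cross-validated accuracy stays near chance. That says nothing about whether the paper's result is noise; it only shows why the cross-validated figure, rather than the perfect in-sample fit, is the number to look at.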
 

Simon

Senior Member
Messages
3,789
Location
Monmouth, UK
Bob said:
Entry criteria. I think the following text might be saying that participants needed to fulfil Fukuda, and some additional criteria. But I'm not certain that's what they mean:

They cite Fukuda, but the Empiric paper too: this hospital-tested sample is the same one that starred in the 'Empiric' paper with its wacko implementation of Fukuda.
Bhattacharjee et al. said:
Subject recruitment, clinical evaluation, laboratory tests, and their classification were described previously [25]...

25. Reeves WC, Wagner D, Nisenbaum R, Jones JF, Gurbaxani B, Solomon L, et al. Chronic fatigue syndrome – a clinically empirical approach to its definition and study. BMC Med. 2005;3:19.

Bob said:
I suppose it doesn't matter how broad the selection criteria are, if they are actively trying to determine genetic markers for a subset, or subsets. But there's no suggestion from the abstract that they are looking for subsets.
And with 43 patients/58 controls in their model, they have no chance of finding subsets, even if they existed.

This is a pretty important bit. If I'm understanding it correctly, they used one bunch of patients to look for associations of SNPs with CFS (unknown definition), and then used a 2nd group of CFS patients to see if their results from the first group held up in the 2nd group. I think this can be a good way of making sure that the first set of results weren't due to a false-positive.
That's not quite cross-validation as I understand it. Ideally you take your full sample and split into a training set, for developing a model, then test on the remaining validation set of patients. Cross-validation does this internally, repeatedly splitting the whole sample into random 'training' and 'validation' sets. In this case they split the whole sample randomly into two groups ten times, and ran the model ten times (on average each patient will appear 5 times in a training set and 5 times in a validation set). While this is better than no validation, it turns out to be a less reliable way of validating the model than doing it properly, where patients are assigned EITHER to training or validation sets. Also, because running the model ten times required a lot of computational power, they had to modify the model to make things easier. Not so promising.

But maybe the methodology used here could be very powerful if used with a larger dataset and more reliable case definition.
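For anyone who wants to see the mechanics, here is a minimal sketch of the textbook k-fold scheme, assuming 101 subjects as in this study. The thread doesn't quote the paper's exact procedure, so this may differ from what the authors actually ran; the function name and seed are illustrative assumptions.

```python
import random
from collections import Counter

def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (training, validation) index lists for standard k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled subjects into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for fold in folds:
        held_out = set(fold)
        training = [i for i in indices if i not in held_out]
        yield training, fold

# Count how often each subject lands in a training or validation set.
training_counts = Counter()
validation_counts = Counter()
for training, validation in k_fold_splits(101, k=10):
    training_counts.update(training)
    validation_counts.update(validation)

print(set(validation_counts.values()))  # {1}
print(set(training_counts.values()))    # {9}
```

In this textbook version each subject is held out exactly once and used for training in the other nine runs, so no subject is ever used to test a model that was fitted on that same subject. What it can't provide is a genuinely independent cohort.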
 

Bob

Senior Member
Messages
16,455
Location
England (south coast)
The genetic aspect of ME/CFS is something I've never taken a keen interest in, probably because the science is in its infancy, and ME/CFS research has not been consistent or replicable, as far as I'm aware. I used to be fascinated by genetics, in general, whenever reading about it in the past, but I'd never come across the term 'SNP' until a few years ago, in relation to Jonathan Kerr's work. At the time, I thought I'd read up about the term 'SNP' to see what it means, and I thought it was too complex to get to grips with, so I abandoned it, and have always felt a bit stupid about genetic research ever since.

But I've just had another look at 'SNP', on Wikipedia, and it's the easiest possible concept! Why did I think it was so complicated?! For those, like me, who have avoided reading about it, thinking it was too complicated, 'SNP' (pronounced 'snip') simply refers to a variation in a DNA sequence in which there is a difference in a single base pair (or nucleotide, i.e. A, T, C or G) within the DNA sequence. So it just refers to a single difference in the code of a DNA sequence. (Unless I've misunderstood.) The wiki article has a good basic overview in the opening paragraph.
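To show how simple the idea is, here is a minimal sketch that compares two short DNA sequences and reports the positions where a single base differs. Both sequences are made up for illustration, and the function name is an assumption. Strictly speaking, a difference only counts as a SNP if it is reasonably common across a population, which a two-sequence comparison can't show.

```python
def find_single_base_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(
            zip(reference, sample), start=1
        )
        if ref_base != sample_base
    ]

# Hypothetical sequences that differ only at the fifth base (C versus T).
reference = "ATGCCGTA"
sample = "ATGCTGTA"
print(find_single_base_differences(reference, sample))  # [(5, 'C', 'T')]
```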
 