Genetic Predisposition for Immune System, Hormone, and Metabolic Dysfunction in ME/CFS: A Pilot Study. Perez et al 2019

percyval577

Link to the Study

Genetic Predisposition for Immune System, Hormone, and Metabolic Dysfunction in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study
Melanie Perez1, Rajeev Jaundoo2,3, Kelly Hilton1, Ana Del Alamo1,3, Kristina Gemayel1, Nancy G. Klimas1,3,4, Travis J. A. Craddock1,2,3,5* and Lubov Nathanson1,3*


From the Results and Discussion
While there is an overlap between the metabolic and immune modules, the hormone module remains isolated with main connections only formed via Ovarian Steroidogenesis and the Wnt signaling pathway. Finally, there is a group of loosely connected pathways involved in extracellular matrix organization.

While this organization highlights the interplay between immune, hormone and metabolic activity underlying ME/CFS, overlay of the location of CADD scores illustrates where the most deleterious effects occur (...).

Of the 11,485 SNPs that passed prefiltering according to the annotations (...), 8,593 SNPs had a frequency of more than 10% in either the reference or the ME/CFS cohort. Of these, 5,693 SNPs had a two-fold difference between the ME/CFS and reference cohorts in either direction (...).
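The two filtering steps quoted above can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `passes_filters` is hypothetical, frequencies are assumed to be fractions between 0 and 1, and treating "two-fold" as a ratio of at least 2 (and an absent SNP as an unbounded ratio) is an assumption.

```python
def passes_filters(case_freq: float, ref_freq: float,
                   min_freq: float = 0.10, min_fold: float = 2.0) -> bool:
    """Return True if a SNP survives both filters described in the quote.

    Hypothetical helper: frequencies are fractions (0.10 means 10%).
    """
    # Filter 1: frequency above 10% in either the reference or the ME/CFS cohort.
    if max(case_freq, ref_freq) <= min_freq:
        return False
    # Filter 2: at least a two-fold difference in either direction.
    low, high = sorted((case_freq, ref_freq))
    if low == 0.0:
        # Assumption: a SNP absent from one cohort counts as an unbounded fold change.
        return True
    return high / low >= min_fold
```

Because the second filter divides by the smaller of the two frequencies, a reference frequency that is wrong by orders of magnitude will push almost any SNP past the two-fold threshold.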

SNPs that prevailed in ME/CFS cohort were scored using the CADD algorithm (...). According to the CADD algorithm, C-scores above 10 indicate that these SNPs are predicted to be among the 10% most harmful, and C-scores above 20 indicate the 1% most deleterious substitutions (...). Table 1 shows 50 SNPs that are the most frequent in the ME/CFS cohort and have C-scores above 10.
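The 10% and 1% thresholds quoted above follow from the PHRED-like scaling of CADD C-scores, where the scaled score is -10 times the base-10 logarithm of a variant's rank fraction among all possible substitutions. A minimal sketch of that relationship (the function names are illustrative and not part of any CADD tooling):

```python
import math

def top_fraction(c_score: float) -> float:
    """Fraction of all possible substitutions ranked at or above this scaled C-score."""
    return 10 ** (-c_score / 10)

def c_score_for_fraction(fraction: float) -> float:
    """Inverse mapping: the scaled C-score that corresponds to a given top fraction."""
    return -10 * math.log10(fraction)

for score in (10, 20, 30):
    print(f"C-score {score}: top {top_fraction(score):.1%} of substitutions")
```

Note that the scale ranks variants against every possible substitution in the genome; it says nothing about how common a variant is in any population.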

Of the 50 most frequent deleterious SNPs found in our ME/CFS cohort compared to the reference database (...), 10 were found to have a frequency of 70% or more in the ME/CFS group. This includes CYP2D6, PRRT4, and PRSS56 at a frequency over 90%, C14orf37 and ANKDD1B at over 80%, and GPBAR1, LHB, ADAMTS19, VARS2, and CPLX2 at over 70%.
 

pattismith

Results and Discussion

Functional analysis of SNPs identified three main clusters of pathways as sharing at least 30% SNP related genes (Figure 1).
The first is dominated in size via the pathway Cytokine Signaling in Immune System and includes other immune-related pathways such as interferon signaling, autoimmune responses, and T-cell receptor signaling. This cluster highlights a module of immune-related SNPs.


The second cluster is dominated in size via the Nuclear-Receptors Meta-Pathway and includes hormone related pathways such as steroid hormone, estrogen, and androgen biosynthesis, glucuronidation, and the pregnane x receptor pathway. This cluster highlights modules of hormone-related SNPs.


The final cluster is dominated in size by Pathways in Cancer; however, closer inspection shows many metabolic processes such as enzyme reactions (protein kinase A, calcium and calmodulin signaling) and G protein signaling, which regulate metabolic enzymes and are all involved in the regulation of glycogen, sugar and lipid metabolism. This cluster highlights a module of metabolism-related SNPs.


While there is an overlap between the metabolic and immune modules, the hormone module remains isolated with main connections only formed via Ovarian Steroidogenesis and the Wnt signaling pathway. Finally, there is a group of loosely connected pathways involved in extracellular matrix organization.


While this organization highlights the interplay between immune, hormone and metabolic activity underlying ME/CFS, overlay of the location of CADD scores illustrates where the most deleterious effects occur (Figure 1; lower panel).
 

pattismith

"Of the 50 most frequent deleterious SNPs found in our ME/CFS cohort compared to the reference database (Table 1), 10 were found to have a frequency of 70% or more in the ME/CFS group. This includes CYP2D6, PRRT4, and PRSS56 at a frequency over 90%, C14orf37 and ANKDD1B at over 80%, and GPBAR1, LHB, ADAMTS19, VARS2, and CPLX2 at over 70%.


CYP2D6 (Cytochrome P450 2D6) is primarily expressed in the liver, but also highly expressed in areas of the central nervous system, including the substantia nigra, and is one of the most important enzymes involved in the metabolism of xenobiotics in the body. A significantly higher frequency of polymorphisms in CYP2D6 was found in ME/CFS study subjects with Fibromyalgia than in controls and could differentiate these study subjects from study subjects with multiple chemical sensitivity (27). CYP2D6 was found in the xenobiotics metabolism, androgen and estrogen biosynthesis and metabolism, tyrosine metabolism, codeine and morphine metabolism, oxidation by cytochrome P450, metapathway biotransformation phase I and II, and cytochrome P450 - arranged by substrate type pathways, all of which belong to the hormone related cluster.


PRSS56 (putative serine protease 56) is a serine protease that has been implicated in human eye development (28) and in the regulation of cerebellum activity of mice in exercise (29). It was not found to be a member of any of the annotated pathways.


GPBAR1 (G Protein-Coupled Bile Acid Receptor) functions as a cell surface receptor for bile acids and participates in the production of intracellular cAMP and activation of a MAP kinase signaling pathway. This receptor plays a big role in the suppression of macrophage functions and regulation of energy homeostasis by bile acids (30). Finding of the deleterious SNP in GPBAR1 (Table 1) is in agreement with the results of the recent study that showed disturbances in bile acid metabolism in ME/CFS study subjects (31). GPBAR1 was not among any of the pathways annotated.


LHB (luteinizing hormone beta polypeptide) is expressed in the pituitary gland and is essential for spermatogenesis and ovulation by stimulating the testes and ovaries to synthesize steroids (32, 33). LHB was found among the GnRH signaling pathway and ovarian steroidogenesis pathway.


ADAMTS19 is a member of the large ADAMTS (a disintegrin-like and metalloprotease with thrombospondin type 1 motif) family of metalloproteases (metal binding enzymes). ADAM proteins are responsible for the proteolytic cleavage of many transmembrane proteins and the release of their extracellular domain. ADAMTS19 is considered a possible candidate for premature ovarian failure (34). Only the O-linked glycosylation pathway was found to contain ADAMTS19.


VARS2 (valyl-tRNA synthetase 2, mitochondrial) is important for the mitochondrial protein synthesis. Mutations in this gene are associated with cardiomyopathy (35), microcephaly and epilepsy (36), deficiency of the mitochondrial respiratory chain complex I and oxidative phosphorylation deficiency (37). VARS2 was not found among any of the annotated pathways.


CPLX2 gene encodes the complexin 2 protein that participates in neurotransmitter release by directly interacting with the neuronal SNARE complex (38). CPLX2 is known to be overexpressed in aging and downregulated by sleep deprivation (39), and this shows a connection of CPLX2 expression to fatigue. CPLX2 was also not found among any of the annotated pathways.


The remaining genes PRRT4, C14orf37, and ANKDD1B are obscure without much literature to support their function and not found among any of the annotated pathways. It was determined that PRRT4 (proline-rich transmembrane protein 4) showed biased expression in adult ovary, lung, adrenals, CNS and whole brain, while C14orf37 showed bias in brain, kidney, and ovary (ncbi.nlm.nih.gov). Little information was found for ANKDD1B (ankyrin repeat and death domain containing 1B).


Although SNPs in MYBPC3 and HLA genes have lower frequencies in ME/CFS cohort (0.19 for MYBPC3 and 0.13-0.44 for various HLA isoforms, respectively), these SNPs could be used for subgrouping of ME/CFS study subjects in larger studies because of their possible association with ME/CFS and fatigue. Multiple deleterious SNPs in HLA genes are in agreement with known impairment of the immune system in ME/CFS (40). Increased frequency of HLA-DQA1 alleles and decreased expression of HLA-DRB1 was found to be associated with ME/CFS (41). MYBPC3 (myosin binding protein C, cardiac) dysfunction is also associated with hypertrophic cardiomyopathy and corresponding fatigue (42)."
 

kday

I am not going to be critical right now, but that 10% frequency made them miss CASP8. :(

In my small sample pool of ME/CFS genomes, it's highly prevalent. Prevalent to the point that I wonder if it could somehow be a cause or causative factor of the NK Cell Function Deficiency. Somebody needs to study it.
 

kday

The very first SNP in a table on that study (GPBAR1 rs199986029) has a global allele frequency of 0.000004081 in gnomAD. Literally only 1 out of 245,032 didn't match reference. 1 heterozygote worldwide in this database.

https://gnomad.broadinstitute.org/variant/2-219128356-G-C

If you combine the other frequency databases, up to 4 in the world have it. But the same people can be in different databases. So it's probably 2 people total.

This was literally one of the first things I examined in the study. I kept looking to see if I typed the rsID wrong.
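The arithmetic behind that gnomAD figure is simply the alternate allele count divided by the number of alleles observed. A small sketch, with hypothetical function names; the carrier estimate assumes Hardy-Weinberg proportions, and the cohort size in the example call is made up for illustration only:

```python
def allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """Allele frequency = alternate allele count / total alleles observed."""
    return alt_allele_count / total_alleles

def expected_carriers(allele_freq: float, n_people: int) -> float:
    """Expected number of people carrying at least one copy of the allele.

    Assumption: Hardy-Weinberg proportions, so P(carrier) = 1 - (1 - q)^2.
    """
    return n_people * (1 - (1 - allele_freq) ** 2)

q = allele_frequency(1, 245_032)
print(f"allele frequency: {q:.9f}")
# Hypothetical cohort of 100 people, purely for illustration.
print(f"expected carriers in 100 people: {expected_carriers(q, 100):.6f}")
```

At a frequency this low, the expected number of carriers in any small cohort is far below one person.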
 

percyval577

I think if the finding is important, then it is in the sense that there wouldn't be any particular single SNPs which would cause ME/CFS.

Instead, three "modules" are vulnerable to getting out of balance, and it's a pool of possible SNPs.
The effect can even fall on another, now easily affected module, which would account for the main symptoms (PEM, I would claim). Or the modules can even be thrown out of balance by a trigger that could act first on a healthy part (module) of the body, maybe even on a single pathway that would be vulnerable to new inventions, chemicals, or new properties of pathogens, say only some million years old.

The system is even more complicated: they say there are three vulnerable modules in a row, two of them often directly connected, plus two other sufficiently dense spots, one being the extracellular matrix.

(A difficulty might seem to be that the metabolic cluster is of a different category than the extracellular matrix, hormones, and immune system. The metabolic module is really a part of the latter three, as it is more or less part of every subsystem of the body, I would think.)

I personally cannot judge any of the details (@kday).
 

kday

Functional analysis identified the majority of SNPs as related to immune system, hormone, metabolic, and extracellular matrix organization.
If you query a database and order by CADD score and run this through another database, you get something that looks like a creature with an immune system, nervous system, collagen, brain, bone structure, etc. You get something that closely resembles a mammal or primate. I only know because I've done it before.

I've done it with this database specifically: http://ctdbase.org/tools/analyzer.go
 

kday

Back to the bright side of things. I think it's cool that we're using these new tools to look at the genome. I am a fan of CADD scores too and think they can be useful if utilized well.

These are some really new tools and it's clear that we need to learn how to utilize them. And most importantly, you need to know how to filter out the bad data. This of course can be difficult and time-consuming, especially when you are working with consumer genomic files that lack simple things like reference and alt alleles, use proprietary identifiers, and report on strands that may or may not be consistent with dbSNP or Whole Genome Sequencing data.
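As an illustration of the strand problem, here is a minimal sketch of one common sanity check: comparing a genotype call against known forward-strand reference and alt alleles. The function name and return labels are hypothetical, and the check assumes a simple biallelic single-nucleotide variant.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def classify_strand(genotype: str, ref: str, alt: str) -> str:
    """Classify a genotype call such as 'AG' against forward-strand ref/alt alleles.

    Returns 'forward', 'reverse', 'ambiguous' or 'mismatch'.
    """
    forward = {ref, alt}
    reverse = {COMPLEMENT[ref], COMPLEMENT[alt]}
    called = set(genotype)
    if forward == reverse:
        # Palindromic SNP (A/T or C/G): the alleles alone cannot reveal the strand.
        return "ambiguous"
    if called <= forward:
        return "forward"
    if called <= reverse:
        return "reverse"
    # No-calls or alleles that fit neither strand end up here.
    return "mismatch"

print(classify_strand("CT", ref="G", alt="A"))
```

Palindromic SNPs are the dangerous case: a call can look perfectly valid on either strand, so the frequency of the wrong allele may be looked up without any visible error.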
 

percyval577

If you query a database and order by CADD score and run this through another database, you get something that looks like a creature with an immune system, collagen, and a bone structure.
(What I am about to say might be stupid.) So after all, the different interpretation of the data would still lead to at least three modules, and two of them would even be the same. So this wouldn't be too bad for now, considering that we only started to explore the human genome some twenty years ago. (But you have given a fantasy example, I strongly guess.)

Or are you saying that even the data itself may not be reliable?


I remember there was a technique to recognize genes when it is difficult to look up every single base. A prerequisite was knowing the probability with which sequences occur, so empirical knowledge was already needed. It ran on hidden Markov chains, don't ask me for details.

I am not interested in hidden Markov chains and genetics as such, but in the idea that out of uncertainty you can nevertheless form a reliable idea of what's going on. So the concrete interpretation of the data might even be fairly unimportant. The other genetics study from 2016 (Schlauch et al) also found, statistically, a distribution of SNPs in PwME, though most of those SNPs were in non-coding regions (whatever this would cause).

True, the studies are at odds with each other in some details, if I remember rightly, and this 2019 study didn't look at non-coding regions, but the overall result seems to be the same: there are widespread possible SNPs. (The sample size might not yet be big enough, n=80 or n=383.)

However, it would give an idea, if other diseases didn't show up like this. Huntington's, e.g., doesn't; there it is well known that one sequence is simply too long.
 

kday

There certainly can be a use case scenario for doing such things with CADD scores. I'm not sure exactly what they all would be. But if you are trying to just see the overall picture, I don't think such techniques can provide that.

You can get a general picture of what body processes are influenced by a certain gene, if that makes any sense. But I don't understand how a broad look at many different genes in relation to body processes will provide anything useful. You could potentially discover a causative SNP or Mendelian disease like this, I suppose. But the methodology needs to be well thought out. And you need to be diligent about filtering signal from noise, which this study did not do at all.

The general concept of the study isn't bad, there just needs to be a lot more care when evaluating the data. In regards to the integrity of their data set, they have no good choice but to start over.
 

percyval577

But I don't understand how a broad look at many different genes in relation to body processes will provide anything useful.
I think that we will have to deal with this kind of result. There won't be any single gene or few genes responsible for our illness. A whole balance is off, but the failure might even consist of normal enough behaviour by cells (and symptoms are not specific). In that respect Dr Phair will be right that it's a stability, but one of ill behaviour (which should be codified as well, right?).

Would it be bad news? I think it would be expected news. What does it mean for influences?

In my opinion there is only one chance:
  • What can the trigger(s) have done? Here we would be searching for normal pathways, which wouldn't even need to be affected by an SNP.
  • How, and especially where, is PEM possible? It is the only "symptom" that really sticks out in our illness. I have linked above to how I think PEM must be thought of.

For myself I have found the answer:
  • The iNOS has gotten out of tune, by one element and one amino acid. This asks for a diminished intake.
  • I need to change the concrete places of some metals (in the brain; thalamus and basal ganglia, to give a more precise guess). Actions will take place where the metals are, and the metals will be distributed where actions have recently been.
  • I help those structures with things which should (still) be too low or needed in elevated amounts (only two amino acids and one metal).
  • I supply those structures with some vitamins (which mustn't contradict the first point).
  • I help the cells to adjust epigenetically.

Given that (so far) medications are often worse than supplements at helping us for some, mostly short, time, I assume that such thinking will be the best chance to answer a result like the one the study claims to show.
 

nandixon

I moved this post from another thread:

Nan.. this seems alarming.. have you or others been in touch w klimas to point out the data issues? Do you know for certain about use of undergrads?
I was so disappointed in Klimas and how badly the interpretation of the data was messed up for this study that I haven't had the desire to contact her.

The data errors are so atrocious that they're indicating that Klimas never even bothered to look at any of the data. (I'm making an assumption here that someone at her level couldn't possibly be so ignorant in their understanding.)

I'd previously been a fan of hers, in particular with respect to her homeostasis theory for Gulf War Illness and ME/CFS, but now I'm not sure that I can trust anything coming out of her group.

Honestly, it makes me think that she doesn't really care about actually finding the underlying cause of ME/CFS.

The sad thing is that a person doesn't even need to have much (or any) understanding of genetics at all to immediately recognize the errors in the data - for example, the authors thinking that a mutation (the first one in Table 1 of the study) that only has a frequency of 0.0006% in the general population could even conceivably be present in 77% of patients!?!?! There are many, many similar errors like that. If the results of this study were true it would have been known exactly what the underlying causes were in ME/CFS decades ago.

I'm certain about the use of undergrads because that's indicated in the Acknowledgements section at the end of the paper:

We would like to thank students of Halmos College of Natural Sciences and Oceanography of Nova Southeastern University Valentina Ramirez, Maria Cash and Pallavi Samudrala for their help with the analysis of data.

It's just a shame that what could have potentially been a very helpful study for a group of people who have suffered for a very long time was done with so little care.
 
Messages
770
Likes
2,009
There is also the issue of the mix of 23andMe v4 and v5 data used in the paper. My personal v5 23andMe file has only 190 SNPs out of the top 525 CADD SNPs listed in supplemental table 2. That seems like another problem to me, on top of the ones listed on this thread and elsewhere. Maybe the 335 missing SNPs were taken out of the v4 23andMe chip to make v5 for a reason!
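A coverage check like this one is easy to reproduce. The sketch below assumes the usual 23andMe raw-data layout (tab-separated rsid, chromosome, position, genotype, with `#` comment lines); the helper names, the sample genotypes and the `i5000001` internal identifier are made up for illustration.

```python
def read_rsids(raw_text: str) -> set:
    """Collect rsIDs from 23andMe-style raw data text, skipping comments and internal IDs."""
    rsids = set()
    for line in raw_text.splitlines():
        if not line or line.startswith("#"):
            continue
        identifier = line.split("\t")[0]
        if identifier.startswith("rs"):
            rsids.add(identifier)
    return rsids

def coverage(chip_rsids: set, wanted: list) -> tuple:
    """Return (present, missing) counts for a list of rsIDs of interest."""
    present = sum(1 for rsid in wanted if rsid in chip_rsids)
    return present, len(wanted) - present

# Hypothetical two-line excerpt; the genotypes are invented.
SAMPLE_RAW = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs3810141\t19\t45316804\tCC\n"
    "i5000001\t1\t1000\tAG\n"
)

wanted = ["rs3810141", "rs10784618", "rs199986029"]
print(coverage(read_rsids(SAMPLE_RAW), wanted))
```

With the numbers from the post, 190 present and 335 missing out of 525 works out to roughly 36% coverage on the v5 chip.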
 
Messages
770
Likes
2,009
For people still skeptical - here are the two datapoints I picked on first look at the paper last week.

BCAM rs3810141 translates to hg19 Chr 19 position 45316804. Kaviar notes two different frequencies depending on which mutation is referenced
C Reference
T (7.8074%)
A (0.0006%)
Using tool http://db.systemsbiology.net/kaviar/cgi-pub/Kaviar.pl

The paper mistakenly references the 0.0006% figure. The frequency in the patient population is 10%, giving a ratio of 1.28x against the 7.8074% figure, so this SNP does not meet the study's criterion of a 2x difference.

MUC19 rs10784618 is shown as having a Kaviar frequency of 15.9%, which matches the Kaviar database. The patient population is at 77%, a 5x difference. 1000 Genomes has this listed as 48.2% in the general population, which is a 1.6x difference. This does not meet the study criteria of a 2x difference.
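The ratios in these two datapoints are easy to re-derive. A minimal sketch using only the frequencies quoted in this post (the function name is illustrative, and "2x" is taken to mean a ratio of at least 2):

```python
def fold_difference(patient_freq: float, reference_freq: float) -> float:
    """Ratio of the patient-cohort frequency to the reference frequency."""
    return patient_freq / reference_freq

# (patient frequency, reference frequency), both as fractions
checks = {
    "BCAM rs3810141 vs Kaviar T allele": (0.10, 0.078074),
    "MUC19 rs10784618 vs Kaviar": (0.77, 0.159),
    "MUC19 rs10784618 vs 1000 Genomes": (0.77, 0.482),
}

for label, (patient, reference) in checks.items():
    ratio = fold_difference(patient, reference)
    print(f"{label}: {ratio:.2f}x, meets 2x criterion: {ratio >= 2}")
# BCAM rs3810141 vs Kaviar T allele: 1.28x, meets 2x criterion: False
# MUC19 rs10784618 vs Kaviar: 4.84x, meets 2x criterion: True
# MUC19 rs10784618 vs 1000 Genomes: 1.60x, meets 2x criterion: False
```

The same SNP passes or fails the threshold depending purely on which reference frequency is plugged in.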

Here are what the other databases show (100%=1)
https://www.ncbi.nlm.nih.gov/snp/rs10784618
C=0.41899 (60232/143754, GnomAD)
A=0.49402 (62033/125568, TOPMED)
C=0.4825 (14872/30824, GnomAD)
C=0.3900 (6809/17458, ExAC)
C=0.482 (2412/5008, 1000G)
C=0.398 (1781/4480, Estonian)
C=0.449 (1731/3854, ALSPAC)
C=0.455 (1687/3708, TWINSUK)
This can be found again, and again, and again.

Unfortunately this study was most likely the topic of a talk at today's researcher-only BRMEC9 conference, presented to over 100 researchers as real data.
 
Messages
6,084
Likes
14,352
how badly the interpretation of the data was messed up for this study
I hope this is not true. I fear it is possible.

That said: I had data mutilated and mis-analyzed by a PhD who asserted: I am very competent, I will analyze this for you. They never asked a single fundamental question, just assumed they understood what needed to be analyzed.

I was quite mortified to have to throw out and not use the analysis I was provided.

So: no, I would not compare size of seedlings with juveniles and adults. And then use growth data to infer site preferences. If a single brain neuron had been deployed: one would never do that.

Just because somebody has training and degrees does not mean they are thinking.

And a big issue is: you don't know what you don't know. That t-shirt would make a lot of money, and many of us should wear the shirt often!

It's also a basic science problem I experienced repeatedly with "the system". It's called: we cannot afford to hire an expert to do this study. We don't have enough funding to hire the competent individual. So we will hire that great, recently graduated from college person, who has no experience whatsoever. We got mediocre outcome after mediocre outcome. But heck, they learned a lot. .....

So thank you cheap undergrads: maybe not worthy of the thanking. (and I know nothing personally about these data or analysis, so this is just my opinion, GENERALLY).