The Power and Pitfalls of Omics: George Davey Smith’s storming talk at ME/CFS conference

Simon submitted a new blog post:

Read about the talk that stole the show at a recent ME/CFS conference in Simon McGrath's two-part blog.

Here's Part 1:



George Davey Smith

Last November, science star Professor George Davey Smith gave a talk at the UK CFS/ME Research Collaborative (CMRC) Annual Science Conference that focused on bigger, better, smarter approaches to research.

Since then, Davey Smith has said he’s keen to play a role in the largest set of studies ever proposed for ME/CFS: Professor Stephen Holgate’s Grand Challenge, which is now moving forward.

The plans are for a 'big data' study using a huge cohort that could be 10,000 patients strong. The total budget for this project is likely to be well over £5 million.

While not all of the points he made in his November talk are currently applicable to ME/CFS research, Davey Smith outlined the power and pitfalls of the types of genomics and other big-data approaches that the Grand Challenge is now planning to apply to ME/CFS research.

He showed how, surprisingly, large-scale genetic data can also be used to study non-genetic potential causes of illness, such as the link between vitamins and heart disease. Throughout, he emphasised how ingenious and rigorous approaches can tease out the true causes of disease from misleading, empty clues. His remarks also showed the mind of a top-notch scientist at work, one who has been playing in the biggest of scientific sandboxes.

Davey Smith doesn't have a high public profile, but he's published over one thousand studies, and has a huge reputation among fellow scientists. Few researchers are as widely cited by their peers: he has a citation h-index of 150, which ranks him alongside the top life scientists in the world. Much of his work has focused on better ways of doing science, and he’s also played a major role in many large population studies.

Davey Smith's pioneering approaches and studies have done a great deal to sort out which health advice holds up (sadly, even low-level alcohol drinking is harmful) and which doesn't (the claim that certain vitamins can reduce the risk of heart problems).


Off-beat scientist


Davey Smith (right) with pen and a pint

Davey Smith gives the firm impression of being entertained by life. The close-cropped hair, jeans and T-shirt mark him out as a little different, not to mention the liberal use of cartoons and jokes in his presentation — and few others would include slides of their favourite take-away restaurant in a conference talk, as he did.

When Davey Smith was a medical student in Cambridge, he bunked off for a cycling holiday with his girlfriend while he was supposed to be learning about epidemiology.

He got back just in time for a tutorial on epidemiology from his mates in the pub, before excelling in the exam the next day. Perhaps that’s why he was captured by the subject, which studies the pattern of disease: who gets ill, when, where, and most importantly, why.

(More on Davey Smith).

Don’t be fooled by the relaxed attitude, though. Davey Smith is one of the most highly-rated life scientists in the world.



He doesn’t know the disease – but he does know good research

He began his CMRC talk by showing a photo of a boy in a dunce’s hat and declaring, ‘I know nothing about this illness’. But he does know about genomics research, and on that basis, he started off in a typical style for him: breaking a few eggs in pursuit of the right answer.

He showed four recent ME/CFS studies identifying links between differences in some genes and the illness. Davey Smith was not impressed.

‘The statistical power [to] is literally zero’, he said. These studies were simply too small to show anything, and any apparent findings are effectively guaranteed to be false positives — that is, associations simply happening by chance.

‘ME/CFS gene studies today’, said Davey Smith, ‘appear much like other gene association studies a decade ago — hopelessly unreliable.’

Davey Smith and his colleagues helped to dramatically reduce such problems more than a decade ago. They wrote a paper for The Lancet ‘that didn't make us enormously popular’, pointing out that almost all published association studies up until 2002, including those published in The Lancet, were proving unreliable.

They argued that the false associations were showing up primarily because of publication bias (only studies that found an association got published, while those that did not were ignored), way-too-small sample sizes, and poor statistical techniques.
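
To make the point about small samples and publication bias concrete, here is a minimal Python sketch (it is not from Davey Smith's talk). It simulates many small case-control studies of a gene variant that has no real effect at all, then 'publishes' only the studies that happen to reach p < 0.05. The study size, the carrier frequency, the function name and the simple two-proportion z-test are all illustrative assumptions.

```python
import random
from statistics import NormalDist


def simulate_null_studies(n_studies=2000, n_per_group=50, carrier_freq=0.3,
                          alpha=0.05, seed=1):
    """Run many small case-control studies of a variant with NO true effect.

    Returns the absolute difference in carrier frequency (cases vs controls)
    for every study that reaches p < alpha, i.e. the studies that would be
    'published' under strict publication bias.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    published = []
    for _ in range(n_studies):
        # Cases and controls are drawn from the SAME carrier frequency,
        # so any difference between them is pure chance.
        cases = sum(rng.random() < carrier_freq for _ in range(n_per_group))
        controls = sum(rng.random() < carrier_freq for _ in range(n_per_group))
        p_cases = cases / n_per_group
        p_controls = controls / n_per_group
        pooled = (cases + controls) / (2 * n_per_group)
        se = (pooled * (1 - pooled) * 2 / n_per_group) ** 0.5
        if se == 0:
            continue  # no variation at all, so no test is possible
        z = (p_cases - p_controls) / se
        if abs(z) > z_crit:
            published.append(abs(p_cases - p_controls))
    return published


if __name__ == "__main__":
    published = simulate_null_studies()
    print(f"'Published' false positives: {len(published)} of 2000 studies")
    print(f"Smallest 'published' difference in carrier frequency: {min(published):.2f}")
```

Around one study in twenty should clear the significance bar by chance alone, and in a study this small only a large chance difference can do so. That is why a literature built from small, selectively published studies can look full of strong effects that later vanish.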

As a result of that 2003 paper, things changed. Funders such as the Wellcome Trust took the lead and refused to finance further studies unless researchers collaborated to create studies that were big enough to give reliable results.

The upshot was that huge numbers of genetic variants discovered since 2005 have stood the test of time, while almost all of the associations people found before ‘have now just gone’.

Researchers can now search through more than 10 thousand robustly established genetic associations with disease, including obesity, diabetes and heart disease. But currently there are none for ME/CFS.

The Grand Challenge is poised to change that.

Davey Smith then revealed how smart genomics-based approaches can even identify how non-genetic factors, such as diet, do (or don't) cause disease — overcoming problems faced by more traditional techniques.

The biggest problem: Correlation is not causation!


If you judge by the nutritional advice you read in the newspapers, you might conclude that scientists don’t know much. For example, Davey Smith pointed to these two headlines about the impact of eggs on diabetes:

Eating eggs raises the risk of diabetes — Daily Mail

Eat eggs to slash risk of diabetes — Daily Express

The problem here isn’t just confused journalists — it’s confused research. And Davey Smith explained that often it’s because scientists have forgotten one of the great mantras of science: Correlation is Not Causation.


Cartoon via ILLuMiNuTTI.com

Correlation simply means that when A happens, B tends to happen as well. That’s a first step to show that A causes B — but sometimes, it’s a step on the path to a dead end.

Challenge 1: Getting Confused by Confounding

Davey Smith showed one way that scientists can mistake correlation for causation, by focusing on the example of the supposed link between vitamin E and coronary heart disease.

Numerous studies had shown that people who took more vitamin E supplements, as well as those who actually had more vitamin E in their blood, were less likely to develop coronary heart disease. And the effect was potentially enormous — 40% less in one study! Vitamin E appeared to be a fantastic, inexpensive way to help control humanity’s biggest killer.

Researchers took one additional step to confirm their finding: they ran science’s ‘gold standard’ of randomised controlled trials. Previous studies had been ‘observational’: researchers had simply observed that people who chose to take vitamin E supplements were less likely to develop coronary heart disease.

In randomized controlled trials, on the other hand, researchers randomly put people in two groups, and then they gave one group a vitamin E supplement and the other a placebo. But they found that the two groups fared just the same. Vitamin E had no effect on the rate of coronary heart disease.

How can that be?

Simple confounding example


Coffee drinkers tend to have higher rates of pancreatic cancer than non-coffee drinkers, but the coffee itself has nothing to do with it.

It turns out that coffee drinkers are more likely to smoke — and smoking indeed causes cancer.

So smoking is called a 'confounder' for the relationship between coffee and pancreatic cancer.

(From Mann & Wood: confounding in observational studies)



Davey Smith and others argued that the problem was that people who chose to take vitamin E supplements were also more likely to take exercise, smoke less and have a low-fat diet, for instance. And while vitamin E itself has no impact on heart disease — exercise, smoking and diet do. Vitamin E was just a passenger along for the ride, a buddy of the real drivers of health changes.

So Vitamin E is correlated with increased exercise, decreased smoking, and a better diet — and it’s also correlated with reduced heart disease. But vitamin E doesn’t cause reduced heart disease. The hidden, true causes that confuse matters — in this case, exercise, smoking and diet (amongst others) — are called ‘confounding factors’.
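
The confounding story can be sketched as a small simulation. This is only an illustration with made-up numbers: a single 'healthy lifestyle' factor stands in for exercise, smoking and diet, the supplement is given no effect whatsoever on disease, and the function names are hypothetical.

```python
import random


def simulate_population(n=200_000, seed=0):
    """Each person is (healthy_lifestyle, takes_vitamin_e, gets_heart_disease).

    Lifestyle drives BOTH supplement use and disease risk.
    The supplement itself has no effect on disease (by construction).
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        healthy = rng.random() < 0.5
        takes_vitamin = rng.random() < (0.6 if healthy else 0.2)
        disease = rng.random() < (0.05 if healthy else 0.15)
        people.append((healthy, takes_vitamin, disease))
    return people


def disease_rate(people, takes_vitamin, healthy=None):
    """Disease rate among (non-)takers, optionally within one lifestyle group."""
    outcomes = [d for h, v, d in people
                if v == takes_vitamin and (healthy is None or h == healthy)]
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    people = simulate_population()
    print("Everyone:   takers %.3f vs non-takers %.3f" % (
        disease_rate(people, True), disease_rate(people, False)))
    for lifestyle in (True, False):
        print("Healthy=%s: takers %.3f vs non-takers %.3f" % (
            lifestyle,
            disease_rate(people, True, lifestyle),
            disease_rate(people, False, lifestyle)))
```

Looking at everyone together, supplement takers appear to have markedly less heart disease. Compare like with like (healthy with healthy, unhealthy with unhealthy) and the apparent benefit should disappear, because it was the lifestyle doing the work all along. Real studies rarely have the confounder measured this neatly, which is the whole problem.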

Challenge 2: The Curious Problem of ‘Reverse Causation’

A second problem in observational studies is ‘reverse causation’. Consider the bizarre finding that ex-smokers are more likely to die of the lung disease emphysema than smokers. Could it really be that quitting smoking increases your risk of emphysema?

No. This finding only shows a correlation between quitting smoking and emphysema, not causation. What’s going on instead is that smokers become ex-smokers when they’ve been diagnosed with emphysema. Thus the causation is running in the opposite direction. The emphysema is causing the quitting, rather than the quitting causing the emphysema.

In this case, the problem isn’t so hard to spot. But it has led to lots of false findings.

Solution: Nature’s own randomized trials

Randomized controlled trials are the very best way to solve these problems, but they come with their own difficulties. They are terribly slow and expensive.

Suppose, for example, you want to test the strong observational finding that vitamin C lowers the risk of heart disease. To do so with a randomized controlled trial, you’d need to recruit many thousands of people, give half of them vitamin C and half of them a placebo, and then track them over many years to see how many in each group developed heart disease.

A clever new method called ‘Mendelian randomization’ offers many of the benefits of a randomized controlled trial, and it generally does so far faster and more cheaply. The key is that it lets nature do the ‘randomization.’

Davey Smith, who has championed, developed and used Mendelian randomization to great effect, explained how it works using vitamin C, which, like vitamin E, seemed to reduce the risk of coronary heart disease.

Mendelian randomization is possible because gene differences help predict vitamin C levels. Even if two people consume the same amount of the vitamin, their blood levels of the vitamin may be significantly different — because differences in their genes may make them more or less effective at absorbing vitamin C from their gut.

Crucially, which genes you are born with are, of course, unrelated to potential confounding factors such as exercise, smoking, diet, and income level later in life.

Just as important, if you sort people by the genes for vitamin C absorption, a whole bunch of other genes won’t tag along for the ride (which would introduce new confounding problems). That is how the method gets its odd name — from Gregor Mendel, the father of genetics, who showed that, in almost all situations, genes are inherited independently.

This leads to a new way to study the impact of vitamin C on heart disease.

Test the genes of thousands of people and use gene differences to sort them into lower vitamin C and higher vitamin C groups. (Testing shows that groups do indeed, on average, have higher and lower levels of vitamin C, as predicted). Then you can simply check to see how many people in each group have developed heart disease.

It’s a dream setup, a better way for testing the effect of any factor where there are natural variations due to the genes we inherit, and one that avoids the pitfalls of reverse causation and confounding.
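
The logic of sorting people by genes rather than by measured vitamin levels can be sketched in a few lines of Python. Everything here is an assumption made for illustration: a single 'high absorber' variant, a single lifestyle confounder, invented effect sizes and hypothetical function names. Vitamin C is given no causal effect on disease, so a sound method should find none.

```python
import random
from statistics import median


def simulate_cohort(n=200_000, seed=42):
    """Each person is (high_absorber_gene, vitamin_c_level, gets_heart_disease)."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        high_absorber = rng.random() < 0.5  # fixed at conception
        healthy = rng.random() < 0.5        # lifestyle, independent of the gene
        # Both the gene and lifestyle raise blood vitamin C ...
        vitamin_c = 50 + 10 * high_absorber + 15 * healthy + rng.gauss(0, 10)
        # ... but only lifestyle changes disease risk.
        disease = rng.random() < (0.07 if healthy else 0.15)
        cohort.append((high_absorber, vitamin_c, disease))
    return cohort


def rate(people):
    return sum(d for _, _, d in people) / len(people)


def observational_gap(cohort):
    """Disease rate in the low vitamin C half minus the high vitamin C half."""
    cutoff = median(v for _, v, _ in cohort)
    high = [p for p in cohort if p[1] > cutoff]
    low = [p for p in cohort if p[1] <= cutoff]
    return rate(low) - rate(high)


def genetic_gap(cohort):
    """Compare gene groups: (vitamin C difference, disease rate difference)."""
    carriers = [p for p in cohort if p[0]]
    others = [p for p in cohort if not p[0]]
    vit_diff = (sum(v for _, v, _ in carriers) / len(carriers)
                - sum(v for _, v, _ in others) / len(others))
    return vit_diff, rate(others) - rate(carriers)


if __name__ == "__main__":
    cohort = simulate_cohort()
    print("Observational: high vitamin C looks protective by %.3f" % observational_gap(cohort))
    vit_diff, disease_diff = genetic_gap(cohort)
    print("By gene: vitamin C differs by %.1f, disease rate differs by %.3f" % (vit_diff, disease_diff))
```

Splitting by measured vitamin C picks up the lifestyle effect and makes the vitamin look protective. Splitting by gene gives two groups that really do differ in vitamin C but not in lifestyle, and their disease rates should come out essentially the same, which is the correct answer in this made-up world.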

Such genetic testing has become remarkably inexpensive. And even better, large databases of such genetic information already exist, ready for scientists to analyse with no additional cost to collect the data.

Researchers used a Mendelian randomization with 100,000 people — and found that vitamin C had no effect on heart disease. Heart disease is such a big killer (and vitamin C such a cheap and promising treatment) that expensive randomized controlled trials were done as well, and they came to the same conclusion.

So, as for vitamin E — another antioxidant — good evidence suggests that vitamin C has fewer health benefits than previously thought:


Cartoon by Bruce Eric Kaplan, available from AllPosters.co.uk

Mendelian randomization champions vitamin D for multiple sclerosis

For a long time researchers suspected that vitamin D played a role in multiple sclerosis, not least because of the ‘sunlight effect.’ We need sunlight on our skin to make vitamin D, so people who live closer to the equator, and hence get more sun, tend to have higher vitamin D levels.

Studies had also shown that the closer people lived to the equator as children, the lower their risk of multiple sclerosis.

But such studies only show correlation, not causation. And for Davey Smith, alarm bells went off: ‘I was very sceptical about a causal association’, he said. His guess was that the finding was confounded by many other differences between those who grow up near the equator and those who live nearer the poles.

But Davey Smith was part of a group who ran a large Mendelian randomization for vitamin D in MS. They used genetic variations linked to the production and degradation of vitamin D to sort patients into those with higher and lower levels of the vitamin. And, in fact, the group found that higher vitamin D levels do significantly and substantially reduce the risk of multiple sclerosis.

The multiple sclerosis study highlights another big benefit of Mendelian randomization: researchers can often use existing data from previous genomic studies, and as in this case, don’t need to recruit new patients.

The market votes for Mendelian randomization

A few weeks before Davey Smith gave his talk, shares of the pharmaceutical giant Eli Lilly plunged 8 percent as it pulled the plug on a drug that aimed to reduce heart attacks by boosting ‘good cholesterol’ (high density lipoprotein).

The failure is expected to cost Eli Lilly $90 million. Two other pharmaceutical companies have had expensive failures of similar drugs, and another paid $300 million just for the rights to another similar drug.

All these firms were pursuing bets based on observational data that more ‘good cholesterol’ was linked to fewer heart attacks. Yet Mendelian randomization studies, free of confounders and risk of reverse causation, had shown no benefit for good cholesterol.

Similar expensive bets on drugs targeting C-reactive protein failed, too. Mendelian randomization was right in this case too, while the observational findings were wrong.

Not surprisingly, Mendelian randomization is now of great interest to Big Pharma, and I suspect Davey Smith is getting a lot of requests to do consultancy in this field.

What large-scale genetic studies can do for ME/CFS

ME/CFS is unlikely to have the types of environmental causes that Mendelian randomization can easily detect. But gene association studies could identify differences in genes that increase the chance of developing ME/CFS, giving clues about what causes the illness.

In addition, gene association studies can identify genetic variations that influence getting better or worse in people who already have a disease. This calls for a second type of study. Instead of comparing healthy people to those with the disease, it would instead track a large cohort of patients and look at the genes (or other factors such as diet) associated with relapse or remission over time.


Omics is nothing without huge samples

But Davey Smith stressed that we learn ‘nothing at all if the sample size is too small’. Many thousands are needed as a minimum. The Grand Challenge, aiming to collect data from 10,000 patients, is the first time such a thing has been attempted for this disease.
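
For a rough feel of why the numbers have to be so large, here is a back-of-envelope sample size sketch. It uses the standard normal-approximation formula for comparing two proportions and the conventional genome-wide significance threshold of 5e-8; the variant frequencies and the function name are hypothetical, and real genomic power calculations are considerably more involved. It is not the Grand Challenge's own calculation.

```python
from math import ceil
from statistics import NormalDist


def patients_needed(freq_controls, freq_cases, alpha=5e-8, power=0.8):
    """Approximate number of people needed PER GROUP to detect a difference
    in how common a variant is between patients and healthy controls.

    alpha defaults to the conventional genome-wide threshold, which is strict
    because so many variants are tested at once.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance cut-off
    z_power = z.inv_cdf(power)          # desired chance of detecting a real effect
    variance = (freq_controls * (1 - freq_controls)
                + freq_cases * (1 - freq_cases))
    difference = freq_cases - freq_controls
    return ceil((z_alpha + z_power) ** 2 * variance / difference ** 2)


if __name__ == "__main__":
    for cases_freq in (0.32, 0.35, 0.40):
        n = patients_needed(0.30, cases_freq)
        print(f"Variant in 30% of controls vs {cases_freq:.0%} of patients: "
              f"about {n:,} per group")
```

Under these assumptions, a variant that is only two percentage points more common in patients needs tens of thousands of people per group, while a five-point difference needs several thousand. Subtle effects are what genetic studies of complex illnesses mostly find, hence the need for very large cohorts.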


The first big meeting to thrash out the details of how the Grand Challenge will work, in preparation for the all-important grant application, comes in April. The Wellcome Trust, which has grants tailored to this kind of project, has made it clear that they are very open to an application on ME/CFS.

With its huge cohort and rigorous approach, the Grand Challenge aims to bring in top talent from other fields to work on ME/CFS for the first time. Davey Smith signing up would be the perfect way to start.

The Grand Challenge will consider a whole range of omics techniques, including large-scale gene expression and epigenetic studies, which Davey Smith briefly covers in Part 2.

Disclaimer: This two-part blog is my take on Professor Davey Smith’s excellent talk, based on the YouTube video. The official summary of his talk, written by Emily Beardall with Action for ME, and approved by Davey Smith, is available to download here.


Thanks Simon. What do you reckon to reading this now, or waiting for part 2 and reading them together?
Part 2 is in draft; I suggest you give this a go first, as it makes sense in its own right. Hopefully.

And thanks, @Esther12, @Sasha and @Invisible Woman

Intriguing stuff Simon. I'm now keen to watch the video (consistency, hobgoblins and all that).
The sound quality is dreadful, I'm afraid, one reason this is so long after the event.
 
What a coincidence, notification of this post came through just as I was reading 'Proof Positive?' Extract re. George Davey Smith:

The one dissenting voice at the conference (held at Novartis Foundation in London on 31st October and 1st November 2002) was that of George Davey Smith, Professor of Clinical Epidemiology, Department of Social Medicine, University of Bristol, who in a presentation called "The biopsychosocial approach: a note of caution" carried the torch for intellectual integrity. His contribution showed that bias can generate spurious findings and that when interventional studies to examine the efficacy of a psychosocial approach have been used, the results have been disappointing.

To quote from Davey Smith's contribution: "Over the past 50 years many psychosocial factors have been proposed and accepted as important aetiological agents for particular diseases and then they have quietly been dropped from consideration and discussion". The illustrations he cited included cholera, pellagra, asthma and peptic ulcer. He went on to quote Susan Sontag's well-known dictum: "Theories that diseases are caused by mental state and can be cured by willpower are always an index of how much is not understood about the physical basis of the disease" (Illness as a metaphor. New York: Random House; 1978).

(End extract) - http://www.meactionuk.org.uk/PROOF_POSITIVE.htm

Looking forward to Part Two Simon, thank you.
 
I'm now keen to watch the video (consistency, hobgoblins and all that)....Yep - I skipped to the AfME written summary which was a little sparse.
Find any hobgoblins?
I was slated to do the official write up but was too ill, and Emily Beardall stepped in: she wrote a good, accurate summary but it's impossible to do justice to so much heavy-duty science within the word limit available.

The one dissenting voice at the conference (held at Novartis Foundation in London on 31st October and 1st November 2002) was that of George Davey Smith
You might be interested in this. Davey Smith is not afraid to say when he thinks researchers have got it wrong.

how does one determine how big a data set needs to be in order for a study to be statistically valid?
That will be one of the major questions for the Grand Challenge researchers. The problem with genetic studies generally is that you are looking for lots of small, subtle effects, and you need very big samples to detect such small effects against the background noise of random genetic differences. But they seem to be working on the basis of needing 10,000 patients.

@Simon, wonderful piece of writing. clear and understandable.. thank you.
And thank you!
 
A well-written article, as usual, Simon. This is the first I've heard of this guy. He seems like a good guy to have involved with genetic studies of ME, but it is hard to tell from this summary of his talk. It reminds me a bit of the talks that Lipkin gives, in that it seems to be for the layman. Having said that, I am a layman, and I already understood the fallacies in 'challenge 1' and 'challenge 2.'

Re: Mendelian randomization, how do they control for the fact that the alleles for high vitamin C absorption (for example) may associate with the 'good' alleles for some other gene that reduces the risk of heart disease?
 
A well-written article, as usual, Simon
Thanks

It reminds me a bit of the talks that Lipkin gives, in that it seems to be for the layman
Um, took a lot of graft to give that impression (likewise my write-up of Ian Lipkin's talk)! It most certainly wasn't aimed at a lay audience, and wasn't presented in such simplistic terms at all, though I believe the blog is true to the essence of what Davey Smith said. I've tried to make it accessible to as wide an audience as possible, without dumbing down the content (though I have left out the more complex examples, e.g. I use a simple example of coffee and cancer confounding; Davey Smith used a more complex one with C-reactive protein and involving reverse causation too).

I'd recommend this if you want something closer to what he actually said:
The official summary of his talk, written by Emily Beardall with Action for ME, and approved by Davey Smith, is available to download here.

Re: Mendelian randomization, how do they control for the fact that the alleles for high vitamin C absorption (for example) may associate with the 'good' alleles for some other gene that reduces the risk of heart disease?
Good question: that's the all important randomisation bit:
Just as important, if you sort people by the genes for vitamin C absorption, a whole bunch of other genes won’t tag along for the ride (which would introduce new confounding problems). That is how the method gets its odd name — from Gregor Mendel, the father of genetics, who showed that, in almost all situations, genes are inherited independently.
Thanks to meiosis (random recombination of chromosomes to produce the haploid sperm or egg), different genes get scrambled between generations. There's a great quote about this from the statistician R A Fisher:
"Genetics is indeed in a peculiarly favoured condition in that Providence has shielded the geneticist from many of the difficulties of a reliably controlled comparison. The different genotypes possible from the same mating have been beautifully randomised by the meiotic process. A more perfect control of conditions is scarcely possible, than that of different genotypes appearing in the same litter." --R.A. Fisher
 
Re: for the layman, you obviously have a real knack for making things accessible!

Thanks for pointing out that sentence, I must have missed it. It doesn't make sense to me though - isn't it true that the alleles for blue eyes associate with those for blond hair? It seems to me that it isn't even necessary for the genes to be on the same bit of chromosome, there only has to be a correlation between the two alleles on the two different genes.

I will have a look at the full summary tomorrow, seems like interesting stuff.
 
... It doesn't make sense to me though - isn't it true that the alleles for blue eyes associate with those for blond hair? It seems to me that it isn't even necessary for the genes to be on the same bit of chromosome, there only has to be a correlation between the two alleles on the two different genes.
I believe this is what I was referring to: https://en.wikipedia.org/wiki/Linkage_disequilibrium

I notice that the KDM gene study controlled for this.
Yes, when genes are inherited together (or not completely independently) it's called linkage disequilibrium - genetics isn't strong on snappy names - and in those cases Mendelian randomization won't work. But in most cases genes aren't in linkage disequilibrium.

But for the kind of genomic study done by KDM, you do need to allow for linkage disequilibrium when you correct for multiple comparisons (you need to correct less where genes are in disequilibrium) so it's important to factor that in too. It's particularly an issue when a genetic association study focuses on a small number of genes, using many SNPs per gene, as most of the SNPs within a gene will be in linkage disequilibrium (not all mecfs gene studies do this by any means).

[As you can see, writing opaquely comes far more naturally to me...]
 
Linkage disequilibrium also seems to be pretty limited. It'll happen a lot on the same gene, and even among multiple genes which are very close together on the same chromosome. But I don't think SNPs on different chromosomes ever behave that way, nor SNPs which are far apart on the same chromosome.

Regarding eye and hair color, both are affected by the amount of pigment (melanin). And there are genes which can contribute more or less pigment to both hair and eyes, such as OCA2.
 
Yes, when genes are inherited together (or not completely independently) it's called linkage disequilibrium - genetics isn't strong on snappy names - and in those cases Mendelian randomization won't work. But in most cases genes aren't in linkage disequilibrium.

But for the kind of genomic study done by KDM, you do need to allow for linkage disequilibrium when you correct for multiple comparisons (you need to correct less where genes are in disequilibrium) so it's important to factor that in too. It's particularly an issue when a genetic association study focuses on a small number of genes, using many SNPs per gene, as most of the SNPs within a gene will be in linkage disequilibrium (not all mecfs gene studies do this by any means).

[As you can see, writing opaquely comes far more naturally to me...]

The opaque stuff actually made more sense to me!