• Welcome to Phoenix Rising!

    Created in 2008, Phoenix Rising is the largest and oldest forum dedicated to furthering the understanding of and finding treatments for complex chronic illnesses such as chronic fatigue syndrome (ME/CFS), fibromyalgia (FM), long COVID, postural orthostatic tachycardia syndrome (POTS), mast cell activation syndrome (MCAS), and allied diseases.

Importance of ethnicity and allele frequency in determining disease-causing SNPs

nandixon

Senior Member
Messages
1,092
Although the allelic frequencies of most SNPs are similar across ethnic groups, there are many SNPs for which the allelic frequency differs significantly between ethnic groups.

An SNP that is essentially benign in one ethnic group can be very problematic in another. This can be due to differences in the frequencies of other SNPs, but also to epigenetic differences, such as differences in methylation due to diet, among numerous other epigenetic possibilities.

Those epigenetic differences are critical and can easily be as important as the SNPs themselves, or more so. They can, for example, override an SNP that causes an undesirable loss or gain of enzymatic activity by increasing or decreasing the expression of the gene in question, thus negating the deleterious effect of the SNP and making an SNP that causes problems in one ethnic group less significant in another.

This means that when trying to discover SNPs that may cause or contribute to an illness - especially one like ME/CFS - the appropriate ethnic database of allelic frequencies should be used for comparison with the patient, i.e., a European database for Europeans, an Asian database for Asians, etc.

If this isn't done, there is a significant risk of false negative results and futile searching, because we simply do not know whether the SNPs (if they exist) that may contribute to ME/CFS fall into the category of SNPs with a similar allelic frequency across ethnic groups or into the category of SNPs with a different allelic frequency across ethnic groups. (Obviously, more false positives will be generated using the appropriate ethnic database, and this needs to be considered.)

Importantly, the largest differences in allelic frequencies seem to occur between European and Asian ethnicities.

Thus, when attempting to look for disease-causing SNPs in the largely European or European-descended members of this forum, using the (full) data from the 1000 Genomes Project, for example, is especially problematic - i.e., prone to false negative results - because that database contains a large share of Asian DNA samples and relatively few European ones. (The percentage of European DNA in that database is only about 20%, the last time I looked.)

If one or more SNPs have the potential to cause or contribute to ME/CFS in Europeans, for example, they may be missed when using 1000 Genomes data if they are benign in Asians (for example) yet present at a higher frequency in the Asian population - since they will appear to be more common than they actually are in Europeans. And vice versa, of course.
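To make that dilution effect concrete, here is a minimal sketch in Python. The group sizes and allele frequencies are made-up numbers for illustration only (they are not taken from 1000 Genomes or any other database), and `pooled_allele_frequency` is just a hypothetical helper name.

```python
def pooled_allele_frequency(groups):
    """Sample-size-weighted mean allele frequency across groups.

    groups: list of (n_samples, allele_frequency) tuples.
    Assumes every sample contributes the same number of alleles.
    """
    total = sum(n for n, _ in groups)
    return sum(n * f for n, f in groups) / total


# Hypothetical numbers, for illustration only.
european = (200, 0.05)  # allele is rare in Europeans
asian = (500, 0.40)     # allele is common (and assumed benign) in Asians
other = (300, 0.10)

pooled = pooled_allele_frequency([european, asian, other])
print(f"European frequency: {european[1]:.3f}")
print(f"Pooled frequency:   {pooled:.3f}")
```

With these invented inputs the pooled frequency comes out at 0.24, nearly five times the European value of 0.05, so an allele that is rare in Europeans would look unremarkable when judged against the pooled figure.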

In summary, the scientific literature is clear that ethnicity can be a critical factor when attempting to narrow down what potential SNP or SNPs might cause or contribute to an illness, and that an ethnically appropriate genetic database should be used to avoid false negative results for SNPs that may be disease-causing/contributing and of lower frequency in one ethnic group but benign and of higher frequency in other ethnic groups.

Dozens of published research articles emphasise this. Below are just a handful:
 

nandixon

Senior Member
Messages
1,092
1. Distinctive Frequencies of +874T/A IFN-γ Gene Polymorphism in a Healthy Serbian Population.
"... [T]he susceptibility to and outcome of certain diseases can vary in different ethnic populations partially due to the notable differences in frequencies of genotypes and alleles between them... Analysis of genotype and allele frequencies for IFN-γ +874T/A SNP in healthy subjects revealed, for the first time, the genetic profile for this polymorphism in a Serbian population resembling most European populations, but differing from some Asian and African ethnic groups."
http://www.ncbi.nlm.nih.gov/pubmed/23253667/

2. Comprehensive variant screening of the UGT gene family.
"For personalized drug treatment, it is important to study differences in the frequency of core markers across various ethnic groups... We observed that the frequencies of UGT1A1... were different between Asian and other ethnic groups."
http://www.ncbi.nlm.nih.gov/pubmed/24339312/

3. Differences in asthma genetics between Chinese and other populations.
"Most asthma genes are not replicable across populations, which is possibly because of differences in the epidemiology of these genes. Our case-control association and next-generation sequencing studies revealed substantial discrepancies in the frequencies of single nucleotide polymorphisms (SNPs) and haplotype blocks for asthma genes between Chinese and other populations. The minor allele frequencies for nearly half of our studied SNPs differed by 0.2 or greater between southern Chinese subjects in Hong Kong and European white populations, African populations, or both... [D]ifferences in allele frequencies of asthma genes and haplotype structures of asthma loci are found between Chinese subjects and other ethnic groups. These sequence variations must be considered during the selection of tagging SNPs for replicating genetic associations between populations."
http://www.ncbi.nlm.nih.gov/pubmed/24188974/

4. Screening of Genetic Polymorphisms of CYP3A4 and CYP3A5 Genes.
"The results showed that minor allele frequencies of Korean population were similar with those of the Japanese and Han Chinese populations, whereas there were distinct differences from European-Americans or African-Americans..."
http://www.ncbi.nlm.nih.gov/pubmed/24381495/

5. A common variant in SLC8A1 is associated with the duration of the electrocardiographic QT interval.
"The minor allele frequency (MAF) of rs13017846 varies widely between ethnicities - 0.053 in Europeans... versus 0.080 in Africans... whereas a MAF of 0.500 has been reported in Asians... This might explain why this locus has not been identified in Europeans in previous studies."
http://www.ncbi.nlm.nih.gov/pubmed/22726844/
 

Critterina

Senior Member
Messages
1,238
Location
Arizona, USA
...and then you throw in someone like me, whose maternal relatives came from east Asia to the Americas (north, meso, and south) in pre-Columbian times and then ended up somehow in Europe, and whose paternal relatives apparently danced all the way around the north pole, but whose Y-chromosome apparently came to Norway from the Ukraine...and then there's some pygmy and Persian thrown in for good measure...

For someone who just thought she was "northern European", then finds a rare homozygous SNP that's only identified with a population called "Tamil" or something equally unlikely...it's not just hard to figure out why I'm sick...it's really opened up the question of who I am!

Thanks, @nandixon ! This is a really good thread...but I'm not sure what to do with it. :)
 

nandixon

Senior Member
Messages
1,092
@Critterina
I'm curious what your results are using 23andMe's ancestry analyzer here: https://www.23andme.com/you/ancestry/composition/

For example, mine shows 99.9% European (including 57.4% British & Irish, 12.6% French & German, and 4% Scandinavian).

I'm not sure how accurate that analyzer is, but if it shows you have 50% or greater European ethnicity, then I think I would use a European-derived genetic data set (see below) for comparing SNP minor allele frequencies (MAFs).

On the other hand, if it shows you have a truly mixed ethnic background, with Asian, Americas and European ancestry - and especially if the Asian ethnicity predominates - then I think I'd use the 1000 Genomes data set since that data set includes all those ancestries and is heavily Asian-derived.

Like I mentioned in the original post, for people with more clearly defined ancestries, the specific ethnic-appropriate data set should be used.

So, for example, people with European ethnicity should use, e.g., HapMap-CEU data or OpenSNP (which seems to be mostly European-type ethnicity data based on how closely its MAF results match those of HapMap-CEU). One can also consult ethnic-appropriate study results for the particular SNP in question.

The 1000 Genomes data contains too high a percentage of Asian ethnicity to be reliable for European-descended individuals: for many SNPs the MAF difference between 1000 Genomes and HapMap-CEU (or OpenSNP) is too great and can obscure potentially problematic SNPs, i.e., cause false negative results.
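One way to see which SNPs are affected is to put the two MAF tables side by side and flag the large gaps. This is a rough sketch only: the SNP IDs and MAF values below are placeholders rather than real data, `flag_divergent_snps` is a hypothetical helper, and the 0.2 cutoff is an assumption borrowed from the asthma paper quoted earlier in the thread.

```python
def flag_divergent_snps(pooled_maf, european_maf, threshold=0.2):
    """Return SNP IDs whose MAF differs by at least `threshold` between sets."""
    flagged = []
    for snp_id, eur in european_maf.items():
        pooled = pooled_maf.get(snp_id)
        if pooled is None:
            continue  # SNP missing from the pooled set, so nothing to compare
        if abs(pooled - eur) >= threshold:
            flagged.append(snp_id)
    return sorted(flagged)


# Placeholder IDs and frequencies, not real database values.
pooled_maf = {"snp_a": 0.30, "snp_b": 0.12, "snp_c": 0.45}
european_maf = {"snp_a": 0.06, "snp_b": 0.10, "snp_c": 0.41, "snp_d": 0.02}

flagged = flag_divergent_snps(pooled_maf, european_maf)
print(flagged)
```

For any flagged SNP, the European-derived figure would be the one to compare a European-descended person against.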

Like I mentioned previously, though, epigenetics, both inherited and environmental, plays a hugely important role, and any supposedly "bad" SNP may or may not be of consequence depending on the individual.
 

Critterina

Senior Member
Messages
1,238
Location
Arizona, USA
@nandixon

We consider ourselves Scandinavian, and 4th generation Californian. It wasn't until my mom showed me a clipping from a 19th century San Francisco newspaper that I knew her grandmother's mother came from Switzerland. There was also an Italian somewhere back there; when we asked how much, we were told it was half the nail on our little finger. Although our mothers called their daughters some variation on "My little (adjective) Indian" (e.g. "wild" or "blackfoot"), I never suspected it meant anything more than that we needed to calm down or wash our dirty bare feet. Perhaps there is a secret hiding in those words.

My maternal haplogroup is reported by 23andMe as D4e1. I was sure that was a mistake until my mom's data matched mine. It made me related to people in east Asia, Siberia, Alaska, and the west coast of the Americas. When I came across the "Tamil" for an SNP, I was using the HapMap. That's where I did not match the CEU people at all. I had several experiences like that - that whatever SNP I was looking at didn't show up in the CEU people with any frequency at all. And the groups where the SNP appeared - some I'd never heard of. The 23andMe genetic analyzer showed me as mostly CEU, but there were definite unexpected components, too, like Native American. I knew all my ancestors, personally, whose parents came from Europe, and all since the 1830s. (We are a late-bearing and long-lived bunch.) There's not much doubt about coming FROM Europe to North America: who, when, and where from/where to.

Later, I used GEDmatch to examine the ethnicities in my genome. It showed that I have D4e1 plus an extra gene that would suggest D2, but there was also a disconnect there, so I'm not entirely one or the other. It also showed specific north, meso, and south American contributions. The gene-coloring program is addicting - showing which parts of which chromosomes come from which of about 15 ethnicities. It is mostly European, but less than I imagined. We trace my maternal upline to Switzerland, but how it got there is the mystery. My mom was a towhead blonde with pale brown eyes and light olive skin that tanned dark, and at 92 she still has a lot of color to her skin. GEDmatch seemed to be able to do a much more finely tuned job of associating various genes with ethnicities. Things like Saami, Komi, Finn - all arctic circle European (my dad's side). Maybe it's my mistake to equate precision with accuracy.

It seems my lot in life to never quite fit the profile, whether I'm buying makeup to match my skin or reporting symptoms to my doctor. Seeing the different sources of my genome lets me allow that it might be an unusual combination of genes that enables this experience.
 

nandixon

Senior Member
Messages
1,092
@Critterina

"The 23andMe genetic analyzer showed me as mostly CEU..."

If you don't mind me asking, what is the actual percentage number that the 23andMe analyzer gave for your total European ancestry?
 

Critterina

Senior Member
Messages
1,238
Location
Arizona, USA
Oh, I had to log in to look at the 23andMe results. I see it's changed since I last looked:
It used to say 99.8% "European" - not CEU, 0.1% East Asian 0.1% Native American.
I thought how interesting it was to be D4e1 and such low percentages of the sources of D4.

Now it says:

99.6% European, 0.3% East Asian/Native American on the speculative view: 51.5% Scandinavian, 34.7% Broadly Northern European, 1.9% Broadly European, 0.1% Yakut, 0.1% Native American, <0.1% Broadly East Asian. I did the math and 0.1% is only 10 generations ago. I know these are only traces, but since I know five generations in North America, the traces are really surprising. So maybe 5 generations in Europe also?

97.6% European, the rest unassigned, on the conservative view.
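The "0.1% is only 10 generations ago" arithmetic can be checked with a short Python sketch. It assumes the simple halving rule, where a single ancestor n generations back contributes an expected 1/2^n of the genome; real inheritance varies around that expectation because of recombination, so treat it as a rough guide. The function names are my own.

```python
import math


def expected_fraction(generations_back):
    """Expected share of the genome from one ancestor n generations back."""
    return 0.5 ** generations_back


def generations_for_fraction(fraction):
    """Generations back at which one ancestor's expected share equals `fraction`."""
    return math.log2(1.0 / fraction)


for n in (5, 10):
    print(n, f"{expected_fraction(n):.4%}")

print(round(generations_for_fraction(0.001), 2))
```

Ten generations back gives 1/1024, or just under 0.1%, which agrees with the estimate above.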

What do you know about GEDmatch? It pegs certain genes to certain ethnic groups and then looks for a pattern...that's about the sum total of what I know.