Doing a Whole Genome Sequencing (WGS)

jamienoble · Dec 2, 2015

Valentijn said:
Yes, the program should be able to handle that.

Thanks, I will let you know when my data is ready. I will do WES (Gene By Gene).

Valentijn · Dec 2, 2015

jamienoble said:
Thanks, I will let you know when my data is ready. I will do WES (Gene By Gene).

It would also help a lot if you could send me or post a small section of it, so we know exactly how the formatting looks. Or Gene By Gene might have a sample available. Basically we need to know the formatting so that the software knows how to read it.

jamienoble · Dec 3, 2015

Valentijn said:
It would also help a lot if you could send me or post a small section of it, so we know exactly how the formatting looks. Or Gene By Gene might have a sample available. Basically we need to know the formatting so that the software knows how to read it.

Sure, I will ask them if they can provide a sample when the data is ready, and also whether they can describe what the formatting looks like. I will update you on how it goes.

SickOfSickness · Dec 6, 2015

jamienoble said:
Thanks, I tried to sign up as well, but I think you have to live in America to do this. :(

It seems to be US, Canada, and UK so far. They are also doing preregistration for Austria now. (Not Australia.)

Waverunner said:
I would go for Gene by Gene but I also applied for two other personal genome projects. I think the chances to be accepted are low for the latter projects, however.

Can you tell me which others you signed up for?

I might sign up for PGP to try to get a Veritas sequencing sooner, but they probably have so many PGP participants. I wish I had signed up a while ago.

jamienoble · Jan 4, 2016

Valentijn said:
It would also help a lot if you could send me or post a small section of it, so we know exactly how the formatting looks. Or Gene By Gene might have a sample available. Basically we need to know the formatting so that the software knows how to read it.

Hello again, just got an answer back from GBG and this is what they wrote:

For alignment, we will provide a standard .bam and .bai (index) file.
The variant calling file (VCF) will be in standard VCF v4.1 (see links below)
http://samtools.github.io/hts-specs/VCFv4.1.pdf
http://www.1000genomes.org/wiki/analysis/variant call format/vcf-variant-call-format-version-41

Let me know if you need more.
Does this information answer the questions you had? I will order the test as soon as you feel like you can help me with this.
Thanks!

Valentijn · Jan 5, 2016

jamienoble said:
Does this information answer the questions you had? I will order the test as soon as you feel like you can help me with this.

Yup, it's a good example of the formatting in the first link.
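For anyone who hasn't seen a VCF file before, here is a minimal Python sketch of how one data line in the VCF v4.1 layout splits into its eight fixed, tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The example record and the function name `parse_vcf_line` are invented for illustration; this is not Gene By Gene's actual output and not how AnalyzeMyGenes reads files.

```python
# Minimal sketch: split one VCF v4.1 data line into its fixed columns.
# The example record at the bottom is invented for illustration only.

VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Return a dict for one tab-separated VCF data line, or None for header lines."""
    if line.startswith("#"):
        return None  # meta-information (##...) and column header (#CHROM...) lines
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])        # positions are 1-based integers
    record["ALT"] = record["ALT"].split(",")  # several alternate alleles are allowed
    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags
    info = {}
    for item in record["INFO"].split(";"):
        if item == ".":
            continue  # "." means no INFO data
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info
    # Anything after the 8 fixed columns is FORMAT plus one column per sample
    record["SAMPLES"] = fields[8:]
    return record


example = "1\t12345\trs0000001\tA\tG\t50\tPASS\tDP=30;AF=0.5\tGT\t0/1"
rec = parse_vcf_line(example)
print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"], rec["INFO"], rec["SAMPLES"])
```

A real file also has a block of `##` header lines at the top describing the INFO and FORMAT keys, which this sketch simply skips.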

Regarding the rest, I think we just need to generate the databases. We're currently working on getting the 23andMe version of AnalyzeMyGenes expanded, and have got a pretty good rewrite in the process, to increase efficiency.

After that is done (probably within a week, maybe sooner), we can use the same methods to generate a set of full genome databases which can also run with the AnalyzeMyGenes program. I think the process of extracting that information is pretty automated now, so the only real concern will be the size of the resulting databases.

Basically we'll be trying to get 13GB of compressed data pared down to something much, much smaller so that 1) people can download it and 2) it doesn't take hours or days to run. But we haven't done anything of this magnitude before, so we might run into some complications along the way.

jamienoble · Jan 7, 2016

Sounds good, for me it doesn't matter if your software can read the results now or in 3 months as long as it works.

I don't know if you are aware of my history, but basically I will again take the "medicine" that cured me 10 years ago (Accutane). Before I start, I will do a bunch of tests, because I obviously can't search for my disease through tests during the treatment (which will take 2 years). Will it be the same with WES, or will my gene information not be affected when I take this?
If it won't be affected, can I do WES at any time during these 2 years and get the same results anyway?
I'm also wondering if it's possible to look at specific genes with your software?

Valentijn · Jan 8, 2016

jamienoble said:
Will it be the same with WES, or will my gene information not be affected when I take this?

It doesn't sound like any drugs are known to interfere with genetic testing. Accutane is suspected of altering telomerase function, but that shouldn't make a difference, since genetic testing involves looking at known DNA sequences in various locations to make sure it's reading everything at the right place.

jamienoble said:
I'm also wondering if it's possible to look at specific genes with your software?

Yes, the version which is coming out very shortly will have several extra databases in the Downloads section. One of those will be for gene names. If it's saved as a .csv or .txt file, it can then be opened in Excel and easily sorted and/or searched by gene name.

jamienoble · Jan 10, 2016

Thanks for the help, I really appreciate it. The only reason I would want to wait is that I'm curious what http://www.helix.com/ will offer.

I have another question, not related to genetics but maybe you can help anyway.
Isotretinoin affects cytokines; for example, it regulates interleukin 17 (IL-17) and interleukin 4 (IL-4).
I finally found a lab in Belgium where I can do a cytokine test, but I don't know which one I should choose.

http://media.wix.com/ugd/047c1b_beda2b47dde24242ab2a6c4c9954fd99.pdf

Proinflammatory cytokines or Cytokine RNA, inflammatory? I will do Th1/Th2 cytokines, MMP-9 and IL-17 as well, but for the other 2 I don't know which one I should choose, or whether I should do both because they are very different?

This is what the lab wrote: one test (CYTS) measures cytokine expression levels at the protein level, while the other one (CYTI) evaluates expression at the mRNA level.

Valentijn · Jan 11, 2016

jamienoble said:
I will do Th1/Th2 cytokines, MMP-9 and IL-17 as well, but for the other 2 I don't know which one I should choose, or whether I should do both because they are very different?

No idea, sorry.

SickOfSickness · Jan 11, 2016

jamienoble said:
On Monday I will order mine from here: https://www.scienceexchange.com/labs/clinical-microarray-core-ucla
It will cost $3500 with data analysis

I want to get WGS or WES this year. $3500 for WGS seems better to me than $1500 for WES, because after the WES is bought, you still want WGS later. But WES might be $500 soon.

Also if the WGS analysis only covers single-gene mutations, then I want to wait for their analysis to improve. Is the analysis really that basic? I will have to search. It can't be too bad.

I found this on Wikipedia about WES: "There are many factors that make exome sequencing superior to single gene analysis including the ability to identify mutations in genes that were not tested due to an atypical clinical presentation[14] or the ability to identify clinical cases where mutations from different genes contribute to the different phenotypes in the same patient.[2]"

Also, it seems like it would take time and energy to find a clinical researcher to order WGS.

Valentijn · Jan 11, 2016

SickOfSickness said:
I want to get WGS or WES this year. $3500 for WGS seems better to me than $1500 for WES, because after the WES is bought, you still want WGS later. But WES might be $500 soon.

I'd rather have WES while WGS is still $2,000 more expensive. WES is a much smaller and thus more manageable set of data, and I just can't think of any way I could use the non-exome data on a personal level (versus in research).

SickOfSickness · Jan 11, 2016

Valentijn said:
I'd rather have WES while WGS is still $2,000 more expensive. WES is a much smaller and thus more manageable set of data, and I just can't think of any way I could use the non-exome data on a personal level (versus in research).

WES is such a small percentage of WGS. I hoped that WGS could catch some genetic problem that WES missed, but the analysis seems lacking.

There are some things I want to know soon. Waiting 1 or 2 more years is too long.

Also, I saw a lab saying you had to download your complete WGS data within 2 or 3 weeks. They can't afford to store the data. I'm not sure how much data we're talking about.

Valentijn · Jan 12, 2016

SickOfSickness said:
I'm not sure how much data we're talking about.

One source online says about 3GB. But it might vary a lot depending on how much data is included (SNPs, variants, frequencies, etc). The service you're thinking about buying it from should be able to give a very accurate estimate.
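As a rough sanity check on that ~3GB figure, here is a back-of-the-envelope calculation in Python. It assumes a haploid human genome of roughly 3.1 billion bases, which is a round-number assumption, and compares plain text at one byte per base with a packed 2-bit encoding. What a lab actually delivers (alignment files, variant files with annotations) can be far larger or smaller than either number.

```python
# Back-of-the-envelope size of one copy of the human genome sequence.
# GENOME_BASES is an approximate round figure, not an exact count.
GENOME_BASES = 3.1e9


def size_gb(bases, bits_per_base):
    """Size in gigabytes (10**9 bytes) for a sequence stored at the given bits per base."""
    return bases * bits_per_base / 8 / 1e9


plain_text_gb = size_gb(GENOME_BASES, 8)  # one ASCII character per base
packed_gb = size_gb(GENOME_BASES, 2)      # A/C/G/T packed into 2 bits each

print(f"1 byte per base: {plain_text_gb:.2f} GB")
print(f"2 bits per base: {packed_gb:.3f} GB")
```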

jamienoble · Jan 12, 2016

SickOfSickness said:
I want to get WGS or WES this year. $3500 for WGS seems better to me than $1500 for WES, because after the WES is bought, you still want WGS later. But WES might be $500 soon.

If you are interested you can e-mail the Assistant Lab Director Ling Dong: LingDong@mednet.ucla.edu
I didn't do it because, as Valentijn said, WES seems to be the better choice for now.

RYO · Jan 21, 2016

Valentijn said:
It doesn't sound like any drugs are known to interfere with genetic testing. Accutane is suspected of altering telomerase function, but that shouldn't make a difference, since genetic testing involves looking at known DNA sequences in various locations to make sure it's reading everything at the right place.

Yes, the version which is coming out very shortly will have several extra databases in the Downloads section. One of those will be for gene names. If it's saved as a .csv or .txt file, it can then be opened in Excel and easily sorted and/or searched by gene name.

I recently had WES at a local university/medical center as part of a study. They performed limited analysis (targeted informatics analysis/Focused Diagnostic Panel) in addition to high-level variant detection. I am not sure whether they would give me a raw data file, but can you tell me more about your software? Is it software that can perform diagnostic analysis?

Valentijn · Jan 22, 2016

RYO said:
I am not sure whether they would give me a raw data file, but can you tell me more about your software? Is it software that can perform diagnostic analysis?

Basically the new version can label genes, pull out and/or flag rarer SNP alleles, flag missense mutations, flag known pathogenic SNPs, and calculate BLOSUM62 scores (4 to -4, with 4 being no difference and -4 being the most drastic) to help to predict if a missense mutation is pathogenic.
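To make the BLOSUM62 part concrete, here is a small Python sketch of looking up a score for a missense change written like "N138D" (reference amino acid, position, new amino acid). It is only an illustration, not the AnalyzeMyGenes code: the function names are made up, and the table holds just a handful of entries copied from the standard BLOSUM62 matrix. In the standard matrix, substitutions between different amino acids score from -4 to 3, while an unchanged residue scores between 4 and 11 depending on the amino acid.

```python
import re

# A few entries from the standard BLOSUM62 matrix (the full matrix is 20x20).
# The matrix is symmetric, so each pair is stored once, in sorted order.
BLOSUM62_SUBSET = {
    ("D", "N"): 1,
    ("L", "V"): 1,
    ("I", "V"): 3,
    ("D", "L"): -4,
    ("G", "I"): -4,
    ("L", "L"): 4,
    ("W", "W"): 11,
}


def blosum62(ref_aa, alt_aa):
    """Look up a score; returns None if the pair is not in this small subset."""
    key = tuple(sorted((ref_aa, alt_aa)))
    return BLOSUM62_SUBSET.get(key)


def score_missense(change):
    """Score a change written as <ref><position><alt>, e.g. 'N138D'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognised change: {change!r}")
    ref_aa, _position, alt_aa = match.groups()
    return blosum62(ref_aa, alt_aa)


for change in ("N138D", "L42V"):
    print(change, score_missense(change))
```

Both example changes come out mildly positive, meaning the swapped amino acids are fairly similar; a score near -4 would mark a much more drastic swap.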

It's not a complete analysis, however. There's still a need to read the research (usually summarized on OMIM) to see if a "pathogenic" SNP is actually disease causing, versus causing red hair and pale skin, for example. And since there is no research into a lot of missense mutations thus far, it might be helpful to read about the genes on which your very rare missense mutations are located, to determine if it's something which is reasonable to look into further.

The old version only pulled out rare results, so it was pretty time-consuming to look up each SNP to see what gene it was on, if it might be relevant, if it might be pathogenic, etc. The new version does all of that automatically, which is a huge time-saver. It's also a lot more feasible now to focus on specific genes which are more likely to be responsible for certain symptoms.

And finding potential compound heterozygous missense mutations is now much much easier. Basically there is a 1% database of the rarest SNPs, but that can miss SNPs which are somewhat common and harmless when heterozygous, and pathological when homozygous. So there is also a 10% database to look for those homozygous mutations, but it's too huge and spammy to be useful in looking for heterozygous mutations, and that meant that problematic compound heterozygous mutations could get lost in the spam.
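The two-database logic can be sketched in a few lines of Python. This is just an illustration of the thresholds described above, with made-up names (`worth_a_look`, `HET_MAX_MAF`, `HOM_MAX_MAF`); it is not the program's actual code.

```python
# Sketch of the two-tier rarity filter described above (names are hypothetical).
HET_MAX_MAF = 0.01  # "1% database": rare enough to check even with a single copy
HOM_MAX_MAF = 0.10  # "10% database": only checked when both copies carry the allele


def worth_a_look(maf, copies):
    """maf: minor allele frequency as a fraction; copies: 1 = heterozygous, 2 = homozygous."""
    if copies == 2:
        return maf <= HOM_MAX_MAF
    if copies == 1:
        return maf <= HET_MAX_MAF
    return False


# A 5% allele is flagged when homozygous but skipped when heterozygous,
# which is exactly how one half of a compound heterozygous pair can be missed.
print(worth_a_look(0.05, copies=2))
print(worth_a_look(0.05, copies=1))
print(worth_a_look(0.005, copies=1))
```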

For example, from the 23andMe V3 chip (the big one which had 960,000 SNPs), I get 18,000 results where I have an allele that is present in 10% or less of the general population. It would take months or years to look through all of those by hand, but I could at least focus on the homozygous rare SNPs, of which I have 934.

Now I can also focus on the missense mutations, which is really where all the action is. By sorting in Excel, based on if there's a missense mutation, I can see that I have 322 missense mutations (both homozygous and heterozygous). And by sorting just those results based on gene name, I can see if there are multiple mutations (my compound heterozygous suspects) on the same gene. Then I can select the "genes" column and tell Excel to highlight the duplicates, and even sort those gene names based on the highlighting to put all of the multiple hits for the same genes at the top of the list. That leaves me with 57 results for about 25 genes, which is quite easy to look into further ... compared to the ~17,000 heterozygous results I started out with.
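The same sort-and-highlight steps could also be done outside Excel. Here is a rough Python sketch, assuming the results are exported as a CSV with columns named `rsid`, `gene`, `missense` and `zygosity`. Those column names and the sample rows are invented, since the real export may look different. Note that this sketch keeps only heterozygous rows, which is a stricter filter than the Excel steps above.

```python
import csv
import io
from collections import defaultdict

# Invented rows standing in for the exported spreadsheet; column names are assumptions.
SAMPLE_CSV = """rsid,gene,missense,zygosity
rs_a,GENE1,yes,het
rs_b,GENE1,yes,het
rs_c,GENE2,yes,het
rs_d,GENE3,no,het
rs_e,GENE3,yes,hom
rs_f,GENE3,yes,het
"""


def compound_het_candidates(rows):
    """Group heterozygous missense hits by gene and keep genes with two or more."""
    by_gene = defaultdict(list)
    for row in rows:
        if row["missense"] == "yes" and row["zygosity"] == "het":
            by_gene[row["gene"]].append(row["rsid"])
    return {gene: rsids for gene, rsids in by_gene.items() if len(rsids) >= 2}


rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
candidates = compound_het_candidates(rows)
for gene, rsids in sorted(candidates.items()):
    print(gene, rsids)
```

Genes that come out of this are only suspects: unphased data like this can't tell whether the two mutations sit on the same copy of the gene or on opposite copies.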

Here's an example of what I'm looking at currently:

Everything in the spreadsheet is automatically generated, except a couple comments which I've entered in the last column, and the highlighting. I haven't looked into most of these yet, since I'm doing it mostly to run tests at the moment, and will have better 1% and 10% databases soon which might make my efforts a bit redundant if I'll be doing it again shortly with the new database.

But a fun thing I found is that I have two heterozygous missense mutations on the GALT gene. The more common one, labeled here as GALT N138D (MAF 5%), reduces my galactose processing to about 75% of normal ... unless the even rarer GALT L42V is present, which makes my galactose processing even more efficient than normal when specifically combined with the first mutation.

It would have taken me months to find these mutations without the automatic labeling of genes and flagging of missense mutations.

roller · Jan 22, 2016

did I get this wrong (reading somewhere on the net) or are humans indeed 99% genetically identical?

Valentijn · Jan 22, 2016

roller said:
did I get this wrong (reading somewhere on the net) or are humans indeed 99% genetically identical?

Yes, 99.5% identical, on average.

roller · Jan 22, 2016

thanks

how likely is it that minor deviations in those 0.5% make us crawl through this world?
(if we can, at all)

rhetorical question, with a view to genetic testing.
