Big Data App to Explore Genomes for Clinical Relevance, Rare Variants, Drug Response, etc (Free)

Moof

Senior Member
Messages
778
Location
UK
Thank you, @kday, just done mine again.

Brain fried today, so can't completely follow what you said about the BRCA thing...so here's what came out the second time, just for your info.

It's the same result as the first one; however, on the second report, something new relating to another condition was flagged. Nothing to worry about, but it's obviously the result of your work converting some of the 23andMe data to proper rsids.

Screenshot 2019-05-17 at 13.18.53.png
 

kday

Senior Member
Messages
369
Looks like it's a bad variant.

If that's indeed a new screenshot, it looks like it's still mixing up variants at the same position somehow. Not sure why. 😔

Nevertheless, this variant is unreliable. The real rsid is rs397507333, which is prone to false results.

https://www.snpedia.com/index.php/Rs397507333

If you can double check by clicking on the OpenSNP link, it should take you to a variant called i5009143 I believe.
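
For anyone curious what "converting 23andMe data to proper rsids" could look like, here is a minimal sketch. It is not the app's actual code. The table name `INTERNAL_TO_RSID` and both function names are hypothetical, and the only mapping shown is the one pair mentioned in this thread.

```python
# Hypothetical lookup table from 23andMe internal IDs to dbSNP rsids.
# Only this single pair comes from the thread; a real table would be far larger.
INTERNAL_TO_RSID = {"i5009143": "rs397507333"}


def normalize_variant_id(variant_id):
    """Return the rsid for a known internal ID, otherwise the ID unchanged."""
    return INTERNAL_TO_RSID.get(variant_id, variant_id)


def normalize_rows(rows):
    """Rewrite the ID column of (id, chromosome, position, genotype) rows."""
    return [
        (normalize_variant_id(variant_id), chromosome, position, genotype)
        for variant_id, chromosome, position, genotype in rows
    ]
```

IDs that are already rsids, or that have no known mapping, pass through untouched, so nothing is dropped silently.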
 

Moof


Yes, that's correct, it does. Sorry my eccentric SNPs are mucking things up! :rofl:
 

kday

@Rufous McKinney

Give me some time and I'll have YouTube tutorial videos targeted at people of any skill level. I think 5 or 10 minutes of basics can get most people going. After that, I will offer longer videos that are more in-depth.

I don't think I'm going to build a reference document or manual. It's definitely not my learning style, and in my opinion, it's hard to write a manual that explains things better than a video. But if enough people want a manual, I suppose I could write one. If I did, it would cover best practices for genome interpretation and how to use the app's features and external resources.
 

Rufous McKinney

Senior Member
Messages
13,389
@kday

So there is no way to print out this report? Or save the info? (Otherwise it seems to require web access.)

I understand if that's not possible, you're just testing....RMcK
 

kday

@Rufous McKinney

In a browser, you can select File -> Save Page As (or however it works in your browser) to save the webpage. It will work offline.

Printing and CSV files are features that I will be adding. Saving a formatted PDF document may be an option in the future as well, I'm just not there yet.
 

kday

The service seems stable and things seem to be working as expected. In the next several days (could take longer, don't hold your breath), I'm hoping to add Whole Genome Sequencing (WGS/WES) VCF support for user testing. I already have it working on a development server; I just need to add some code so the server can detect that an upload is a VCF file and knows to process it differently.
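
As an illustration of that detection step, here is a minimal sketch, not the server's actual code. It relies on two file-format facts: gzip files start with the bytes `1f 8b`, and a VCF file's first line is `##fileformat=VCF...`. The function names and the `"array"` label for 23andMe/AncestryDNA-style text files are assumptions made for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file


def open_maybe_gzip(path):
    """Open a file as text, transparently handling gzip compression."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "rt", encoding="utf-8", errors="replace")


def detect_format(path):
    """Return "vcf" if the first line is a VCF header, otherwise "array"."""
    with open_maybe_gzip(path) as handle:
        first_line = handle.readline()
    if first_line.startswith("##fileformat=VCF"):
        return "vcf"
    return "array"  # hypothetical label for genotyping-chip text exports
```

Checking the content rather than the file extension means a renamed or compressed upload is still routed correctly.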

However, VCF files are very data heavy, so I first want to test WGS files from Dante Labs, because Dante Labs VCFs are quite a bit smaller than those from other companies. I expect these files to process in about 2 minutes instead of less than 1 minute. Larger files from other companies may take a little longer than 2 minutes. I'm trying to find efficient ways to keep the processing time as close to 2 minutes as possible.

Once that goes well, I will probably be adding support for companies like Veritas Genetics and Nebula Genomics. But this takes a redesign of how the upload system works (since these are much larger files), which is more complicated than it sounds.

Does anyone here have Dante Labs files?
 

Rufous McKinney

@kday

Thank you! I exported as a PDF, which allowed me to save each category as one file. It's not interactive, just a PDF, but that's nice. Ran my spouse through there.

I happen to be married to: 58% Swedish 37% Norwegian- wow the Homozygous here is impressive. (or NOT depending)
 

Rufous McKinney


Saving problems continue: it appears I did not successfully save my results as a PDF. It keeps saving only the first two pages.

But it saved my husband's....so I've now generated about 30 versions and am: KERFLUMOXXED.

But I understand that's not really your problem...I am actually famous for: things that work on computers don't work when I touch them (and with witnesses).
 

kday

Weird. You can try a different browser. Chrome, Safari, Firefox, Edge, etc. I'll hopefully make print work right in the near future.
 

kday

If anybody has whole genome sequencing from Veritas, Dante, etc., support for Whole Genome Sequencing (WGS/WES) files has been finished. I will make it available for testing some time tomorrow. I'll let everyone know when I roll out the update.

However, it will not work with low pass sequencing (such as that from Nebula, Gencove, etc). It will work with Nebula clinical grade sequencing when it's released. And it will work with almost any other sequencing that's 20x coverage or greater.

For the first roll out, the service will only work if the files use the GRCh37/hg19 assembly. Files formatted for GRCh38/hg38 will not work until after the next update.

This may sound like technical jargon, but most of the clinical sequencing right now is in GRCh37/hg19 format. But there will soon be lots of sequencing using the updated hg38 assembly.

The time it takes to process WGS/WES files depends on the size and provider of the file. A Dante Labs file processes in about 1 minute and 40 seconds, and a Veritas Genetics file in around 3 minutes.

This is actually considered very fast, but I plan to make it process even faster in future releases.
 

kday

The new version has been rolled out that supports Whole Genome Sequencing (WGS/WES) .vcf.gz files.

Current limitations are that it only works with files that use the GRCh37/hg19 assembly (which is the vast majority of data right now). GRCh38/hg38 will be supported in the near future.
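
If you want to check which assembly your own file uses before uploading, here is a rough sketch. It is one possible approach, not how the app does it. It assumes the VCF header contains `##contig` lines with a `length` field, which many but not all files include, and it compares the chromosome 1 length against the known values for each assembly. The function name is made up for the example.

```python
import re

# Length of chromosome 1 in base pairs for each assembly.
CHR1_LENGTHS = {
    249250621: "GRCh37/hg19",
    248956422: "GRCh38/hg38",
}


def guess_assembly(header_lines):
    """Guess the assembly from the chromosome 1 ##contig header line."""
    for line in header_lines:
        if not line.startswith("##contig="):
            continue
        # Match ID=1 or ID=chr1, but not ID=10, ID=11, and so on.
        if re.search(r"ID=(?:chr)?1(?=[,>])", line):
            match = re.search(r"length=(\d+)", line)
            if match:
                return CHR1_LENGTHS.get(int(match.group(1)), "unknown")
    return "unknown"
```

A result of `"unknown"` only means the header did not carry enough information; it says nothing about whether the file is usable.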

Also, nothing is filtered yet. This means some erroneous conditions may come up in the Genetic Conditions category, such as ASPA/Canavan Disease and F5/Factor V Leiden (the latter may or may not be erroneous; I believe it is, but I have to verify). There may also be some erroneous BRCA or Lynch Syndrome SNPs in some genomes. This isn't a problem with the data. These are errors in the ClinVar database that need to be filtered out.

AncestryDNA and 23andMe files don't have this issue, as these errors have already been filtered out. There will be more filtering for WGS/WES in the next update so that this incorrect information doesn't display.

If you upload a WGS/WES file containing multiple samples (multiple people), the last sample in the file will be processed. The app does not currently let you select which sample to process.
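
To show what "the last sample in the file" means in practice, here is a small sketch that follows the VCF column layout: nine fixed columns ending in `FORMAT`, then one column per sample. The function names are made up for the example, and this is not the app's actual code.

```python
FIXED_COLUMNS = 9  # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT


def sample_names(header_line):
    """Return the sample names from the #CHROM header line."""
    fields = header_line.rstrip("\n").split("\t")
    return fields[FIXED_COLUMNS:]


def last_sample_genotype(header_line, record_line):
    """Return (sample name, GT value) for the last sample of one record."""
    names = sample_names(header_line)
    if not names:
        raise ValueError("VCF header lists no samples")
    fields = record_line.rstrip("\n").split("\t")
    format_keys = fields[FIXED_COLUMNS - 1].split(":")
    last_index = FIXED_COLUMNS + len(names) - 1
    values = fields[last_index].split(":")
    genotype = dict(zip(format_keys, values)).get("GT")
    return names[-1], genotype
```

Running `sample_names` on your own header line tells you which person's data would be processed from a multi-sample file.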

This is not compatible with low-pass sequencing. If you try to upload a low-pass file you will get an error. 20x sequencing or better is required.
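
How the app decides that a file is low-pass is not described here. As a rough illustration only, one way to estimate depth is to average the per-sample `DP` field over the variant records. That is just a proxy for genome-wide coverage, and it assumes the file records `DP` in its `FORMAT` column. The function names and the cut-off constant are assumptions; the value 20 mirrors the 20x requirement above.

```python
MIN_DEPTH = 20  # mirrors the "20x or better" requirement


def mean_depth(record_lines, sample_index=-1):
    """Average the FORMAT/DP value of one sample across VCF records."""
    total = 0
    count = 0
    for line in record_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no per-sample columns on this line
        keys = fields[8].split(":")
        values = fields[9:][sample_index].split(":")
        depth = dict(zip(keys, values)).get("DP", ".")
        if depth.isdigit():
            total += int(depth)
            count += 1
    return total / count if count else None


def looks_low_pass(record_lines):
    """Return True if the mean recorded depth is below MIN_DEPTH."""
    depth = mean_depth(record_lines)
    return depth is not None and depth < MIN_DEPTH
```

If no record carries a usable `DP` value, `mean_depth` returns `None` and the check makes no claim either way.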

Uploading large .vcf.gz files can take a long time on a slow internet connection, and the file can potentially be corrupted on upload. I initially thought that files might time out after 10 minutes of uploading, but I did some testing, and it appears that my assumption may not be true. Please let me know if your upload quits after 10 minutes when uploading a large file, as I need to know whether this can happen. I plan to install a more fault-tolerant uploader in a future release.
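
If you suspect a file got truncated or corrupted somewhere along the way, you can check a local copy by decompressing it from start to finish. This is a minimal sketch using only Python's standard library, and the function name is made up for the example. It is not the planned uploader. It relies on the gzip reader raising an error when the stream ends early or fails its checksum.

```python
import gzip
import os
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the gzip file decompresses cleanly to the end."""
    if os.path.getsize(path) == 0:
        return False  # an empty file is not a usable upload
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data; errors surface while reading.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

This reads the whole file, so expect it to take a while on a multi-gigabyte .vcf.gz.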

If you have incorrect variants in the "Genetic conditions" category, let me know what they are (screenshots are good) so I can create a log on exactly what to filter out.

Use the same URL as before to upload 23andMe, Ancestry, and WGS/WES files.

https://hgpat.tinybox.io
 

wigglethemouse

Senior Member
Messages
776
@kday Klimas et al released the data for their 23andMe genetic study today. The full text of this study is now available here:
https://www.frontiersin.org/articles/10.3389/fped.2019.00206/full

@nandixon has a post here questioning the data
https://forums.phoenixrising.me/threads/the-ido-metabolic-trap-guy.62727/post-2206434

I have some questions too on the s4me thread (my personal v5 23andMe file has only 190 of the top 525 CADD SNPs listed in table 2, and BCAM, one of their top highlighted genes, has the wrong frequency)
https://www.s4me.info/threads/genet...erez-nathanson-klimas-et-al.9415/#post-170872

Would it be possible to somehow run supplementary table 2 through your tools to check the frequency data and miscalls on the 23andMe data, and highlight errors that we could then pass on to interested parties? The supplementary data for tables 1 and 2 is here
https://www.frontiersin.org/articles/10.3389/fped.2019.00206/full#supplementary-material

Many thanks,
Wiggle
 

nandixon

Senior Member
Messages
1,092