Peer-to-peer metagenomics investigation

undiagnosed

Senior Member
Messages
246
Location
United States
There are many people posting on multiple internet forums with persistent symptoms following an exposure to various body fluids. This thread is the new home of a conversation started here. Available clinical tests have not found an explanation. Those affected are contributing metagenomic sequencing data to facilitate a peer-to-peer investigation to attempt to find possible etiologies. If you would like to participate by contributing data, please request a kit for the appropriate sample type from Aperiomics.

A wiki page containing a directory of contributed data files, bioinformatic tool usage instructions, and other information is available here. If you would like to contribute to the wiki you can either create a GitHub account and request edit permissions or clone the repository and make a pull request.

I am working on my own case but would also like to have more time to help out others. If you would like to enable me to spend more time helping you with your own investigation, giving advice on bioinformatics tools, etc., please support me on my Patreon page. I am currently doing a literature review on the latest metagenomic methods and experimenting with various tools to find the best approaches for our needs.
 

undiagnosed

I created a tool for estimating how many total sequencing reads will be needed to get at least the specified number of on-target reads at the specified probability level. The image below shows a screenshot.
4gYxKiT.png
The tool is available for use here. You can also download the source code and run the app locally from here.

The number of on-target reads is modeled as a Poisson random variable whose rate is the product of the abundance and the total number of reads. Please see Theorem 2 from Wendl et al. for more information. Depending on the expected pathogen abundance, you can determine whether the number of reads generated by Aperiomics' standard service will be sufficient or whether a higher sampling depth and/or custom sample enrichment to increase pathogen abundance is required.
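For anyone who wants to check the numbers by hand, here is a minimal Python sketch of the Poisson model described above. It is not the source code of the tool; the function names `prob_at_least` and `required_total_reads` are made up for illustration, and it assumes abundance is expressed as the fraction of all reads that come from the target.

```python
import math


def prob_at_least(k, rate):
    """Return P(X >= k) for X ~ Poisson(rate)."""
    if k <= 0:
        return 1.0
    if rate <= 0:
        return 0.0
    log_rate = math.log(rate)
    # Sum P(X = i) for i < k in log space so large rates do not overflow.
    cdf = math.fsum(
        math.exp(i * log_rate - rate - math.lgamma(i + 1)) for i in range(k)
    )
    return min(1.0, max(0.0, 1.0 - cdf))


def required_total_reads(min_on_target, abundance, probability):
    """Smallest total read count N such that P(on-target >= min_on_target)
    is at least `probability`, with on-target ~ Poisson(abundance * N).

    Assumption: `abundance` is the fraction of reads from the target (0-1].
    """
    if min_on_target < 1:
        raise ValueError("min_on_target must be at least 1")
    if not 0.0 < abundance <= 1.0:
        raise ValueError("abundance must be in (0, 1]")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be in (0, 1)")

    def enough(n):
        return prob_at_least(min_on_target, abundance * n) >= probability

    # Double the read count until it is sufficient, then binary search.
    high = 1
    while not enough(high):
        high *= 2
    low = high // 2  # known to be insufficient (or zero reads)
    while high - low > 1:
        mid = (low + high) // 2
        if enough(mid):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Hypothetical inputs: 10 on-target reads, abundance 1e-5, 99% probability.
    print(required_total_reads(10, 1e-5, 0.99))
```

For the special case of at least one on-target read, the answer reduces to N >= -ln(1 - p) / abundance, which is a handy sanity check on the search.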
 

undiagnosed

@Omar88, what data? I have been investigating various virus detection techniques. I am also working to ensure that the sequencing sensitivity will be adequate for detecting low viral load RNA viruses on upcoming plasma runs. I only have zzz's data besides mine at this time.
 

patient.journey

Senior Member
Messages
443
I found a lot of viruses in your blood samples when I used one of the options you gave us to check the data. As I told you, there were 3 retroviruses! Two are cat retroviruses and the other is an avian retrovirus, plus 8 kinds of mycoplasma and a lot of other stuff, so I was wondering whether that information from Taxonomer is correct.

Is it a false positive?
 

undiagnosed

I haven't looked at the items you mentioned closely, but since most of them infect non-human hosts, I am skeptical that there is anything there. I have looked at lentiviridae and herpesviridae classifications from Taxonomer that all ended up blasting to human RNA/DNA. So far I haven't found the Taxonomer results to be very accurate so I have been investigating other methods.
 

patient.journey

Could it be that there was something close to those viruses? I'm not sure what similarity cutoff Taxonomer uses, but could we just exclude them?

If you are not using Taxonomer, can you give us another option so we can help?
 

undiagnosed

It's possible that there could be something related. I've done some analysis with blastn and blastx on some limited reference databases that I constructed. If you want to check the Taxonomer classifications, you'd need to download the raw Taxonomer output file. You would then need to extract the reads you are interested in based on the NCBI taxonomy ID for viruses (for bacteria it's a different ID). Then you would blastn each sequence to see what the top hits are. I have a Python program that will pull out sequences of interest based on an input file of taxonomy IDs. I can upload it to GitHub if you want to use it.
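To give a rough idea of the extraction step, here is a small Python sketch. It is not the program mentioned above, and the layout of the raw Taxonomer output is an assumption: the sketch treats it as tab-separated text with the read ID, the assigned taxonomy ID, and the read sequence in columns you specify. The column defaults and the function names are hypothetical, so check them against your actual file. It also matches IDs exactly, so the ID file needs to list every taxon of interest, not just a parent such as 10239 (Viruses).

```python
def load_taxids(lines):
    """Read one taxonomy ID per line, skipping blanks and '#' comments."""
    taxids = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            taxids.add(entry)
    return taxids


def extract_reads(rows, taxids, id_col=0, taxid_col=1, seq_col=2):
    """Yield (read_id, sequence) for rows whose taxonomy ID is in `taxids`.

    Assumption: `rows` are tab-separated lines; the column positions are
    guesses and must be adjusted to the real output format.
    """
    needed = max(id_col, taxid_col, seq_col)
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= needed:
            continue  # skip headers or malformed lines
        if fields[taxid_col] in taxids:
            yield fields[id_col], fields[seq_col]


def to_fasta(records):
    """Format (read_id, sequence) pairs as FASTA text for use with blastn."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in records)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own paths.
    with open("taxids.txt") as id_file:
        wanted = load_taxids(id_file)
    with open("taxonomer_raw_output.tsv") as raw, open("reads.fasta", "w") as out:
        out.write(to_fasta(extract_reads(raw, wanted)))
```

The resulting FASTA file can then be submitted to blastn to see what the top hits are.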
 

Cheesus

Senior Member
Messages
1,292
Location
UK
I have had some sequencing done by Aperiomics. You're welcome to my data if you want it, however I do not have any of the raw files. I only have the final report.
 

undiagnosed

I recently had RNA and DNA sequenced from plasma samples. Nucleotide alignment searching hasn't resulted in an obvious answer. I am working on more advanced pathogen searches to see if I can find anything. I have done a little work on biomarkers, including characterizing the plasma microbiome and investigating differential gene expression for the RNA data.

The Aperiomics plasma DNA/RNA report showed a Cucumber green mottle mosaic virus and a relatively high abundance of the bacterium Pseudomonas aeruginosa. According to Aperiomics, Pseudomonas aeruginosa is seen fairly commonly in plasma samples, but not in every sample, and generally at lower abundance. @WayTooSick also recently tested with a higher than usual abundance of this bacterium on a plasma sample. Pseudomonas aeruginosa can live on the skin, so there is a possibility it entered the sample via the venipuncture and is not actually present in the plasma.

Pseudomonas aeruginosa is an opportunistic pathogen that can cause problems for people with suppressed immune systems and for cystic fibrosis patients. I highly doubt it is causative itself, but increased abundance may be indicative of underlying problems. There is a paper that shows differences in the plasma microbiome of HIV patients vs healthy controls. Pseudomonadales made up 72.31% of the HIV microbiome and only 3.66% of the healthy control microbiome. Pseudomonas aeruginosa is a member of Pseudomonadales. The image below shows the bacteria taxonomy generated by Taxonomer for the plasma DNA sample.

z1lghDV.png

From Taxonomer, the most abundant bacteria at the order level are as follows:

Pseudomonadales 12.9%
Propionibacteriales 5.6%
Burkholderiales 5%
Corynebacteriales 4.2%
Micrococcales 3.7%
Lactobacillales 3.6%
Rhizobiales 2.6%
Streptomycetales 1.56%
Enterobacteriales 1.4%
Bacillales 1.2%
Clostridiales 1%

This characterization is distinct from both groups in the paper, but it is interesting that Pseudomonadales is the most abundant order. I also performed de novo assembly with SPAdes and got a number of smallish (relative to the genome size) contigs for Pseudomonas aeruginosa strain S86968. I then mapped reads against that reference; they evenly covered 26.9% of the genome at a mean depth of 0.5X, so confidence that it's there is high. The image below shows the coverage:

dR5ZP3D.png
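
For anyone wanting to reproduce this kind of coverage summary on their own data, here is a minimal Python sketch. It assumes the per-position depths come from `samtools depth -a`, which prints the reference name, position, and depth as three whitespace-separated columns and includes zero-depth positions. The function names are made up for illustration, and this is not necessarily how the figures above were produced.

```python
import math


def coverage_stats(depth_lines):
    """Return (breadth, mean_depth) from `samtools depth -a` style lines.

    breadth    = fraction of reference positions with depth >= 1
    mean_depth = total depth divided by the number of positions
    """
    total_positions = 0
    covered_positions = 0
    depth_sum = 0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        depth = int(fields[2])
        total_positions += 1
        depth_sum += depth
        if depth > 0:
            covered_positions += 1
    if total_positions == 0:
        raise ValueError("no depth records found")
    return covered_positions / total_positions, depth_sum / total_positions


def expected_breadth(mean_depth):
    """Breadth expected if reads landed uniformly at random (Poisson model)."""
    return 1.0 - math.exp(-mean_depth)


if __name__ == "__main__":
    # Hypothetical file name; generate it with `samtools depth -a`.
    with open("depth.txt") as handle:
        breadth, mean_depth = coverage_stats(handle)
    print(f"breadth: {breadth:.1%}, mean depth: {mean_depth:.2f}X")
    print(f"breadth expected for uniform sampling: {expected_breadth(mean_depth):.1%}")
```

Under the idealized uniform model, and assuming the mean depth is taken over the whole reference, the expected breadth is 1 - exp(-mean depth). Comparing the observed breadth with that figure is one rough way to judge how even the coverage is.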

I thought the Cucumber green mottle mosaic virus was odd, but the paper I referenced discusses how food viruses can enter the bloodstream through the digestive tract, specifically for the Pepper mild mottle virus. So this could be further evidence of gut microbe translocation, but it needs further investigation. I mapped reads against a Cucumber green mottle mosaic virus reference genome and covered 6.2% of it, as shown below:

ZI11bjm.png

The paper also showed HERV-K sequences that were found in the plasma for the HIV+ group and not the healthy controls. This may be due to the sampling depth used in that paper. I also found HERV-K sequences in my plasma DNA (99.9% coverage, 79X mean depth), coverage of HERV-K113 shown below:

3vBRtRP.png

To explore this further, I could attempt to quantify the amount per haploid genome and compare against controls, as was done in this paper. I also had a number of BLASTn hits for bacterial phages, which were shown to be prevalent in the HIV+ group but not the control group. Again, sampling depth could be a factor when comparing the data. I could do more investigation to confirm that the BLASTn hits are legitimate. I am also looking into doing differential gene expression on the host mRNA against whole-blood healthy controls from the GTEx study to see if I can find any up- or down-regulated genes.
 

patient.journey

I'm driving, but I couldn't not comment on your great data!

Firstly, thanks for your effort, brother.

Secondly, gene sequencing of Chinese patients showed high levels of HERV-K sequences in all of them. A Chinese patient told me that a month ago, and they are still looking into the data.

** I'm going through immune testing to see what abnormal biomarkers I can find, since I couldn't do the gene sequencing here because of the rules on kits and samples.

** I hope that we can find the pathogen, or at least immune and secondary-infection biomarkers, so we can fight in a better way without the look the medical system is giving us now.

Regards
 
Messages
9
Hello, I just joined the forum. @undiagnosed, I just wanted to know how you are and how your CD4 count is. Have you discovered anything new? I wrote about my problems; I had some CD4 issues too, and I really think some kind of HIV-like virus has to be involved. Thanks!
 

undiagnosed

Hi @mirta, sorry for the delay. I haven't had any labs or been to a doctor in years so I don't have any updates on biomarkers. I still have all the same symptoms and just try to get by the best I can.
 

Jammy88

Senior Member
Messages
163
Location
Italy
Hello,

I would be interested in testing my plasma as well. If you find a new company that can provide this service, please let me know.

Thanks, best
 

undiagnosed

@Jammy88 I found a new place to do sequencing and I'm currently figuring out the logistics for getting blood drawn and plasma separated. Once I get everything figured out and get some data back, and if things seem good, I'll post an update with more information. Also, it looks like the commercial Taxonomer isn't around anymore, as the site just redirects to Illumina. I've been checking out CZID, another web-based metagenomics analysis tool, and I'll run new data through it when I have it.
 