I finally sequenced my poop. I had wanted to do this for a long time: I have been watching the microbiome scene evolve since around 2011, and when the opportunity came up I got very excited to find out what lives in my gut, you know... "for science".
The method used was amplicon sequencing of the V3/V4 region of the 16S gene. If you don't know what that means, don't worry: I will give a short explanation in the next paragraph. But beware, the explanation contains gross oversimplifications to keep things short and easier to digest.
The 16S gene is a long string of letters (A, C, T, G) that exists in almost all prokaryotic organisms. It is commonly used to identify microbes because it is "highly conserved", which is another way of saying it doesn't change much over time. If we isolate some bacterium, let's say Yersinia pestis (the one that causes the plague), and sequence its entire genome, it will have somewhere between 500 and 7500 genes; one of those genes will be the 16S, and it will likely be the same for all Yersinia pestis around the globe.
This is what the 16S gene of Yersinia pestis looks like:
ATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGCAGCGGGAAGTAGTTTACTACTTTGCCGGCGAGCGGCG GACGGGTGAGTAATGTCTGGGGATCTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATGACCTCG CAAGAGCAAAGTGGGGGACCTTAGGGCCTCACGCCATCGGATGAACCCAGATGGGATTAGCTAGTAGGTGGGGTAATGGC TCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGG GAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTGTGAAGAAGGCCTTCGGGTT GTAAAGCACTTTCAGCGAGGAGGAAGGGGTTGAGTTTAATACGCTCAATCATTGACGTTACTCGCAGAAGAAGCACCGGC TAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCG GTTTGTTAAGTCAGATGTGAAATCCCCGCGCTTAACGTGGGAACTGCATTTGAAACTGGCAAGCTAGAGTCTTGTAGAGG GGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAA GACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCTGTAAACGATGTCGACT TGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAA AACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTA CTCTTGACATCCACAGAATTTGGCAGAGATGCTAAAGTGCCTTCGGGAACTGTGAGACAGGTGCTGCATGGCTGTCGTCA GCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCACGTAATGGTGGGAA CTCAAGGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGAGTAGGGCTAC ACACGTGCTACAATGGCAGATACAAAGTGAAGCGAACTCGCGAGAGCCAGCGGACCACATAAAGTCTGTCGTAGTCCGGA TTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTAGATCAGAATGCTACGGTGAATACGTTCCCGG GCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTAC CACTTTGTGATTCATGACTGG
After my poop was sampled, it was handled by a laboratory that extracted the DNA, amplified a small region of the 16S gene from all the bacteria living in it, and sequenced it on an Illumina MiSeq. This small region is called V3/V4 and is about 250bp.
My poop resulted in 123778 reads. I picked a random one so you can see what it looks like:
CACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGGCGAAAGCCTGACGGAGCAACGCCGCGTGAGTGAAGAAG GCCTTCGGATCGTAAAACTCTGTTATTAGGGAAGAACAAATGTGTAAGTAACTATGCACGTCTTGACGGTACCTAATCAG AAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAACACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGC GCGCGTAGGCGGTTTTTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAA
If this read had been found in the full Yersinia pestis gene above, I should probably pay a visit to my doctor. Luckily, it matches the 16S gene of Staphylococcus capitis, which is usually harmless.
The analysis pipeline that I chose (Shaman) gave me a taxon for 74% of my reads. For the remaining 26%, it either wasn't able to infer a taxonomy at any level or the reads didn't pass one of the quality control steps in the pipeline.
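To make the idea of "finding a read in a gene" a bit more concrete, here is a toy sketch in Python. It is not what Shaman or any real pipeline does (those rely on proper alignment against a reference database); it only counts how many short chunks (k-mers) of a read also appear in a reference. The function names and the choice of k=8 are my own assumptions, and the two fragments are short pieces copied from the sequences shown above.

```python
def kmers(seq, k=8):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(read, reference, k=8):
    """Fraction of the read's k-mers that also occur in the reference.

    1.0 means every chunk of the read exists in the reference,
    0.0 means they share nothing at this k. This is a crude similarity
    score, not a real alignment.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0  # read shorter than k, nothing to compare
    return len(read_kmers & kmers(reference, k)) / len(read_kmers)


# Short fragments copied from the sequences in this post (illustration only).
read_fragment = "ACTCCTACGGGAGGCAGCAGTAGGG"      # from my random read
pestis_fragment = "ACTCCTACGGGAGGCAGCAGTGGGG"    # from the Yersinia pestis 16S

score = shared_kmer_fraction(read_fragment, pestis_fragment)
print(f"shared 8-mers: {score:.0%}")
```

Even unrelated bacteria share long stretches of the 16S gene (that is what "highly conserved" means), so a partial overlap like this one is expected and says nothing by itself about which species a read belongs to.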
Let's take a look at the phylum data:

PHYLUM             READS
Firmicutes         50268
Bacteroidetes      40042
Proteobacteria      2172
Actinobacteria        62
Thaumarchaeota        51
Verrucomicrobia       21
Lentisphaerae         16
From this data we can get one insight. There is this thing called the F/B ratio; there is no real consensus on it, but if your number is bigger than 1, you are probably overweight.
The F/B ratio is the ratio between Firmicutes and Bacteroidetes, which are usually the two most abundant phyla in any poop you can get your hands on.
50268 (Firmicutes) / 40042 (Bacteroidetes) ≈ 1.26
Yes, I'm a little overweight.
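If you want to redo the arithmetic with your own numbers, a minimal Python sketch could look like this. The read counts are the ones from my phylum table; the dictionary and function names are just illustrative, not part of any pipeline.

```python
# Read counts per phylum, copied from the table above.
phylum_reads = {
    "Firmicutes": 50268,
    "Bacteroidetes": 40042,
    "Proteobacteria": 2172,
    "Actinobacteria": 62,
    "Thaumarchaeota": 51,
    "Verrucomicrobia": 21,
    "Lentisphaerae": 16,
}


def fb_ratio(counts):
    """Firmicutes reads divided by Bacteroidetes reads."""
    return counts["Firmicutes"] / counts["Bacteroidetes"]


total = sum(phylum_reads.values())
print(f"F/B ratio: {fb_ratio(phylum_reads):.2f}")

# Share of each phylum among the reads listed in the table.
for phylum, reads in phylum_reads.items():
    print(f"{phylum:<16} {reads / total:6.2%}")
```

Keep in mind that the ratio depends only on the two counts, so anything that shifts them (primers, pipeline, reference database) shifts the ratio too.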
What about the species?
Remember that we sequenced only a small (~250bp) region of the 16S gene? Unfortunately, this short amplicon is not enough to classify everything at the species level. Shaman gave me only 31119 reads distributed among 13 species.

SPECIES                           READS
Faecalibacterium prausnitzii      16600
Prevotella stercorea               7340
Bacteroides uniformis              3914
Bacteroides eggerthii              1584
Prevotella copri                   1086
Parabacteroides distasonis          234
Dorea formicigenerans                90
Lactobacillus salivarius             83
Bacteroides ovatus                   72
Bifidobacterium adolescentis         51
Eubacterium biforme                  35
Oxalobacter formigenes               19
Collinsella aerofaciens              11
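Raw read counts are hard to eyeball, so here is a small Python sketch that turns the species counts into relative abundances, meaning each species' share of the reads that were classified at species level. The counts come from the table above; the names `species_reads` and `relative_abundance` are only illustrative.

```python
# Read counts per species, copied from the table above.
species_reads = {
    "Faecalibacterium prausnitzii": 16600,
    "Prevotella stercorea": 7340,
    "Bacteroides uniformis": 3914,
    "Bacteroides eggerthii": 1584,
    "Prevotella copri": 1086,
    "Parabacteroides distasonis": 234,
    "Dorea formicigenerans": 90,
    "Lactobacillus salivarius": 83,
    "Bacteroides ovatus": 72,
    "Bifidobacterium adolescentis": 51,
    "Eubacterium biforme": 35,
    "Oxalobacter formigenes": 19,
    "Collinsella aerofaciens": 11,
}


def relative_abundance(counts):
    """Map each taxon to its fraction of the total reads in counts."""
    total = sum(counts.values())
    return {name: reads / total for name, reads in counts.items()}


abundance = relative_abundance(species_reads)

# Print species from most to least abundant.
for name, share in sorted(abundance.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<30} {share:7.2%}")
```

These percentages are relative to the 31119 species-level reads only, not to all 123778 reads, so they describe the classified slice of the sample rather than the whole gut.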
The one with the most reads is Faecalibacterium prausnitzii, which is nice, since this bacterium seems to play a role in fighting inflammation and is usually decreased in Crohn's disease.
Another fun bacterium that I'm happy to have is Oxalobacter formigenes. One function attributed to it is oxalate degradation, which helps prevent the formation of kidney stones. Sadly, oxalate-degrading activity cannot be detected in the gut of some individuals; one of the theories that explains this absence is the indiscriminate use of quinolones (antibiotics like cipro).
Indeed, my last kidney sonogram didn't show any sign of stones, but if there is a bacterium that prevents gallstones, I may be missing it...
There is a lot of literature about almost all of my bacteria, but taken alone my data is mostly irrelevant. Even comparing my results with other big studies would be difficult, since differences in primers, sequencing methods, pipelines and reference databases could introduce a lot of bias into my values.
But there is still room for more knowledge extraction. I will definitely get back to it in a future article, but only after I write my own analysis pipeline.