Hi there,
I have finished running MetaPhlAn4 on my data and started analyzing the results.
I noticed two main issues which I believe require your attention:
- For some of the samples, the relative abundance at the species level sums to 100%, yet the sum is lower at the genus level. From what I understand, at least in my samples, the problem stems from the following clades:
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_ventriosum
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_sp_AF34_35BH
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_sp_AF22_9
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_SGB6276
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_SGB4329
k__Bacteria|p__Firmicutes|c__CFGB9301|o__OFGB9301|f__FGB9301|g__GGB53985|s__GGB53985_SGB6367
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Peptostreptococcaceae|g__Romboutsia|s__Romboutsia_timonensis
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Peptostreptococcaceae|g__Romboutsia|s__Romboutsia_hominis
k__Bacteria|p__Proteobacteria|c__CFGB3069|o__OFGB3069|f__FGB3069|g__GGB9770|s__GGB9770_SGB57575
k__Bacteria|p__Actinobacteria|c__Actinomycetia|o__Micrococcales|f__Micrococcaceae|g__Arthrobacter|s__Arthrobacter_sp_HMSC06H05
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_ramulus
k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|f__Eubacteriaceae|g__Eubacterium|s__Eubacterium_sp_AF22_8LB
k__Bacteria|p__Firmicutes|c__Erysipelotrichia|o__Erysipelotrichales|f__Erysipelotrichaceae|g__Catenibacterium|s__Candidatus_Catenibacterium_tridentinum
k__Bacteria|p__Firmicutes|c__CFGB1354|o__OFGB1354|f__FGB1354|g__GGB3304|s__GGB3304_SGB4367
k__Bacteria|p__Firmicutes|c__CFGB75916|o__OFGB75916|f__FGB75916|g__GGB2993|s__GGB2993_SGB3978
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Hyphomicrobiales|f__Bradyrhizobiaceae|g__Bradyrhizobium|s__Bradyrhizobium_viridifuturi
- Is it possible that there are some errors in the clade annotations? For example, the genus “g__GGB79996” appears in the following entries:
k__Bacteria|p__Actinobacteria|c__CFGB10299|o__OFGB10299|f__FGB10299|g__GGB79996|s__GGB79996_SGB14375
k__Bacteria|p__Actinobacteria|c__CFGB10299|o__OFGB10299|f__FGB10299|g__GGB79996|s__GGB79996_SGB14375|t__SGB14375
k__Bacteria|p__Firmicutes|c__CFGB10299|o__OFGB10299|f__FGB10299|g__GGB79996
A similar issue appears with the genus “g__GGB1249” which appears in:
k__Bacteria|p__Bacteroidetes|c__CFGB76191|o__OFGB76191|f__FGB76191|g__GGB1249|s__GGB1249_SGB1670
k__Bacteria|p__Bacteroidetes|c__CFGB76191|o__OFGB76191|f__FGB76191|g__GGB1249|s__GGB1249_SGB1670|t__SGB1670
k__Bacteria|p__Firmicutes|c__CFGB76191|o__OFGB76191|f__FGB76191|g__GGB1249
It appears that these entries all share a common genus although they are annotated as belonging to different phyla (the class, order, and family all match). Could this be an error in the database annotation?
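Both observations can be checked on a profile with a short script. The following is only a minimal sketch, not part of MetaPhlAn: it assumes a single-sample, tab-separated profile with the clade name in the first column and the relative abundance in the third column (adjust `abundance_col` otherwise), and the function names are hypothetical.

```python
from collections import defaultdict


def read_profile(lines, abundance_col=2):
    """Yield (clade_name, relative_abundance) pairs from profile lines.

    Assumption: tab-separated columns, '#' header lines, abundance in
    column index `abundance_col` (0-based).
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield fields[0], float(fields[abundance_col])


def rank_totals(profile):
    """Sum relative abundance per rank prefix ('k', 'p', ..., 'g', 's', 't')."""
    totals = defaultdict(float)
    for clade, abundance in profile:
        last_level = clade.split("|")[-1]
        if "__" not in last_level:
            continue  # skip rows without a rank prefix (e.g. unclassified)
        totals[last_level.split("__")[0]] += abundance
    return dict(totals)


def conflicting_lineages(profile, rank="g"):
    """Return names at `rank` that occur under more than one parent lineage."""
    parents = defaultdict(set)
    for clade, _ in profile:
        levels = clade.split("|")
        for i, level in enumerate(levels):
            if level.startswith(rank + "__"):
                parents[level].add("|".join(levels[:i]))
    return {name: sorted(p) for name, p in parents.items() if len(p) > 1}
```

With the profile loaded once via `profile = list(read_profile(open(path)))`, where `path` is the output file, `rank_totals(profile)` shows whether the genus-level total falls below the species-level total, and `conflicting_lineages(profile)` lists genera such as “g__GGB79996” that are reported under more than one phylum.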
Thanks in advance,
Nadav