Hi there,
I am using MetaPhlAn 3 via YAMP (The YAMP workflow · alesssia/YAMP Wiki · GitHub) to analyse shotgun metagenomic data from human gut microbiome samples, sequenced on an Illumina platform (paired-end reads). I get output for bacteria, archaea and viruses, but no fungal signatures are detected in the dataset. The MetaPhlAn description says that it also profiles eukaryotes, and I have seen files for fungal taxa in the ChocoPhlAn database, so I would expect fungal taxa to be reported. Please let me know how we can get information on fungal signatures.
Earlier I also tried running MetaPhlAn 2 directly, and it did not detect any fungal signatures either. Any pointers on how to get this data would be appreciated.
Hi @Diptaraj
MetaPhlAn 3 does include (micro)eukaryotic species in the database. You can check the list of species in the latest database version here: http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2
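To check which eukaryotic species the database covers, something like the following Python sketch could be run on a locally downloaded copy of that file. It assumes that each line of the marker info file describes one marker and carries a taxonomy string with the usual `k__` / `s__` prefixes; the function name `eukaryote_marker_counts` is made up for this example, and I have not verified it against every database release.

```python
import bz2
import re
from collections import Counter

# Assumption: species labels look like "s__Genus_species", as in MetaPhlAn profiles.
SPECIES_PATTERN = re.compile(r"s__[\w.\-]+")


def eukaryote_marker_counts(marker_info_path):
    """Count marker lines per species for lines that mention k__Eukaryota."""
    counts = Counter()
    with bz2.open(marker_info_path, "rt") as handle:
        for line in handle:
            if "k__Eukaryota" not in line:
                continue  # skip bacterial, archaeal and viral markers
            match = SPECIES_PATTERN.search(line)
            if match:
                counts[match.group(0)] += 1
    return counts


# Example usage (the path is whatever you saved the download as):
# counts = eukaryote_marker_counts("mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2")
# for species, n_markers in sorted(counts.items()):
#     print(species, n_markers)
```

If the fungi you expect are missing from the resulting list, no parameter change will make MetaPhlAn report them with this database.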
If you are confident that your samples contain microeukaryotic species, you could try running MetaPhlAn with a lower --stat_q value to make it more sensitive. MetaPhlAn uses this parameter when it calculates the robust average coverage of a species: it defines how much of the marker coverage distribution is trimmed at both ends (more details in the following paper: Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3 | eLife). Its default value is 0.2, which in practice means that at least 20% of a species' markers have to be present in the reads for the species to be detected. Adding --stat_q 0.1 (or lower) to your MetaPhlAn 3 command might help in your case.
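To illustrate why a lower --stat_q is more sensitive, here is a toy Python sketch of a trimmed ("robust") average. It is a simplified illustration rather than MetaPhlAn's actual code: the function `robust_average` and the coverage values are invented for the example, and it ignores details such as marker lengths.

```python
def robust_average(coverages, stat_q=0.2):
    """Mean marker coverage after trimming a stat_q fraction from each end."""
    ordered = sorted(coverages)
    n = len(ordered)
    cut = int(stat_q * n)           # number of markers dropped at each end
    kept = ordered[cut:n - cut]     # middle part of the distribution
    if not kept:
        return 0.0
    return sum(kept) / len(kept)


# Toy species with 10 markers, of which only 2 are hit by reads.
toy_coverages = [0.0] * 8 + [3.0, 5.0]

print(robust_average(toy_coverages, stat_q=0.2))
print(robust_average(toy_coverages, stat_q=0.1))
```

With the default of 0.2, both covered markers fall into the trimmed upper tail, so the average is zero and the species would not be reported. With 0.1, one covered marker survives the trimming and the average becomes non-zero. The trade-off is that a lower value also makes false positives more likely.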
Hi @Diptaraj,
Indeed, eukaryotes are also included in the MetaPhlAn database. You can check the list of species included here: http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2
You could try running MetaPhlAn 3 with a lower --stat_q value to make it more sensitive. This parameter is used when MetaPhlAn calculates the robust average coverage of a species. Its default value is 0.2 (trimming 20% of the marker distribution at both ends), which in practice means that, by default, at least 20% of a species' markers need to be present in the sample for the species to be detected. Adding --stat_q 0.1 to your MetaPhlAn 3 command might help in your case.