database version: mpa_vJun23_CHOCOPhlAnSGB_202403
software version: MetaPhlAn version 4.1.1 (11 Mar 2024)
I am a beginner.
I ran MetaPhlAn and drew a stacked plot to see the species composition. Many of the top species are GGB and SGB (species-level genome bin) entries. These species do not have corresponding NCBI IDs. How should I deal with them in subsequent analyses?
Hello, I am also facing the same issue.
Some of these species are emerging as among the most abundant and prevalent, yet I have been unable to find any literature on them or predict their functional role. Is there any way we can get these genomes, so that one can at least explain the presence of these SGBs based on their metabolic potential?
Any suggestions are welcome.
Thank you
To my knowledge, SGB, GGB and FGB denote unnamed species, genera and families. They are not entirely unknown, since they are defined groups, but they cannot be assigned to a known and named species, genus or family, respectively.
I usually include them in the analysis because I hope they will be taxonomically named in the future.
As you correctly stated, these are unknown SGBs for which no isolate exists yet, only MAGs, so they cannot have a corresponding NCBI taxonomy. If you look at the full profile, you can see the higher taxonomic levels for these SGBs, which gives you at least some information on their taxonomy.
To get more information, you can assemble your metagenomic samples and assign the bins to SGBs using the PhyloPhlAn routine phylophlan_assign_sgbs.py (see tutorial). If you manage to reconstruct the genome you are interested in, it will be assigned to the corresponding uSGB and you can study it further.
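For reading the higher taxonomic levels out of the profile, here is a minimal Python sketch. It assumes the usual tab-separated MetaPhlAn 4 profile with a pipe-delimited clade_name in the first column and the relative abundance in the third, and it assumes that unnamed clades carry labels like GGB1234, FGB1234 or SGB1234. The function names and the sample rows (including the abundance values) are made up for illustration, so please check them against your own output.

```python
import re

# Assumption: placeholder labels look like GGB9614, FGB1234, SGB15049
# (possibly with one leading letter, e.g. OFGB/CFGB, at higher ranks).
PLACEHOLDER = re.compile(r"^(?:[A-Z]?FGB|GGB|SGB)\d+")

RANKS = {"k": "kingdom", "p": "phylum", "c": "class", "o": "order",
         "f": "family", "g": "genus", "s": "species", "t": "SGB"}


def deepest_named_rank(clade_name):
    """Return (rank, name) of the lowest rank in the lineage with a real name."""
    best = ("none", "")
    for part in clade_name.split("|"):
        prefix, _, name = part.partition("__")
        if name and not PLACEHOLDER.match(name):
            best = (RANKS.get(prefix, prefix), name)
    return best


def unnamed_species(profile_lines):
    """Yield (label, rank, name, abundance) for species rows without a name."""
    for line in profile_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        clade = fields[0]
        last = clade.split("|")[-1]
        if not last.startswith("s__"):
            continue  # keep species-level rows only
        label = last[3:]
        if PLACEHOLDER.match(label):
            rank, name = deepest_named_rank(clade)
            yield label, rank, name, float(fields[2])


# Hypothetical rows, for illustration only (not real results).
SAMPLE = [
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species",
    "k__Bacteria|p__Firmicutes|c__Clostridia|o__Eubacteriales|"
    "f__Oscillospiraceae|g__GGB9614|s__GGB9614_SGB15049"
    "\t2|1239|186801|186802|216572||\t3.2\t",
    "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|"
    "f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_uniformis"
    "\t2|976|200643|171549|815|816|820\t10.5\t",
]

rows = list(unnamed_species(SAMPLE))
for label, rank, name, abundance in rows:
    print(f"{label}: lowest named rank is {rank} {name} ({abundance}%)")
```

To run it on a real profile, pass an open file handle instead of SAMPLE, for example `unnamed_species(open("profile.txt"))`, where profile.txt is a placeholder for your own file name. The lowest named rank can then be used as a label for these SGBs in plots or tables.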
Thank you for your replies.
I did assemble genomes but did not recover those SGBs, even though some of them are abundant in the read-based analysis (perhaps the depth wasn't enough to assemble these genomes). So I was curious about the function of these SGBs. Is there any way I can get the SGB genomes from the PhyloPhlAn or ChocoPhlAn databases?