The bioBakery help forum

Confusing Instructions About Strain Analysis Using MetaPhlAn

I have noticed that some reference genomes have names like Fusobacterium nucleatum subsp. vincentii and Fusobacterium nucleatum subsp. animalis. How can I analyse them?

MetaPhlAn’s --help output includes the following paragraph:

  • Finally, to obtain all markers present for a specific clade and all its subclades, the
    -t clade_specific_strain_tracker should be used. For example, the following command
    is reporting the presence/absence of the markers for the B. fragilis species and its strains
    the optional argument --min_ab specifies the minimum clade abundance for reporting the markers

$ metaphlan -t clade_specific_strain_tracker --clade s__Bacteroides_fragilis metagenome_outfmt.bz2 --input_type bowtie2out -o marker_abundance_table.txt

However, further down in the same help text, clade_specific_strain_tracker is missing from the list of valid values for -t. Why is it missing?

-t ANALYSIS TYPE Type of analysis to perform:
* rel_ab: profiling a metagenomes in terms of relative abundances
* rel_ab_w_read_stats: profiling a metagenomes in terms of relative abundances and estimate the number of reads coming from each clade.
* reads_map: mapping from reads to clades (only reads hitting a marker)
* clade_profiles: normalized marker counts for clades with at least a non-null marker
* marker_ab_table: normalized marker counts (only when > 0.0 and normalized by metagenome size if --nreads is specified)
* marker_counts: non-normalized marker counts [use with extreme caution]
* marker_pres_table: list of markers present in the sample (threshold at 1.0 if not differently specified with --pres_th
[default ‘rel_ab’]

Nonetheless, the analysis runs, but the output contains IDs I do not recognise. How do I relate them back to familiar names such as subsp. animalis? Why is the first column headed SampleID when it contains marker IDs? What does the 1 in the second column mean? Is it a Boolean value meaning True (marker present)?

#mpa_v30_CHOCOPhlAn_201901
#metaphlan -t clade_specific_strain_tracker --clade s__Fusobacterium_nucleatum OSCC_1-Pintermediate.bz2 --input_type bowtie2out --nproc 8 --bowtie2db databases/bacteriaMarkers/ --output_file OSCC_1-PstrainsMetagenome.txt
#SampleID       Metaphlan_Analysis
851__Q7P5X0__RN95_03310 1
851__Q8R657__RO08_04045 1
851__R9R952__CI111_08490        1
851__Q7P4W1__cmk2       1
851__C7XQ46__H848_00740 1
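
For reference, here is a minimal Python sketch of how I am currently reading this table. It assumes that each marker ID is made of three fields joined by double underscores; that is only my guess from the rows above, not something I found in the documentation. The field labels (prefix, accession, locus) and the function name parse_marker_table are hypothetical names of my own.

```python
import io

# Rows copied from the output above; tabs/spaces between columns are treated alike.
SAMPLE = (
    "#mpa_v30_CHOCOPhlAn_201901\n"
    "#SampleID\tMetaphlan_Analysis\n"
    "851__Q7P5X0__RN95_03310\t1\n"
    "851__Q8R657__RO08_04045\t1\n"
    "851__R9R952__CI111_08490\t1\n"
    "851__Q7P4W1__cmk2\t1\n"
    "851__C7XQ46__H848_00740\t1\n"
)


def parse_marker_table(handle):
    """Return one dict per marker row, skipping header and comment lines."""
    rows = []
    for line in handle:
        # Header lines start with '#'; blank lines carry no data.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        # A data row is expected to have exactly: marker ID, value.
        if len(fields) != 2:
            continue
        marker_id, value = fields
        # ASSUMPTION: the ID is "<prefix>__<accession>__<locus>".
        parts = marker_id.split("__", 2)
        parts += [None] * (3 - len(parts))  # pad if an ID has fewer fields
        rows.append(
            {
                "marker": marker_id,
                "prefix": parts[0],
                "accession": parts[1],
                "locus": parts[2],
                "value": float(value),
            }
        )
    return rows


if __name__ == "__main__":
    for row in parse_marker_table(io.StringIO(SAMPLE)):
        print(row["prefix"], row["accession"], row["locus"], row["value"])
```

This only splits the IDs into pieces; it does not tell me which subspecies or strain each marker belongs to, which is the part I am asking about.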

Could you please create a step-by-step tutorial on GitHub that demonstrates this analysis?