Visualization with GraPhlAn not generating appropriate plots. Missing nodes and labels

I am unable to reproduce the results shown in the "Visualization with GraPhlAn" section of the StrainPhlAn 3 tutorial (biobakery/biobakery wiki on GitHub). Using all the demo files, I can generate a tree image, but it has only 5 nodes and no labels. When I run it on my own data (66 samples and 7 reference genomes), it displays only 4 nodes and no labels. Also, is there a tutorial on using ete2 or Jalview to produce a phylogenetic tree?

Hi @arjun_kothari
Which version of StrainPhlAn are you currently using to generate the results? (You can run strainphlan --version.)
Unfortunately, we have no tutorials for the ete2 and Jalview tools, but there may be some on their official websites.

strainphlan 3.0
I am unable to execute that command. I get the following output:
[bmi-460g8-06 ~]$ strainphlan --version
usage: strainphlan [-h] [-d DATABASE] [-m CLADE_MARKERS]
                   [-s SAMPLES [SAMPLES ...]] [-r REFERENCES [REFERENCES ...]]
                   [-c CLADE] [-o OUTPUT_DIR] [-n NPROCS]
                   [--secondary_samples SECONDARY_SAMPLES [SECONDARY_SAMPLES ...]]
                   [--secondary_references SECONDARY_REFERENCES [SECONDARY_REFERENCES ...]]
                   [--trim_sequences TRIM_SEQUENCES]
                   [--marker_in_n_samples MARKER_IN_N_SAMPLES]
                   [--sample_with_n_markers SAMPLE_WITH_N_MARKERS]
                   [--secondary_sample_with_n_markers SECONDARY_SAMPLE_WITH_N_MARKERS]
                   [--sample_with_n_markers_after_filt SAMPLE_WITH_N_MARKERS_AFTER_FILT]
                   [--phylophlan_mode {accurate,fast}]
                   [--phylophlan_configuration PHYLOPHLAN_CONFIGURATION]
                   [--tmp TMP] [--mutation_rates] [--print_clades_only]
                   [--debug]
strainphlan: error: unrecognized arguments: --version
But if I just run "strainphlan", I get:
[bmi-460g8-06 ~]$ strainphlan

[e] -s (or --samples) must be specified
Tue Sep 27 16:38:02 2022: Stop StrainPhlAn 3.0 execution.

I have MetaPhlAn version 3.0.14 (19 Jan 2022) installed.

Hi @arjun_kothari
The latest versions of StrainPhlAn 3 apply stricter default parameters for filtering samples and markers out of the phylogeny. That could explain why you do not get the same samples as in the tutorial, which was built with one of the first versions of StrainPhlAn 3. However, it is really odd that you are not getting the metadata labels in the GraPhlAn visualization. Could you share the tree and the metadata file you are using to reproduce the tutorial at the following link: Biobakery_forum_problems - Google Drive

Hi, I was able to plot all my samples by using the appropriate clade marker, but I still cannot get annotations when running:
plot_tree_graphlan.py -t output_FINAL/RAxML_bestTree.s__Staphylococcus_aureus.StrainPhlAn3.tre.metadata -m subjectID

I uploaded my output folder here. Please check.

Hi @arjun_kothari
Try adding --string_to_remove .fastq.gz when you run add_metadata_tree.py
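
If it helps to check whether a name mismatch is really the cause, here is a minimal sketch in plain Python. It rests on an assumption that is not confirmed in this thread: that the tip names in the tree still carry the .fastq.gz suffix while the sample IDs in the metadata file do not. The helper names, the example tree, and the sample IDs below are all hypothetical, and this is not how add_metadata_tree.py is implemented; it only imitates the effect of removing a fixed string from each tip name before comparing it with the metadata.

```python
import re


def leaf_names(newick):
    """Return the tip labels of a Newick string (hypothetical helper).

    A tip label starts right after "(" or "," and ends at ":", ",", ")" or ";".
    Internal node labels follow ")" and are therefore skipped.
    """
    return re.findall(r"[(,]\s*([^(),:;]+)", newick)


def unmatched_tips(newick, metadata_ids, string_to_remove=""):
    """List tips that have no matching sample ID after removing a substring."""
    ids = set(metadata_ids)
    return [
        name
        for name in leaf_names(newick)
        if name.replace(string_to_remove, "") not in ids
    ]


# Made-up example: two samples named after their FASTQ files plus one reference.
tree = "((sampleA.fastq.gz:0.1,sampleB.fastq.gz:0.2):0.05,GCF_ref:0.3);"
metadata_ids = ["sampleA", "sampleB"]  # sample IDs as they might appear in the metadata file

print(unmatched_tips(tree, metadata_ids))
print(unmatched_tips(tree, metadata_ids, string_to_remove=".fastq.gz"))
```

In this made-up case, every tip is unmatched until the suffix is removed. After that, only the reference genome is left over, which is expected if the references have no metadata rows.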