PhyloPhlAn running in a loop

Hi @fbonardi!

I used the command below to build a phylogenetic tree of the MAGs in my input folder, which are in FASTA format, but I don't understand how the analysis is running. I think something is wrong with my command, because it repeats the same steps, such as mapping and cleaning, on the same files created in the tmp directory.

Command used: phylophlan --input PHYLO_IN/ -d phylophlan -t a --databases_folder PHYLO_DB/ --diversity high --output_folder PHYLO_OUT -f PHYLO_OUT/supermatrix_aa.cfg --genome_extension fasta

terminal output:
phylo_terminal_out.txt (55.6 KB)

My goal is to build phylogenetic trees for the MAGs I am providing to PhyloPhlAn 3. They are in FASTA format, come from different samples, and may belong to the same species. My other question: can we get a phylogenetic tree of all the genome bins at the species level? I am also quite confused; can anyone tell me which set of commands I should use for my goal?

Kindly guide me!

Thanks in Advance!

Hello @saras22,

Thank you for using PhyloPhlAn. I don’t think you’re doing anything wrong actually. Have you let PhyloPhlAn complete the command you posted above or did you interrupt the job?

You see the cleaning, selecting, and mapping steps twice because your inputs are genomes and the database is made of amino acid sequences. In this case, PhyloPhlAn first does a translated search, then creates temporary proteomes and repeats the search in amino acid space. If you want to avoid this, you can specify the --force_nucleotides parameter (note that if you do, you also have to regenerate the configuration file with the same parameter, so that the tools for MSA and phylogeny inference are tuned correctly).
Please have a look at the documentation here.
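
As a rough sketch of the two steps involved in using --force_nucleotides, assuming the same paths as in the command you posted: the configuration file is regenerated first, then phylophlan is rerun with the same flag. The tool choices passed to phylophlan_write_config_file (diamond, mafft, trimal, fasttree, raxml), the new config file name, and the DRY_RUN variable are assumptions for illustration only, so please check them against the documentation and your installation before running anything.

```shell
#!/usr/bin/env bash
# Sketch only: regenerate the config with --force_nucleotides, then rerun
# phylophlan with the same flag. Paths are copied from the original command.
set -euo pipefail

# Hypothetical file name, chosen so the existing config is not overwritten.
CFG="PHYLO_OUT/supermatrix_aa_nt.cfg"

# Assumed tool choices; adjust to whatever is installed on your system.
write_cfg=(phylophlan_write_config_file
    -o "$CFG" -d a --force_nucleotides
    --db_aa diamond --map_dna diamond --map_aa diamond
    --msa mafft --trim trimal --tree1 fasttree --tree2 raxml)

# Same options as the original command, plus the new config and flag.
run_tree=(phylophlan
    --input PHYLO_IN/ -d phylophlan -t a
    --databases_folder PHYLO_DB/ --diversity high
    --output_folder PHYLO_OUT -f "$CFG"
    --genome_extension fasta --force_nucleotides)

# Print both commands; they are executed only when DRY_RUN=0 is set
# (DRY_RUN is a hypothetical safety switch, not a PhyloPhlAn option).
echo "${write_cfg[*]}"
echo "${run_tree[*]}"
if [ "${DRY_RUN:-1}" = "0" ]; then
    "${write_cfg[@]}"
    "${run_tree[@]}"
fi
```

By default the script only prints the two commands so they can be reviewed; running it with DRY_RUN=0 executes them in order.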

Please, let me know if something is not clear.

Many thanks,
Francesco
