I used the following parameters to build a phylogeny of E. coli strains (metagenome-assembled genomes) together with the standard E. coli reference genomes:
COMMAND: phylophlan -i /lustre/rsharma/MAIN_DREPOUT_ABOVE_50_ANI99_NC0.3_con_10/ABRICATE_PATH/E_coli_bins/ECOLI_PHYLO/INPUT_ECOLI/ -o /lustre/rsharma/MAIN_DREPOUT_ABOVE_50_ANI99_NC0.3_con_10/ABRICATE_PATH/E_coli_bins/ECOLI_PHYLO/INPUT_ECOLI/OUT_PAPER_STRAIN_COMPARISON/ -d /lustre/rsharma/PHYLO_ANALYSIS/Ecoli/s__Escherichia_coli/ -t a -f /lustre/rsharma/MAIN_DREPOUT_ABOVE_50_ANI99_NC0.3_con_10/ABRICATE_PATH/E_coli_bins/ECOLI_PHYLO/references_config.cfg --diversity low --fast --genome_extension .fa --nproc 40
It places one of the standard E. coli reference genomes incorrectly: the SAKAI strain should group with the other strains belonging to phylogroup E, but it groups with phylogroup B1 instead. Do you have any idea why this might be happening?
For the rest of the strains, it is showing the correct phylogeny.
Please provide some clues so that I can get a logical explanation for this issue.
Thanks in Advance!!!
Hi @saras22, thank you for using PhyloPhlAn and reporting this!
Have you checked the multiple-sequence alignment (MSA) file and how it looks for the SAKAI strain? It could be that the markers were not properly identified in that strain, so its row in the MSA contains too many gaps, which could explain the wrong phylogenetic placement.
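A quick way to check this is to compute the fraction of gap characters for every entry in the concatenated MSA. The snippet below is only a minimal sketch: it assumes the alignment is a FASTA file that uses `-` for gaps, and the function names (`read_fasta`, `gap_fractions`, `report`) are made up for illustration and are not part of PhyloPhlAn.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict mapping entry name to its full sequence."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the entry name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def gap_fractions(records, gap_chars="-"):
    """Return, for each aligned entry, the fraction of positions that are gaps."""
    fractions = {}
    for name, seq in records.items():
        gaps = sum(seq.count(c) for c in gap_chars)
        fractions[name] = gaps / len(seq) if seq else 1.0
    return fractions


def report(path):
    """Print entries sorted from most to least gappy (path is your MSA file)."""
    with open(path) as handle:
        fractions = gap_fractions(read_fasta(handle))
    for name, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}\t{frac:.3f}")
```

Calling `report()` on the concatenated alignment from your output folder lists the most gappy entries first, so you can see whether SAKAI stands out from the other reference genomes.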
Thanks a lot,
Hi @f.asnicar !
Thanks for replying. Following your suggestion, I checked the concatenated MSA file and found many gaps in the SAKAI strain. What can I do to resolve this?
I also want to add bootstrap support values to the tree. Can I get them from the best tree that I already have, or do I have to build the tree all over again? Please let me know which file needs to be modified and what the exact parameter is.
So, for the issue of the gaps, you can use several of the parameters available in PhyloPhlAn, like --fragmentary_threshold 0.67, to give one example.
Alternatively, you can manually remove the SAKAI entry from the MSA and reconstruct the phylogeny from the cleaned MSA.
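If you go the manual route, removing the entry can be scripted. This is a minimal sketch under a few assumptions: the MSA is in FASTA format, and the entry label (here "SAKAI") is only a placeholder, so please check the actual header name in your alignment. The helper names `drop_entries` and `filter_file` are hypothetical.

```python
def drop_entries(fasta_lines, unwanted):
    """Yield FASTA lines, skipping every record whose name is in `unwanted`."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].split()
            # The record name is the first word of the header line.
            name = header[0] if header else ""
            keep = name not in unwanted
        if keep:
            yield line


def filter_file(in_path, out_path, unwanted):
    """Write a copy of the alignment at in_path without the unwanted entries."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(drop_entries(src, set(unwanted)))
```

For example, `filter_file("concatenated.aln", "concatenated.cleaned.aln", ["SAKAI"])` (file names are placeholders) would write a cleaned alignment that you can then use to reconstruct the phylogeny.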
For the bootstrap analysis, first have a look at the RAxML manual to decide which kind of bootstrap analysis you want to run. Then you can add the corresponding parameters to the config file, so that the phylogeny is built directly with bootstrap support.
Alternatively, you can just re-run RAxML with the bootstrap parameters using the concatenated MSA from the output folder.
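As a rough illustration of that second option, the sketch below only builds the RAxML command lines (it does not run them). It assumes the standard RAxML options described in its manual: `-f a` with `-x`, `-p` and `-#` for a rapid bootstrap plus best-tree search in one run, and `-f b` with `-t` and `-z` to draw support values from a file of bootstrap trees onto an existing best tree. The executable name, seed, number of replicates, and thread count are assumptions, and the model must match your alignment type and what is in your config file, so it is left as a required argument.

```python
def rapid_bootstrap_cmd(msa, run_name, model, replicates=100, threads=4,
                        seed=12345, exe="raxmlHPC-PTHREADS-SSE3"):
    """Command for a rapid bootstrap analysis plus ML search in a single run."""
    return [
        exe,
        "-f", "a",              # rapid bootstrap + best-scoring ML tree
        "-x", str(seed),        # rapid bootstrap random seed
        "-p", str(seed),        # parsimony random seed
        "-#", str(replicates),  # number of bootstrap replicates
        "-m", model,            # substitution model (must fit your MSA)
        "-s", msa,              # concatenated MSA from the output folder
        "-n", run_name,         # suffix for the RAxML output files
        "-T", str(threads),
    ]


def map_support_cmd(best_tree, bootstrap_trees, run_name, model, threads=4,
                    exe="raxmlHPC-PTHREADS-SSE3"):
    """Command to draw bootstrap support onto an already computed best tree."""
    return [
        exe,
        "-f", "b",              # draw bipartition support on the tree given by -t
        "-t", best_tree,        # existing best tree
        "-z", bootstrap_trees,  # file with the bootstrap replicate trees
        "-m", model,
        "-n", run_name,
        "-T", str(threads),
    ]
```

Printing `" ".join(rapid_bootstrap_cmd(...))` gives a command you can paste into the shell. Note that the second command still needs bootstrap replicate trees, so the existing best tree alone is not enough to obtain support values.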