The bioBakery help forum

How to draw a beautiful evolutionary tree

Dear Francesco

When I run Example 02, I get the result file input_genomes.tre.treefile.
I would now like to visualize this tree. How can I find the phylum annotation for the GCA_* files?

less input_genomes
GCA_001823915.fna.gz  
GCA_002773335.fna.gz
GCA_000520875.fna.gz  
GCA_001567325.fna.gz  
...
GCA_001823925.fna.gz  
GCA_002773355.fna.gz
GCA_000521385.fna.gz  

zcat GCA_*.fna.gz | grep ">" gives me some information about the species:

>CP019368.1 Borrelia turicatae 91E135 plasmid lpT39, complete sequence
>CP019369.1 Borrelia turicatae 91E135 plasmid lpU43, complete sequence
>CP000061.1 Aster yellows witches'-broom phytoplasma AYWB strain AY-WB chromosome, complete genome
>CP000062.1 Aster yellows witches'-broom phytoplasma AYWB strain AY-WB plasmid pAYWB-I, complete sequence
>CP000063.1 Aster yellows witches'-broom phytoplasma AYWB strain AY-WB plasmid pAYWB-II, complete sequence
>CP000064.1 Aster yellows witches'-broom phytoplasma AYWB strain AY-WB plasmid pAYWB-III, complete sequence
>CP000065.1 Aster yellows witches'-broom phytoplasma AYWB strain AY-WB plasmid pAYWB-IV, complete sequence
>CP000082.1 Psychrobacter arcticus 273-4, complete genome
>CP000096.1 Chlorobium luteolum DSM 273, complete genome
>CP000100.1 Synechococcus elongatus PCC 7942, complete genome
>CP000101.1 Synechococcus elongatus PCC 7942 plasmid 1, complete sequence
>CP000102.1 Methanosphaera stadtmanae DSM 3091, complete genome
>CP000109.2 Thiomicrospira crunogena XCL-2, complete genome
>CP000115.1 Nitrobacter winogradskyi Nb-255, complete genome

How can I map each genome to its phylum, as in the following?

GCA_001823915.fna.gz  d__Bacteria.p__Acidobacteria.c__Acidobacteria.o__Acidobacte...
GCA_002773335.fna.gz  d__Bacteria.p__Acidobacteria.c__Acidobacteria.o__Acidobacte...

Can you tell me how the following picture was drawn?
Looking forward to your answer.

Chengkai

Dear Chengkai,

If you followed PhyloPhlAn 3.0: Example 02: Tree of life, you should have run the following command:

phylophlan_get_reference -g all \
    -o input_genomes/ \
    -n 1 \
    --verbose 2>&1 | tee logs/phylophlan_get_reference.log

The above command should have downloaded a file named taxa2genomes_cpa201901_up201901.txt.bz2, which contains exactly the taxonomic information you’re looking for.
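
As a rough starting point, here is a minimal Python sketch for pulling a genome-to-phylum table out of that file. It makes assumptions about the file's content that you should check against your own copy: that each line carries a taxonomy string with a "p__<phylum>" entry, and that the GCA_ accessions of the genomes appear somewhere on the same line. It does not rely on a particular column order. The function names phylum_by_genome and load_mapping are made up for this example and are not part of PhyloPhlAn.

```python
import bz2
import re

# Assumption: accessions look like "GCA_" followed by digits; any version
# suffix (".1", ".2", ...) is ignored so keys match names like GCA_001823915.fna.gz.
ACCESSION_RE = re.compile(r"GCA_\d+")

# Assumption: the phylum is written as "p__<name>" and ends at the next
# "|", ".", ";", tab, or newline.
PHYLUM_RE = re.compile(r"p__([^|.;\t\n]+)")


def phylum_by_genome(lines):
    """Map every GCA accession found on a line to the phylum on that line."""
    mapping = {}
    for line in lines:
        phylum = PHYLUM_RE.search(line)
        if phylum is None:
            continue  # no phylum on this line (e.g. a header or comment)
        for accession in ACCESSION_RE.findall(line):
            # Keep the first phylum seen for an accession.
            mapping.setdefault(accession, phylum.group(1))
    return mapping


def load_mapping(path):
    """Read a bz2-compressed text file and build the accession -> phylum map."""
    with bz2.open(path, "rt") as handle:
        return phylum_by_genome(handle)
```

Calling load_mapping() with the path to taxa2genomes_cpa201901_up201901.txt.bz2 (wherever it ended up on your system) returns a dictionary. You can then look up each file in input_genomes/ by the part of its name before ".fna.gz". If some accessions come back missing, inspect a few lines with bzcat to see how the file is actually laid out and adjust the two patterns.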