PhyloPhlAn 3 - E. coli input produces "[e] No alignments found to concatenate"

Hello

I have installed PhyloPhlAn 3 from conda, and run the following command:

phylophlan -i proteins -d phylophlan --diversity medium -f phylophlan_configs/supermatrix_aa.cfg -t a

The directory “proteins” contains a single file, “ecoli.faa”, which holds the predicted proteins from NC_000913.3 (Escherichia coli str. K-12 substr. MG1655, complete genome).

The output is:

Loading files from “/home/ubuntu/MAGpy_updated/MAGpy/proteins”
Loading files from “/home/ubuntu/MAGpy_updated/MAGpy/proteins”
Checking 1 inputs
Checking “/home/ubuntu/MAGpy_updated/MAGpy/proteins/ecoli.faa”
Cleaning 1 inputs
Cleaning “/home/ubuntu/MAGpy_updated/MAGpy/proteins/ecoli.faa”
“proteins_phylophlan/tmp/clean_aa/ecoli.faa” generated in 0s
Loading files from “proteins_phylophlan/tmp/clean_aa”
Mapping “phylophlan” on 1 inputs (key: “map_aa”)
Mapping “proteins_phylophlan/tmp/clean_aa/ecoli.faa”
“ecoli.b6o.bkp” generated in 389s
Selecting 1 markers from “proteins_phylophlan/tmp/map_aa”
Selecting “proteins_phylophlan/tmp/map_aa/ecoli.b6o.bkp”
“proteins_phylophlan/tmp/map_aa/ecoli.b6o.bz2” generated in 0s
Extracting markers from 1 inputs
Extracting “proteins_phylophlan/tmp/map_aa/ecoli.b6o.bz2”
“proteins_phylophlan/tmp/markers_aa/ecoli.faa.bz2” generated in 0s
Markers already aligned (key: “msa”)
Markers already trimmed (key: “trim”)
Markers already subsampled
Concatenating alignments

[e] No alignments found to concatenate

My guess is that one of the steps must have failed?

Of the directories in proteins_phylophlan/tmp/, some have content but others are empty:

$ ls -lrt proteins_phylophlan/tmp/trim_gap_trim/
total 0
$ ls -lrt proteins_phylophlan/tmp/msas/
total 0
$ ls -lrt proteins_phylophlan/tmp/markers
total 0
$ ls -lrt proteins_phylophlan/tmp/markers_aa/
total 108
-rw-rw-rw- 1 ubuntu ubuntu 107338 Aug 26 09:29 ecoli.faa.bz2
$ ls -lrt proteins_phylophlan/tmp/map_aa/
total 8132
-rw-rw-rw- 1 ubuntu ubuntu 8314679 Aug 26 09:29 ecoli.b6o.bkp
-rw-rw-rw- 1 ubuntu ubuntu    3314 Aug 26 09:29 ecoli.b6o.bz2

I think I need to alter this option:

--min_num_entries MIN_NUM_ENTRIES

Hi @BioMickWatson and thanks for your message.

So, it seems you’re trying to build a phylogeny from a single genome/proteome (proteins/ecoli.faa), which is a bit strange.

Yes, you can lower this parameter, but the default value of 4 was chosen because the downstream FastTree or RAxML step will give you an error if the MSA doesn’t contain at least that many leaves to reconstruct the phylogeny.
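
As a rough pre-flight check, something like the sketch below could count the proteomes in the input folder before launching phylophlan. It is only an illustration: the helper names count_inputs and enough_inputs are made up for this example, it assumes the inputs use the .faa extension as in the “proteins” folder above, and it uses 4 as the threshold because that is the default mentioned above.

```shell
# Sketch only: count_inputs and enough_inputs are hypothetical helpers,
# not part of PhyloPhlAn. Assumes proteomes are stored as *.faa files.

count_inputs() {
    # Print the number of regular *.faa files directly inside directory $1.
    n=0
    for f in "$1"/*.faa; do
        # If nothing matches, the pattern stays literal and fails the -f test.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

enough_inputs() {
    # Succeed if directory $1 holds at least $2 inputs (4 if $2 is omitted).
    [ "$(count_inputs "$1")" -ge "${2:-4}" ]
}

if enough_inputs proteins; then
    echo "proteins/ has enough inputs to attempt a phylogeny"
else
    echo "proteins/ has only $(count_inputs proteins) input(s); add more genomes first"
fi
```

With only ecoli.faa in the folder, this check would take the second branch, which matches the situation described in the question.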

Maybe you wanted to do something different and the documentation wasn’t clear enough. So, please let me know if I can be of any help.

Thanks,
Francesco