Hi there, thanks for getting in touch. It seems the problem is that your input folder only contains 3 genomes:
(from your log)
Mapping "phylophlan" on 3 inputs (key: "map_dna")
Mapping "./phylophlan_output/fasta_contigs_phylophlan_test_slurm/tmp/clean_dna/2001.contigs.fasta"
"2001.contigs.b6o.bkp" generated in 10182s
Mapping "./phylophlan_output/fasta_contigs_phylophlan_test_slurm/tmp/clean_dna/2002.contigs.fasta"
"2002.contigs.b6o.bkp" generated in 6509s
Mapping "./phylophlan_output/fasta_contigs_phylophlan_test_slurm/tmp/clean_dna/2003.contigs.fasta"
"2003.contigs.b6o.bkp" generated in 4223s
and then all markers are discarded because the minimum number of genomes for building a phylogeny is 4. Can you double-check the extension(s) of your input files, and make sure they all end in .fasta as you specified and are loaded correctly?
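A quick way to run that check before submitting the job is to count the files in the input folder that carry the expected extension. The snippet below is only a minimal sketch: the function names and the `MIN_GENOMES` constant are made up for illustration and are not part of PhyloPhlAn. The value 4 is the minimum mentioned above.

```python
from pathlib import Path

# Minimum number of genomes mentioned in this thread (assumption: hard-coded here
# for illustration; this constant is not read from PhyloPhlAn itself).
MIN_GENOMES = 4


def find_inputs(folder, extension=".fasta"):
    """Return the sorted files in `folder` whose names end with `extension`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name.endswith(extension)
    )


def enough_genomes(folder, extension=".fasta", minimum=MIN_GENOMES):
    """True if `folder` holds at least `minimum` files with `extension`."""
    return len(find_inputs(folder, extension)) >= minimum
```

With only 2001.contigs.fasta, 2002.contigs.fasta and 2003.contigs.fasta in the folder, `enough_genomes` returns `False`, which matches the "3 inputs" line in the log.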
These are my input files (2001.contigs.fasta, 2002.contigs.fasta, 2003.contigs.fasta). They are in .fasta format: contigs assembled with SPAdes from fastq files generated by shotgun sequencing. Each file is a separate sample, so I am running each fasta file as a single PhyloPhlAn run and iterating the job. I am just running 3 samples for testing.
I cannot attach my input files here because they are large and the forum does not accept .fasta attachments, but could you please let me know what might have caused the issue, other than what you mentioned in your previous response?
Hi @JWKANG, so if I understood correctly you have multiple bins (MAGs) in each of your 3 fasta files, right?
If so, that’s the problem. PhyloPhlAn expects each input file to represent a single genome/MAG/bin, so you should ‘split’ your 3 fasta files into several fasta files, one per MAG/bin. Otherwise, PhyloPhlAn treats the 3 fasta files as 3 different MAGs/bins, which is not enough for a phylogenetic analysis: in general you need at least 4 genomes to reconstruct a phylogenetic tree.
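To illustrate the splitting step, here is a rough sketch in plain Python. It assumes you already have a contig-to-bin assignment from whatever binning tool you use. The `contig_to_bin` dictionary, the function names, and the output naming are all hypothetical and not something PhyloPhlAn provides. Contigs without an assignment are simply skipped.

```python
from collections import defaultdict
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def split_by_bin(fasta_path, contig_to_bin, out_dir, extension=".fasta"):
    """Write one FASTA file per bin and return the list of written paths.

    `contig_to_bin` maps a contig ID (first word of the header) to a bin name.
    It is a hypothetical input that must come from a separate binning step.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group records by bin; contigs with no bin assignment are skipped.
    records = defaultdict(list)
    for header, seq in read_fasta(fasta_path):
        words = header.split()
        contig_id = words[0] if words else ""
        bin_name = contig_to_bin.get(contig_id)
        if bin_name is not None:
            records[bin_name].append((header, seq))

    written = []
    for bin_name, recs in sorted(records.items()):
        out_path = out_dir / f"{bin_name}{extension}"
        with open(out_path, "w") as out:
            for header, seq in recs:
                out.write(f">{header}\n{seq}\n")
        written.append(out_path)
    return written
```

Each file returned by `split_by_bin` then stands for one MAG/bin, and the resulting folder needs at least 4 of them before it is usable as input.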