Dear PhyloPhlAn team,
I am trying to reconstruct the prokaryotic tree by following the GitHub tutorial. I started by downloading one genome per species with this command:
phylophlan_get_reference -g all -o $genomes -n 1
This gave me 17509 genomes. I then realized that these are DNA sequences, whereas the marker gene database is composed of amino acid sequences, so it seems I would have to run Prodigal on every genome to obtain its protein sequences. That would take a very long time, and it might not be strictly necessary. Is there a better way to complete this task, i.e. one that avoids the translation step? If the step is unavoidable, could you give me a rough idea of how long it would take, including the rest of the pipeline? Thank you!
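For reference, the translation step I have in mind would look roughly like the sketch below. It is only an illustration of what I would run, not something taken from the tutorial: the `proteins` output folder, the `prodigal_bin` variable, the helper function names, and the `.fna` / `.fna.gz` file extensions are all my own assumptions.

```shell
#!/usr/bin/env bash
# Sketch: translate every downloaded genome into proteins with Prodigal.
# Folder names and file extensions are assumptions; adjust to your setup.
set -euo pipefail

genomes="${genomes:-genomes}"             # folder given to phylophlan_get_reference -o
proteins="${proteins:-proteins}"          # hypothetical output folder for .faa files
prodigal_bin="${prodigal_bin:-prodigal}"  # path to the Prodigal executable

translate_one() {
    local fna="$1" base tmp
    base="$(basename "$fna")"
    base="${base%.gz}"     # drop a trailing .gz, if present
    base="${base%.fna}"    # drop the .fna extension
    tmp="$(mktemp)"
    # gzip -dcf decompresses .gz input and copies plain files through unchanged
    gzip -dcf "$fna" > "$tmp"
    # -i input DNA, -a protein translations, -o gene coordinates (discarded), -q quiet
    "$prodigal_bin" -i "$tmp" -a "$proteins/$base.faa" -o /dev/null -q
    rm -f "$tmp"
}

translate_all() {
    mkdir -p "$proteins"
    local fna
    for fna in "$genomes"/*.fna*; do
        [ -e "$fna" ] || continue   # skip the literal pattern when nothing matches
        translate_one "$fna"
    done
}
```

I would then call `translate_all` once after setting `genomes` to the download folder. Since each Prodigal call handles a single genome independently, I assume the loop could be split across cores, but with 17509 genomes I would still like to know whether this step is needed at all.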