I have been running StrainPhlAn for the first time, following the tutorial and manual. The first steps were successful, up to the point where I call strainphlan to build the multiple sequence alignment and the phylogenetic tree by executing PhyloPhlAn.
I have reinstalled StrainPhlAn and am now using the latest version. Before that, I tried the version I installed in May 2020. The older version completed this step but couldn't remove the tmp file, so I decided to update StrainPhlAn and try again, but the newer version doesn't get through this step.
Here is my script (I haven't included a reference genome for Cutibacterium acnes yet):
Hi @sinkko
Yes, it could actually be a problem with an old version of PhyloPhlAn. We have had reports of problems with the BLAST execution in older versions (which seems to be the stage where your execution got stuck), and conda sometimes fails to retrieve the latest version when installing MetaPhlAn. I would try installing the latest version of PhyloPhlAn with conda install -c bioconda phylophlan and then running it again. Let me know if this fixes your issue.
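A minimal sketch of that update, with a check of what conda actually installed afterwards. The install command is the one given above; the `-y` flag, the `conda list` check, and the `PKG` variable are additions for illustration, and the sketch assumes the environment that contains StrainPhlAn is already activated.

```shell
# Package to update (name taken from the install command above)
PKG=phylophlan

if command -v conda >/dev/null 2>&1; then
    # Install or update PhyloPhlAn from the bioconda channel
    # (-y skips the confirmation prompt; drop it to review the plan first)
    conda install -y -c bioconda "$PKG"
    # Show the version conda resolved, to confirm it is no longer the old one
    conda list "$PKG"
else
    echo "conda not found on PATH; activate your conda installation first"
fi
```

If the version listed is still the old one, conda may be holding it back because of other packages in the same environment.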
That helped, thanks! My next question: as far as I understand, reference genomes are used in the MetaPhlAn database. Can I use those for StrainPhlAn? I haven't added a reference genome yet, but I would like to; I just have a hard time finding them. Where can I find them? Or should I retrieve a reference genome from a public database?
Hi @sinkko
To build the MetaPhlAn marker database, we downloaded all reference genomes available through UniProt Proteomes and linked to the public DDBJ, ENA, and GenBank repositories. However, since they are already available in those public repositories, we didn't make them available for download from our servers. I would suggest downloading them directly from GenBank, e.g.: Cutibacterium acnes - Assembly - NCBI
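As a rough sketch of a command-line alternative to the NCBI Assembly web page, the NCBI Datasets CLI (`datasets`) can fetch a reference assembly by taxon name. This tool is not mentioned in this thread and is assumed to be installed separately; the output file and directory names (`c_acnes_reference.zip`, `reference_genomes`) are hypothetical.

```shell
# Species to fetch and hypothetical output locations
TAXON="Cutibacterium acnes"
OUT_ZIP="c_acnes_reference.zip"
OUT_DIR="reference_genomes"

if command -v datasets >/dev/null 2>&1; then
    # Download only the genome sequence of the NCBI reference assembly for the taxon
    datasets download genome taxon "$TAXON" --reference --include genome --filename "$OUT_ZIP"
    # Unpack the archive; the genomic FASTA (*.fna) files are typically
    # found under ncbi_dataset/data/<accession>/ inside it
    unzip -o "$OUT_ZIP" -d "$OUT_DIR"
else
    echo "NCBI datasets CLI not found on PATH; download the assembly from the NCBI website instead"
fi
```

The downloaded FASTA file could then be given to strainphlan as a reference genome (the StrainPhlAn tutorial passes references with the -r option); check strainphlan --help for the exact option names in your installed version.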