Is it possible to use PhyloPhlAn 3 with MetaBAT2 binned FASTA files from a de novo bacterial genome assembly? Is there any specific command or tutorial for this kind of situation?
Hi there, please have a look at the tutorials here: PhyloPhlAn3 · biobakery/biobakery Wiki · GitHub and in particular, you might be interested in:
- PhyloPhlAn 3.0: Example 03: Metagenomic application · biobakery/biobakery Wiki · GitHub
- PhyloPhlAn 3.0: Example 04: E. coli · biobakery/biobakery Wiki · GitHub
- PhyloPhlAn 3.0: Example 05: Proteobacteria · biobakery/biobakery Wiki · GitHub
If you should need more details about configurations and parameters you can refer to the PhyloPhlAn wiki here: Home · biobakery/phylophlan Wiki · GitHub
I hope these are of help.
Many thanks,
Francesco
Thanks a lot for your support.
I am trying to follow the tutorial on the metagenomic application.
I placed my bin.fa in a folder called input_metagenomic, but when I run this command
phylophlan_metagenomic \
-i input_metagenomic \
-o output_metagenomic \
--nproc 4 \
-n 1 \
-d SGB.Jan19 \
--verbose 2>&1 | tee phylophlan_metagenomic.log
I get the following error:
Traceback (most recent call last):
File "/opt/miniconda3/envs/phylophlan-2020.5/bin/phylophlan_metagenomic", line 7, in <module>
from phylophlan.phylophlan_metagenomic import phylophlan_metagenomic
File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan_metagenomic.py", line 29, in <module>
import pandas as pd
File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/pandas/__init__.py", line 29, in <module>
from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/pandas/_libs/__init__.py", line 13, in <module>
from pandas._libs.interval import Interval
File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Thanks for reporting this. I don’t think it is related to PhyloPhlAn, and from a quick search, it seems it could be related to the numpy library. Can you try re-installing numpy in the phylophlan-2020.5 env?
Thanks, Francesco
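For anyone hitting the same ValueError: before and after re-installing numpy, it can help to see which numpy and pandas builds the environment actually contains. Below is a minimal sketch (not part of PhyloPhlAn; the function name `installed_versions` is made up for illustration) that reads the installed versions from the package metadata, so it does not import pandas and does not trigger the error itself. It assumes Python 3.8 or later, which matches the phylophlan-2020.5 env shown in the traceback.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "pandas")):
    """Return {package: version or None} without importing the packages.

    Reading the metadata avoids the binary-incompatibility error that
    'import pandas' raises when numpy and pandas do not match.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}\t{version or 'not installed'}")
```

Run it with the Python interpreter of the activated phylophlan-2020.5 env, so that it reports that environment and not the base one.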
Thanks a lot, Francesco, for your support. Now it is working, but I am getting another error, maybe because I have only one species as reference:
File "/opt/miniconda3/envs/phylophlan-2020.5/bin/phylophlan", line 10, in <module>
sys.exit(phylophlan_main())
File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 3200, in phylophlan_main
standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 3005, in standard_phylogeny_reconstruction
all_inputs = (os.path.splitext(os.path.basename(i))[0] for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment
Hi, the error you posted is not from phylophlan_metagenomic, though. I think it is coming from phylophlan, correct?
Can you please provide the correct command line and the full output using the --verbose option? Having the content of the input folder would also be helpful.
Many thanks,
Francesco
Yes, I was also following the tutorial on S. aureus because I wanted to produce a GraPhlAn input.
This is the command line I used:
phylophlan \
-i input_bins \
-o output_isolates \
-d s__Desulfomicrobium_orale \
--trim greedy \
--not_variant_threshold 0.99 \
--remove_fragmentary_entries \
--fragmentary_threshold 0.67 \
--min_num_entries 135 \
-t a \
-f isolates_config.cfg \
--diversity low \
--force_nucleotides \
--nproc 4 \
--verbose 2>&1 | tee phylophlan__output_isolates.log
I think one problem is the --min_num_entries value.
Great, thanks!
Can you please provide the full output of the command above? (The phylophlan__output_isolates.log file would work as well.)
Also, can you provide the content of the input folder input_bins? Does it contain the set of 135 genomes as described in the tutorial?
Thanks, Francesco
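A quick way to answer the question about the input folder is to count the genome files in it and compare that number with the value given to --min_num_entries. If the folder holds fewer inputs than that value, no marker can be found in enough inputs to be kept. The sketch below is only an illustration: the function names are hypothetical, and the `.fna` default reflects my recollection that `phylophlan` selects genomes by extension (`--genome_extension`), which is worth checking against `phylophlan --help`.

```python
from pathlib import Path


def count_inputs(folder, extension=".fna"):
    """Count regular files in `folder` whose name ends with `extension`."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.name.endswith(extension)
    )


def check_min_num_entries(folder, min_num_entries, extension=".fna"):
    """Return (number of inputs, True if that number reaches min_num_entries)."""
    n_inputs = count_inputs(folder, extension)
    return n_inputs, n_inputs >= min_num_entries


if __name__ == "__main__":
    folder = "input_bins"   # input folder from the command in this thread
    threshold = 135         # --min_num_entries value from the same command
    if Path(folder).is_dir():
        # ".fa" is included because the bins in this thread are named bin.fa
        for ext in (".fna", ".fa"):
            n, ok = check_min_num_entries(folder, threshold, ext)
            print(f"{ext}: {n} file(s), enough for --min_num_entries {threshold}: {ok}")
    else:
        print(f"folder not found: {folder}")
```

If the count for the extension PhyloPhlAn expects is zero, renaming the bins or passing the matching extension option would be the first thing to try.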
Yes, of course, I attach it here.
I set --min_num_entries to 2 because I had only two genomes.
phylophlan__output_isolates.tsv (8.0 KB)
Thanks for sending the log file.
So, the error could be due to the fact that MAFFT needs more than just 2 sequences to do the multiple sequence alignment.
You can verify what the problem is by running the command from the log file (I only removed the --quiet param):
/opt/miniconda3/envs/phylophlan-2020.5/bin/mafft --anysymbol --thread 1 --auto output_isolates/tmp/markers/UniRef90-A0A109W5J4.fna
(if you want to report the full output here, I’ll be happy to take a look)
Many thanks,
Francesco
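Before re-running MAFFT by hand, it may also be worth checking how many sequences the marker file actually contains, since an alignment needs at least two. This is a minimal sketch with a hypothetical helper name (`count_fasta_records`); it simply counts the header lines starting with `>` and assumes an uncompressed, plain-text FASTA file.

```python
import os


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    # Marker file taken from the MAFFT command quoted in this thread.
    marker = "output_isolates/tmp/markers/UniRef90-A0A109W5J4.fna"
    if os.path.isfile(marker):
        print(f"{marker}: {count_fasta_records(marker)} sequence(s)")
    else:
        print(f"file not found: {marker}")
```

The same function can be looped over every file in output_isolates/tmp/markers to see which markers have too few sequences.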