Is it possible to use PhyloPhlAn 3 with MetaBAT2-binned FASTA files from a de novo bacterial genome assembly? Is there a specific command or tutorial for this kind of situation?
Hi there, please have a look at the tutorials here: PhyloPhlAn3 · biobakery/biobakery Wiki · GitHub and in particular, you might be interested in:
- PhyloPhlAn 3.0: Example 03: Metagenomic application · biobakery/biobakery Wiki · GitHub
- PhyloPhlAn 3.0: Example 04: E. coli · biobakery/biobakery Wiki · GitHub
- PhyloPhlAn 3.0: Example 05: Proteobacteria · biobakery/biobakery Wiki · GitHub
If you should need more details about configurations and parameters you can refer to the PhyloPhlAn wiki here: Home · biobakery/phylophlan Wiki · GitHub
I hope these are of help.
Thanks a lot for your support.
I am trying to follow the tutorial on the metagenomic application.
I placed my bin.fa in a folder called input_metagenomic, but when I run this command:
phylophlan_metagenomic \
    -i input_metagenomic \
    -o output_metagenomic \
    --nproc 4 \
    -n 1 \
    -d SGB.Jan19 \
    --verbose 2>&1 | tee phylophlan_metagenomic.log
I get the following error:
Traceback (most recent call last):
  File "/opt/miniconda3/envs/phylophlan-2020.5/bin/phylophlan_metagenomic", line 7, in <module>
    from phylophlan.phylophlan_metagenomic import phylophlan_metagenomic
  File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan_metagenomic.py", line 29, in <module>
    import pandas as pd
  File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/pandas/__init__.py", line 29, in <module>
    from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
  File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
Thanks for reporting this. I don’t think it is related to PhyloPhlAn, and from a quick search, it seems it could be related to the numpy library. Can you try re-installing numpy in the
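A minimal sketch for checking the environment, assuming it is run with the Python interpreter of the same conda environment that PhyloPhlAn uses. It tries to import numpy and then pandas and reports whether each import succeeds, which is where the binary incompatibility in the traceback above shows up. The helper name check_import is hypothetical and is not part of PhyloPhlAn.

```python
import importlib


def check_import(module_name):
    """Try to import a module and return (ok, detail).

    ok is True when the import succeeds; detail is then the module's
    __version__ (or "unknown"). On failure, detail is the error message.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or ValueError for ABI mismatches
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # numpy first, then pandas, which is compiled against numpy
    for name in ("numpy", "pandas"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

If numpy imports fine but pandas fails with the "numpy.ndarray size changed" message, the two packages were most likely built against different numpy versions.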
Thanks a lot, Francesco, for your support. Now it is working, but I am getting another error, maybe because I have only one species as reference:
  File "/opt/miniconda3/envs/phylophlan-2020.5/bin/phylophlan", line 10, in <module>
    sys.exit(phylophlan_main())
  File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 3200, in phylophlan_main
    standard_phylogeny_reconstruction(project_name, configs, args, db_dna, db_aa)
  File "/opt/miniconda3/envs/phylophlan-2020.5/lib/python3.8/site-packages/phylophlan/phylophlan.py", line 3005, in standard_phylogeny_reconstruction
    all_inputs = (os.path.splitext(os.path.basename(i)) for i in input_faa_clean)
UnboundLocalError: local variable 'input_faa_clean' referenced before assignment
Hi, the error you posted is not from phylophlan_metagenomic though. This I think is coming from
Can you please provide the correct command line and the full output using the --verbose option? Also having the content of the input folder would be helpful.
Yes, I was also following the tutorial on S. aureus because I wanted to produce a GraPhlAn input.
This is the command line I used:
phylophlan \
    -i input_bins \
    -o output_isolates \
    -d s__Desulfomicrobium_orale \
    --trim greedy \
    --not_variant_threshold 0.99 \
    --remove_fragmentary_entries \
    --fragmentary_threshold 0.67 \
    --min_num_entries 135 \
    -t a \
    -f isolates_config.cfg \
    --diversity low \
    --force_nucleotides \
    --nproc 4 \
    --verbose 2>&1 | tee phylophlan__output_isolates.log
I think the problem is the --min_num_entries value.
Can you please provide the full output of the command above? (the phylophlan__output_isolates.log would work as well)
Also, can you provide the content of the input folder input_bins? Does it contain the set of 135 genomes as described in the tutorial?
Yes, of course, I attach it here.
I set it to 2 because I had only two genomes.
phylophlan__output_isolates.tsv (8.0 KB)
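As a quick sanity check on the point above, here is a rough sketch that counts the genome files in the input folder and compares that number with the value passed to --min_num_entries. It assumes that only the files in the input folder count as entries and that they are uncompressed and use one of the listed extensions; both function names are hypothetical and not part of PhyloPhlAn.

```python
from pathlib import Path

# Assumed FASTA extensions; adjust to match the files in your folder.
FASTA_EXTENSIONS = {".fa", ".fna", ".fasta", ".faa"}


def count_input_genomes(folder):
    """Count files in `folder` whose extension looks like FASTA."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in FASTA_EXTENSIONS
    )


def min_entries_is_reachable(folder, min_num_entries):
    """Return True if the folder holds at least `min_num_entries` genomes."""
    return count_input_genomes(folder) >= min_num_entries
```

For example, min_entries_is_reachable("input_bins", 135) would return False for a folder with only two bins, which suggests lowering the threshold before rerunning.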
Thanks for sending the log file.
So, the error could be due to the fact that MAFFT needs more than just 2 sequences to do the multiple sequence alignment.
You can verify what the problem is by running the command in the log file (I only removed the
/opt/miniconda3/envs/phylophlan-2020.5/bin/mafft --anysymbol --thread 1 --auto output_isolates/tmp/markers/UniRef90-A0A109W5J4.fna
(If you want to post the full output here, I'll be happy to take a look.)
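Before rerunning MAFFT by hand, it may also help to check how many sequences a marker file actually contains. The sketch below simply counts FASTA header lines; the function name count_fasta_records is hypothetical, and the path in the usage note is the one from the command above.

```python
def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines ('>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

Calling count_fasta_records("output_isolates/tmp/markers/UniRef90-A0A109W5J4.fna") gives the number of sequences in that marker file; a very small count would be consistent with the explanation that the alignment step does not have enough sequences to work with.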