Hello,
I’m trying to run StrainPhlAn (from MetaPhlAn version 4.1.0 (23 Aug 2023), installed via conda) using the following command:
strainphlan -s consensus_markers/*.json.bz2 -m db_markers/t__SGB4608.fna -r reference/RT/GCF_000153925.1_ASM15392v1_genomic.fna -o output -n 8 -c t__SGB4608 --mutation_rates -d metaphlan_databases/mpa_vJan21_CHOCOPhlAnSGB_202103.pkl
The output is as follows:
Tue Oct 10 19:25:36 2023: Start StrainPhlAn 4.1.0 execution
Tue Oct 10 19:25:36 2023: Loading MetaPhlAn mpa_vJan21_CHOCOPhlAnSGB_202103 database…
Tue Oct 10 19:25:55 2023: Done.
Tue Oct 10 19:25:55 2023: Creating temporary directory…
Tue Oct 10 19:25:55 2023: Done.
Tue Oct 10 19:25:55 2023: Filtering markers and samples…
Tue Oct 10 19:25:55 2023: Getting markers from samples…
Tue Oct 10 19:26:00 2023: Done.
Tue Oct 10 19:26:00 2023: Getting markers from references…
Tue Oct 10 19:26:01 2023: Done.
Tue Oct 10 19:26:01 2023: Removing bad markers / samples…
Tue Oct 10 19:26:01 2023: Done.
Tue Oct 10 19:26:01 2023: Done.
Tue Oct 10 19:26:01 2023: Writing samples as markers’ FASTA files…
Tue Oct 10 19:26:02 2023: Done.
Tue Oct 10 19:26:02 2023: Calculating polymorphic rates…
Tue Oct 10 19:26:06 2023: Done.
Tue Oct 10 19:26:06 2023: Computing phylogeny…
Tue Oct 10 19:26:06 2023:
Tue Oct 10 19:26:06 2023:
Tue Oct 10 19:26:06 2023:stdout:
stderr:
usage: phylophlan [-h] [-i INPUT | -c CLEAN] [-o OUTPUT] [-d DATABASE]
[-t {n,a}] [-f CONFIG_FILE] --diversity {low,medium,high}
[--accurate | --fast] [--clean_all] [--database_list]
[-s SUBMAT] [--submat_list] [--submod_list] [--nproc NPROC]
[--min_num_proteins MIN_NUM_PROTEINS]
[--min_len_protein MIN_LEN_PROTEIN]
[--min_num_markers MIN_NUM_MARKERS]
[--trim {gap_trim,gap_perc,not_variant,greedy}]
[--gap_perc_threshold GAP_PERC_THRESHOLD]
[--not_variant_threshold NOT_VARIANT_THRESHOLD]
[--subsample {phylophlan,onethousand,sevenhundred,fivehundred,threehundred,onehundred,fifty,twentyfive,tenpercent,twentyfivepercent,fiftypercent,full}]
[--unknown_fraction UNKNOWN_FRACTION]
[--scoring_function {trident,muscle,random}] [--sort]
[--remove_fragmentary_entries]
[--fragmentary_threshold FRAGMENTARY_THRESHOLD]
[--min_num_entries MIN_NUM_ENTRIES] [--maas MAAS]
[--remove_only_gaps_entries] [--mutation_rates]
[--force_nucleotides] [--input_folder INPUT_FOLDER]
[--data_folder DATA_FOLDER]
[--databases_folder DATABASES_FOLDER]
[--submat_folder SUBMAT_FOLDER]
[--submod_folder SUBMOD_FOLDER]
[--configs_folder CONFIGS_FOLDER]
[--output_folder OUTPUT_FOLDER]
[--genome_extension GENOME_EXTENSION]
[--proteome_extension PROTEOME_EXTENSION] [--update]
[--citation] [--verbose] [-v]
phylophlan: error: unrecognized arguments: --strainphlan
It halts at the PhyloPhlAn step because phylophlan doesn’t recognise the “--strainphlan” argument, but I am not sure where that argument is coming from, since it is not in my command.
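One check that might help narrow this down is to confirm which phylophlan executable is picked up from the PATH and whether its help text lists the flag at all. Below is a minimal sketch of that check, not an official diagnostic: the function names `help_mentions_flag` and `check_executable` are hypothetical, and the only phylophlan option it relies on is `-h`, which appears in the usage message above.

```python
import shutil
import subprocess


def help_mentions_flag(help_text: str, flag: str) -> bool:
    """Return True if `flag` appears as a whole token in a usage/help text."""
    # Treat brackets and commas as separators so "[--mutation_rates]"
    # yields the token "--mutation_rates".
    cleaned = help_text.replace("[", " ").replace("]", " ").replace(",", " ")
    return flag in cleaned.split()


def check_executable(name: str = "phylophlan", flag: str = "--strainphlan") -> None:
    """Print where `name` resolves on the PATH and whether its help lists `flag`."""
    path = shutil.which(name)
    if path is None:
        print(f"{name} not found on PATH")
        return
    # argparse-based tools print help to stdout; stderr is included just in case.
    result = subprocess.run([path, "-h"], capture_output=True, text=True)
    help_text = result.stdout + result.stderr
    print(f"{name} resolved to: {path}")
    print(f"{flag} listed in help: {help_mentions_flag(help_text, flag)}")


if __name__ == "__main__":
    check_executable()
```

Run inside the same conda environment as strainphlan, this would show whether the phylophlan being called is the one expected and whether it knows about “--strainphlan”.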
The output directory contains the following files:
phylophlan.cfg
reference_markers/GCF_000153925.1_ASM15392v1_genomic.fna.bz2
t__SGB4608.StrainPhlAn4/sample1.fastq.fna.bz2
t__SGB4608.StrainPhlAn4/sample2.fastq.fna.bz2
t__SGB4608.StrainPhlAn4/sample3.fastq.fna.bz2
Finally, the PhyloPhlAn config file (phylophlan.cfg) looks like this:
[db_dna]
program_name = /home/users/egrant/miniconda3/envs/biobakery3/bin/makeblastdb
params = -parse_seqids -dbtype nucl
input = -in
output = -out
version = -version
command_line = #program_name# #params# #input# #output#
[map_dna]
program_name = /home/users/egrant/miniconda3/envs/biobakery3/bin/blastn
params = -outfmt 6 -evalue 0.1 -max_target_seqs 1000000 -perc_identity 75
input = -query
database = -db
output = -out
version = -version
command_line = #program_name# #params# #input# #database# #output#
[msa]
program_name = /home/users/egrant/miniconda3/envs/biobakery3/bin/mafft
params = --quiet --anysymbol --thread 1 --auto
version = --version
command_line = #program_name# #params# #input# > #output#
environment = TMPDIR=/tmp
[trim]
program_name = /home/users/egrant/miniconda3/envs/biobakery3/bin/trimal
params = -gappyout
input = -in
output = -out
version = --version
command_line = #program_name# #params# #input# #output#
[tree1]
program_name = /home/users/egrant/miniconda3/envs/biobakery3/bin/raxmlHPC-PTHREADS-SSE3
params = -p 1989 -m GTRCAT
input = -s
output_path = -w
output = -n
version = -v
command_line = #program_name# #params# #threads# #output_path# #input# #output#
threads = -T
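To rule out a problem with the config itself, the program paths in it can be checked with a short script. This is only a sketch under assumptions: it treats phylophlan.cfg as a standard INI file with each [section] header on its own line, the helper name `missing_programs` is hypothetical, and it only tests that each `program_name` exists and is executable, not that it is the right version.

```python
import configparser
import os


def missing_programs(cfg_text: str) -> dict:
    """Map section name -> program_name for entries that are not executable files."""
    # interpolation=None keeps values such as "#program_name# #params#" verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    missing = {}
    for section in parser.sections():
        program = parser.get(section, "program_name", fallback=None)
        if program is None:
            continue  # section has no program_name entry, nothing to check
        if not (os.path.isfile(program) and os.access(program, os.X_OK)):
            missing[section] = program
    return missing


if __name__ == "__main__":
    cfg_path = "output/phylophlan.cfg"  # assumed location, adjust as needed
    if os.path.isfile(cfg_path):
        with open(cfg_path) as handle:
            print(missing_programs(handle.read()))
    else:
        print(f"{cfg_path} not found")
```

An empty dictionary would mean every listed program is present, which would point away from the config and towards the phylophlan call itself.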
I am a bit stumped and can’t find any similar errors reported. Any ideas as to why I am getting this odd error or how to troubleshoot would be much appreciated!