Get all reference error

Hi @f.asnicar

I was trying to download the reference genomes for all species using the phylophlan_get_reference command, and I got the following error:

Command used:
phylophlan_get_reference -g all -o /home/rakesh/PHYLO_ANALYSIS/PHYLO_TOL/ -n 1 --verbose 2>&1 | tee /home/rakesh/PHYLO_ANALYSIS/logs/phylophlan_get_reference.log

phylophlan_get_reference.py version 3.0.18 (27 November 2020)

Command line: /home/rakesh/anaconda3/envs/phylophlan/bin/phylophlan_get_reference -g all -o /home/rakesh/PHYLO_ANALYSIS/PHYLO_TOL/ -n 1 --verbose

Arguments: {'get': 'all', 'list_clades': False, 'database_update': False, 'output_file_extension': '.fna.gz', 'output': '/home/rakesh/PHYLO_ANALYSIS/PHYLO_TOL/', 'how_many': 1, 'genbank_mapping': 'assembly_summary_genbank.txt', 'verbose': True}
File "taxa2genomes.txt" present
File "taxa2genomes_cpa201901_up201901.txt.bz2" present
Output folder "/home/rakesh/PHYLO_ANALYSIS/PHYLO_TOL/" present
File "assembly_summary_genbank.txt" present
Traceback (most recent call last):
File "/home/rakesh/anaconda3/envs/phylophlan/bin/phylophlan_get_reference", line 10, in
sys.exit(phylophlan_get_reference())
File "/home/rakesh/anaconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_get_reference.py", line 319, in phylophlan_get_reference
get_reference_genomes(args.genbank_mapping, taxa2genomes_file_latest, args.get, args.how_many,
File "/home/rakesh/anaconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_get_reference.py", line 265, in get_reference_genomes
gb_assembly_summary = dict([(r.strip().split('\t')[0],
File "/home/rakesh/anaconda3/envs/phylophlan/lib/python3.10/site-packages/phylophlan/phylophlan_get_reference.py", line 266, in
(r.strip().split('\t')[19].replace('ftp://', 'https://') + '/' +
IndexError: list index out of range

Please help me resolve this error! Thanks in advance!

Hi @saras22, thanks for reporting this. Unfortunately, I’m not able to reproduce the error. I ran the same command (changing only the output folder) and it runs as expected.
Considering that PhyloPhlAn reports the file "assembly_summary_genbank.txt" as already present, and that the IndexError is raised while reading that file, I would suggest you remove assembly_summary_genbank.txt and re-run the command so that the file is downloaded again.
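If you would like to confirm that the local file is the problem before deleting it, a small check along these lines may help. It is only a sketch, not part of PhyloPhlAn: the function name find_short_rows is made up for illustration, and it assumes that lines starting with "#" are header comments. The threshold of 20 fields comes from the traceback, which reads field index 19 of each tab-separated row.

```python
from pathlib import Path

# The traceback indexes field 19, so each data row needs at least 20 columns.
MIN_FIELDS = 20


def find_short_rows(lines, min_fields=MIN_FIELDS):
    """Return (line_number, field_count) for rows with too few tab-separated fields."""
    short = []
    for number, line in enumerate(lines, start=1):
        # Assumption: lines starting with '#' are header comments, not data rows.
        if line.startswith("#"):
            continue
        count = len(line.strip().split("\t"))
        if count < min_fields:
            short.append((number, count))
    return short


if __name__ == "__main__":
    # File name taken from the log above; run this from the folder that contains it.
    path = Path("assembly_summary_genbank.txt")
    if path.exists():
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, count in find_short_rows(handle):
                print(f"line {number}: only {count} field(s)")
```

Any line it reports (for example an empty or truncated last line from an interrupted download) would be enough to trigger the IndexError shown above.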

Many thanks,
Francesco