PanPhlAn Profiling Error

Hi,
I have been trying to use PanPhlAn on my data set as well as on the example from the GitHub page. When I run the profiling step, it gives me this error for both my sample and the example one:

STEP 2. Create coverage matrix
Traceback (most recent call last):
  File "/home/bcampbell/anaconda3/envs/panphlan/bin/panphlan_profiling.py", line 948, in <module>
    main()
  File "/home/bcampbell/anaconda3/envs/panphlan/bin/panphlan_profiling.py", line 880, in main
    dna_samples_covs = read_map_results(args.i_dna, args.verbose)
  File "/home/bcampbell/anaconda3/envs/panphlan/bin/panphlan_profiling.py", line 314, in read_map_results
    dna_samples_covs[dna_sample_id] = read_gene_cov_file(dna_covs_file)
  File "/home/bcampbell/anaconda3/envs/panphlan/bin/panphlan_profiling.py", line 290, in read_gene_cov_file
    f = open(input_file, mode='rt')
FileNotFoundError: [Errno 2] No such file or directory: 'DXB72_08775\t721'

I think it might be a problem with something we downloaded or still need to download, but I'm not sure what.

Also, do you know why some samples would work for the mapping step and others don't? Some of my samples go through the mapping process super fast (even though they are almost the same file size as the others) and the output is 0.

Thank you for your help!!
Nichole

Hi,

It looks like PanPhlAn cannot find the result files from the mapping step. Are you sure you specified the right path?

I'm not sure what you mean by "the output is 0". Do you have a coverage of 0 for all gene families, or is the result file empty?

Hi!

Yeah, I tried it many times for my samples and the example ones, and it gives me the same error even with the right paths to the mapping files.

Sorry! I meant that the result file is empty, but only for some samples.

Thank you!

Hello,

Your problem is simply a misuse of the input parameter. The wiki describes it as follows:

--i_dna [or -i] input directory containing the panphlan_map.py results OR a text file with all paths of the files to input

In your case it seems that you specified the first result file as the input argument. Something like this:

panphlan_profiling.py -i map_results_erectale/CCMD34381688ST-21-0_erectale.csv --o_matrix output_matrix.tsv -p Eubacterium_rectale/Eubacterium_rectale_pangenome.tsv -v

So PanPhlAn treats the given file as a list of paths to the actual input files and tries to open each of its lines as a file. DXB72_08775\t721 is the very first line of the CCMD34381688ST-21-0_erectale.csv file, which is why it shows up in the error message.
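
To make the mechanism concrete, here is a minimal Python sketch of the behaviour described above. It is not PanPhlAn's actual code: resolve_inputs is a hypothetical helper, and the file name sample_erectale.csv is made up. It only assumes what the wiki text says, namely that a directory is scanned for result files and anything else is read as a list of paths.

```python
import os
import tempfile

def resolve_inputs(i_dna):
    """Hypothetical helper (not PanPhlAn's real code).

    A directory yields the files it contains; anything else is read as a
    text file holding one path per line.
    """
    if os.path.isdir(i_dna):
        return sorted(os.path.join(i_dna, name) for name in os.listdir(i_dna))
    with open(i_dna, mode="rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]

# Stand-in for one mapping result: gene ID and coverage separated by a tab.
workdir = tempfile.mkdtemp()
result_file = os.path.join(workdir, "sample_erectale.csv")
with open(result_file, "wt") as handle:
    handle.write("DXB72_08775\t721\n")

# Passing the directory finds the result file itself.
paths_from_dir = resolve_inputs(workdir)

# Passing the result file turns its first line into a "path".
paths_from_file = resolve_inputs(result_file)
try:
    open(paths_from_file[0], mode="rt")
    error_message = None
except FileNotFoundError as exc:
    error_message = str(exc)

print(paths_from_dir)
print(error_message)
```

With the directory as input the helper returns the result file; with the result file as input it returns the gene line, and opening that line fails with the same kind of FileNotFoundError as in the traceback.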

To fix it, just specify the directory instead:

panphlan_profiling.py -i map_results_erectale/ --o_matrix output_matrix.tsv -p Eubacterium_rectale/Eubacterium_rectale_pangenome.tsv -v

Or, if you want to run the profiling on a subset of the files, save their paths into a text file (one path per line) and give that file as the input to PanPhlAn.
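
One possible way to build such a path list is sketched below in Python. The helper write_path_list, the sample names, and the output name subset_paths.txt are all hypothetical, and the sketch assumes the mapping results end in .csv as in the example above; adjust the pattern if your files are named differently.

```python
import glob
import os
import tempfile

def write_path_list(map_dir, wanted_samples, list_file):
    """Write the paths of the selected mapping results, one per line.

    Hypothetical helper: a result file is selected when its name contains
    one of the strings in wanted_samples.
    """
    selected = [
        path
        for path in sorted(glob.glob(os.path.join(map_dir, "*.csv")))
        if any(sample in os.path.basename(path) for sample in wanted_samples)
    ]
    with open(list_file, "wt") as handle:
        for path in selected:
            handle.write(path + "\n")
    return selected

# Demo on a throwaway directory with three made-up result files.
map_dir = tempfile.mkdtemp()
for name in ("sampleA_erectale.csv", "sampleB_erectale.csv", "sampleC_erectale.csv"):
    with open(os.path.join(map_dir, name), "wt") as handle:
        handle.write("DXB72_08775\t721\n")

list_file = os.path.join(map_dir, "subset_paths.txt")
selected = write_path_list(map_dir, ["sampleA", "sampleC"], list_file)

with open(list_file, "rt") as handle:
    written = handle.read().splitlines()
print(written)
```

The resulting text file is then what you pass to -i in place of the directory, keeping the other options from the command above unchanged.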

Hope that will solve your problem
Leonard