Problems running HUMAnN3.6 - MetaPhlAn database error

Hi there,

I recently installed and set up my conda environment for HUMAnN3.6, including MetaPhlAn4. I then tried to run HUMAnN3.6 across multiple samples in parallel using a Slurm array in an attempt to save time.
Although all samples were treated identically in my Slurm array script, HUMAnN ran successfully on some samples, while others failed with problems in the MetaPhlAn database index.
I have received two types of errors, both of which seem to occur during the “Joining FASTA databases” step:

  1. FileNotFoundError: [Errno 2] No such file or directory: 'path_to_metaphlan_db/mpa_vJan21_CHOCOPhlAnSGB_202103.fna'
  2. OSError: [Errno 116] Stale file handle

For reference, the files in the MetaPhlAn database folder are as follows:

I tried to re-run my script to check whether the failure/success of HUMAnN was consistent for each sample, but the errors now appear for samples that ran successfully yesterday.

I would highly appreciate any help/insight regarding this issue,
Thanks!

When you run MetaPhlAn for the first time, it needs to download and build its index. It’s possible that trying to perform that initial index build in an array context led to errors (e.g. if multiple jobs were trying to build the same file simultaneously, or trying to reference the built file before it was ready). If you do a demo run of MetaPhlAn on its own to make sure the database was successfully built, then you should be able to run subsequent jobs in an array context just fine.
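
If a separate demo run is awkward to fit into your workflow, a rough alternative is to have each array task call a small guard that lets only one task perform the one-time setup while the others wait. The sketch below is not part of HUMAnN or MetaPhlAn: the function name `run_setup_once` and the `.setup_done` / `.setup_lock` names are hypothetical, and it assumes that creating a directory is atomic on your filesystem (usually the case, but worth checking on network storage).

```python
import os
import subprocess
import time
from pathlib import Path


def run_setup_once(setup_cmd, db_dir, poll_seconds=30, timeout_seconds=6 * 3600):
    """Run setup_cmd at most once per db_dir; concurrent callers wait for it."""
    db_dir = Path(db_dir)
    db_dir.mkdir(parents=True, exist_ok=True)
    done_marker = db_dir / ".setup_done"  # hypothetical marker, not a MetaPhlAn file
    lock_dir = db_dir / ".setup_lock"     # hypothetical lock, not a MetaPhlAn file

    if done_marker.exists():
        return "already_done"

    try:
        os.mkdir(lock_dir)  # assumed atomic: only one caller can create it
    except FileExistsError:
        # Another task holds the lock; poll until it reports success.
        deadline = time.monotonic() + timeout_seconds
        while not done_marker.exists():
            if time.monotonic() > deadline:
                raise TimeoutError(f"setup in {db_dir} did not finish in time")
            time.sleep(poll_seconds)
        return "waited"

    try:
        # Re-check: another task may have finished between our check and mkdir.
        if done_marker.exists():
            return "already_done"
        subprocess.run(setup_cmd, check=True)  # raises if the command fails
        done_marker.touch()
        return "ran_setup"
    finally:
        os.rmdir(lock_dir)  # release the lock whether or not setup succeeded
```

Here `setup_cmd` would be whatever standalone MetaPhlAn command you use for the demo run, passed as a list of arguments, and `db_dir` your database folder. Note that if the setup command fails, waiting tasks will only give up after the timeout, so running the demo job by hand first remains the simpler option.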