Baqlava failing at depletion step due to misnamed file?

Hi again,

Hopefully this boils down to a simple user error. I’m running Baqlava again on some metagenomic samples, and it repeatedly fails because it can’t find the bowtie2_unaligned.fa file. The exact error is:

(Jan 28 15:37:20) [ 1/10 -  10.00%] **Failed   ** Task  0: Running HUMAnN to depete bacterial reads from file
Run Finished
Task 0 failed
  Name: Running HUMAnN to depete bacterial reads from file
  Original error: 
  Failed to produce target `/scr1/users/danielsg/CHOPMC-551_Thomas_shotgun/baqlava_out/Fib.1_1/Fib_temp/Fib_humann_temp/Fib_bowtie2_unaligned.fa'. Original exception: Traceback (most recent call last):
    File "/home/danielsg/miniconda3/envs/baqlava/lib/python3.10/site-packages/anadama2/runners.py", line 219, in _get_task_result
      targ_compares.append(list(target.compare()))
    File "/home/danielsg/miniconda3/envs/baqlava/lib/python3.10/site-packages/anadama2/tracked.py", line 379, in compare
      stat = os.stat(self.name)
  FileNotFoundError: [Errno 2] No such file or directory: '/scr1/users/danielsg/CHOPMC-551_Thomas_shotgun/baqlava_out/Fib.1_1/Fib_temp/Fib_humann_temp/Fib_bowtie2_unaligned.fa'

The strange bit is that I can see a file there called:

Fib.1_1_bowtie2_unaligned.fa, which is about 4.6 GB in size. So I’m assuming that there are some viruses in there, and it’s strange that I’m getting this error.

Any help would be appreciated.

I found a little clue. I tried running it with --bypass-bacterial-depletion, which allowed some of the samples to successfully produce Baqlava profiles (although I can’t tell whether those are actually bacterial reads aligning to viral markers, since I know these samples are highly enriched with bacteria). But something weird is happening: samples are getting renamed in the final output files.

I’m guessing Baqlava (or one of its component programs) is stripping everything after the first dot.
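
To illustrate that guess, here is a minimal sketch of how taking the part of a file name before the first dot would turn "Fib.1_1" into "Fib", which matches the "Fib_bowtie2_unaligned.fa" path in the error above. This is not BAQLaVa's actual code: the helper name and the ".fastq.gz" extension are assumptions made only for the example.

```python
import os

def name_up_to_first_dot(path):
    """Hypothetical helper: keep only the part of the file name before the first dot."""
    return os.path.basename(path).split(".")[0]

# Assumed input file names; only the sample names come from this thread.
print(name_up_to_first_dot("Fib.1_1.fastq.gz"))  # Fib
print(name_up_to_first_dot("Fib_1_1.fastq.gz"))  # Fib_1_1
```

If the workflow derives the expected output name this way while HUMAnN keeps the full "Fib.1_1" prefix, the two names would disagree, which would explain the missing-target error.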

I guess the workaround for now is to replace all instances of dots with underscores.

Yep, that was it. When I renamed all my files with the pattern “Fib.1_1” > “Fib_1_1” the error did not occur. This should be fixed or, at least, there should be something in the Readme saying “don’t use dots in your sample names”. Thanks.
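
For anyone who needs to do the same rename in bulk, a rough sketch follows. It replaces dots with underscores in the sample part of each file name while leaving the extension alone. The list of extensions and the function names are assumptions for illustration, not anything BAQLaVa provides, so adjust them to your data and try the dry run first.

```python
from pathlib import Path

# Assumed read-file extensions; extend this tuple to match your own files.
KNOWN_EXTENSIONS = (
    ".fastq.gz", ".fasta.gz", ".fq.gz", ".fa.gz",
    ".fastq", ".fasta", ".fq", ".fa",
)

def sanitized_name(filename):
    """Replace dots with underscores in the sample name, keeping the extension."""
    for ext in KNOWN_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            return stem.replace(".", "_") + ext
    # Unknown extension: leave the name untouched rather than guess.
    return filename

def rename_samples(directory, dry_run=True):
    """Rename files in `directory`; with dry_run=True only print the plan."""
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        new_name = sanitized_name(path.name)
        if new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Refuse to overwrite an existing file.
            raise FileExistsError(f"{target} already exists")
        print(f"{path.name} -> {new_name}")
        if not dry_run:
            path.rename(target)
```

Calling rename_samples("path/to/reads") prints the planned renames without touching anything; passing dry_run=False performs them.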

Hey @scottdaniel_at_chop! You are correct that the behavior is due to how BAQLaVa handles filenames. This information is present in the readme under ‘Running BAQLaVa’ but as it is a single line, it may be easy to miss.