Samfile error when applying strainphlan to output .bz2 files of metaphlan

Hi,
I am trying to use strainphlan to build a phylogenetic tree, but when I run sample2markers.py on the output of metaphlan, I get the following error:
Processing sample: /mnt/vstor/CSE_CSDS_VXC204/rxl761/consensus_markers/tmpxbb3zv62/11808.3205.SALB.W2V1.bam
Fri Feb 16 14:33:35 2024: Loading the bam file and extracting information…
Traceback (most recent call last):
  File "/mnt/vstor/CSE_CSDS_VXC204/rxl761/anaconda3/envs/metaphlan/bin/sample2markers.py", line 8, in <module>
    sys.exit(main())
  File "/mnt/vstor/CSE_CSDS_VXC204/rxl761/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/utils/sample2markers.py", line 399, in main
    sampletomarkers.run_sample2markers()
  File "/mnt/vstor/CSE_CSDS_VXC204/rxl761/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/utils/sample2markers.py", line 281, in run_sample2markers
    filtered='_filtered' if len(self.clades) > 0 else '')
  File "/mnt/vstor/CSE_CSDS_VXC204/rxl761/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/utils/sample2markers.py", line 119, in build_consensus_markers
    results = self.get_consensuses_for_sample(i)
  File "/mnt/vstor/CSE_CSDS_VXC204/rxl761/anaconda3/envs/metaphlan/lib/python3.7/site-packages/metaphlan/utils/sample2markers.py", line 141, in get_consensuses_for_sample
    sam_file = pysam.AlignmentFile(input_bam)
  File "pysam/libcalignmentfile.pyx", line 748, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 997, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

I installed metaphlan, strainphlan and samtools with conda. My versions are metaphlan 4.0.6, strainphlan 4.0.6, and samtools 1.9.

Hi burdungy,

It seems the file you're providing as input (11808.3205.SALB.W2V1.bam) is "empty". You can verify this by running samtools view on the file. Was the bam file generated by metaphlan? (Usually the output is .sam.bz2 rather than .bam.) Do you have the profile output from metaphlan, and does it contain any species? Either something went wrong with the metaphlan run, or there are not enough bacterial reads in the sequencing file.
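
If samtools is not at hand, a rough equivalent of that check can be sketched in Python using only the standard library. This is a minimal sketch and not part of metaphlan or pysam: the function names are made up for illustration, and it assumes the file follows the standard BAM layout (magic bytes, header text, then the number of reference sequences). As far as I understand, a count of 0 is the condition behind the "file has no sequences defined" message.

```python
import gzip
import struct


def _read_exact(handle, size):
    """Read exactly `size` bytes or fail (hypothetical helper)."""
    data = handle.read(size)
    if len(data) != size:
        raise ValueError("file ended before the BAM header was complete")
    return data


def bam_reference_count(path):
    """Return the number of reference sequences declared in a BAM header.

    Hypothetical helper for illustration only.
    """
    # BAM is BGZF-compressed, and BGZF blocks are valid gzip members,
    # so the standard gzip module can decompress the stream.
    with gzip.open(path, "rb") as handle:
        if _read_exact(handle, 4) != b"BAM\x01":
            raise ValueError(f"{path} does not start with the BAM magic bytes")
        # l_text: length of the plain-text SAM header that follows.
        (l_text,) = struct.unpack("<i", _read_exact(handle, 4))
        _read_exact(handle, l_text)
        # n_ref: number of reference sequences declared in the header.
        (n_ref,) = struct.unpack("<i", _read_exact(handle, 4))
    return n_ref
```

If `bam_reference_count("11808.3205.SALB.W2V1.bam")` returns 0, or raises because the file is too short, the bam itself has no usable header, which would point back to the metaphlan output rather than to pysam.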

Best
Michal

I have encountered a similar error when using the sample2markers.py script from MetaPhlAn v4.0.6.

For me, running metaphlan4 on these samples went smoothly and produced a normal taxonomy profile, as well as a non-empty .sam.bz2 file that I use as input for the sample2markers.py script.

To the best of my understanding, the sample2markers.py script generates the .bam file within a temporary directory, and that .bam file does appear to be empty.

I would appreciate any help you can provide.
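
Since the temporary .bam is hard to get hold of, one way to narrow this down is to look at what the .sam.bz2 itself contains. The snippet below is a minimal sketch using only the Python standard library; the function name `summarize_sam_bz2` is made up for illustration, and it assumes the file is plain tab-separated SAM text compressed with bzip2. It counts @SQ header lines, alignment records, and records whose flag does not have the "unmapped" bit (0x4) set.

```python
import bz2


def summarize_sam_bz2(path):
    """Count @SQ header lines, alignment records and mapped records.

    Hypothetical helper for illustration only.
    """
    counts = {"sq_lines": 0, "alignments": 0, "mapped": 0}
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            if line.startswith("@"):
                # Header line; @SQ lines declare the reference sequences.
                if line.startswith("@SQ"):
                    counts["sq_lines"] += 1
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11:
                continue  # a SAM record has 11 mandatory fields
            counts["alignments"] += 1
            # FLAG bit 0x4 marks a read as unmapped.
            if fields[1].isdigit() and (int(fields[1]) & 0x4) == 0:
                counts["mapped"] += 1
    return counts
```

A file can be non-empty on disk and still report zero @SQ lines or zero mapped records here, so these counts may be more informative than the file size alone.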
