Metaphlan-strainphlan: sam.bz2

The StrainPhlAn tutorial describes how to create the *.sam.bz2 files that sample2markers.py takes as input in the StrainPhlAn pipeline:

mkdir -p sams
mkdir -p bowtie2
mkdir -p profiles
for f in fastq/SRS*
do
    echo "Running MetaPhlAn on ${f}"
    bn=$(basename ${f})
    metaphlan ${f} --input_type fastq -s sams/${bn}.sam.bz2 --bowtie2out bowtie2/${bn}.bowtie2.bz2 -o profiles/${bn}_profiled.tsv
done

However, at least when I run MetaPhlAn 4 via the quay.io/biocontainers/metaphlan:4.0.3--pyhca03a8a_0 container, the SAM output files are not bz2-compressed. The lack of compression doesn't really matter in itself, except that sample2markers.py then fails with this error:

 [Error] The the input file "MY_SAMPLE.sam" must be in "BZ2" format

Do the strainphlan docs need to be updated?

Hi @nick-youngblut. By default, sample2markers expects the input to be a bz2-compressed SAM file. If your files are not compressed, you should specify the input format with the -f/--input_format parameter (e.g. -f sam).
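
If you would rather keep the default input format, an alternative is to compress the SAM files yourself before running sample2markers. This is only a minimal sketch: it assumes the uncompressed files sit in a sams/ directory and end in .sam, and compress_sams is a hypothetical helper name, not part of MetaPhlAn.

```shell
# Hypothetical helper (not part of MetaPhlAn): bzip2-compress every
# uncompressed SAM file in a directory, so NAME.sam becomes NAME.sam.bz2.
compress_sams() {
    dir="${1:-sams}"              # assumed default location of the SAM files
    for f in "${dir}"/*.sam; do
        # Skip the unexpanded pattern when no .sam files are present.
        [ -e "${f}" ] || continue
        echo "Compressing ${f}"
        # bzip2 replaces the input file with FILE.bz2; -f overwrites an
        # existing .bz2 of the same name.
        bzip2 -f "${f}"
    done
}
```

Calling compress_sams sams should leave only *.sam.bz2 files in that directory. Note that the original .sam files are removed by bzip2, so copy them first if you need to keep them.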

Thanks for clarifying! So the code in the strainphlan tutorial (as posted above) actually generates uncompressed SAM files and not bzip2-compressed SAM files, correct?
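
One way to check this directly, whatever the file extension says, is to test each output file with bzip2 -t. A rough sketch, assuming the files are in sams/ (report_sam_compression is a hypothetical helper name):

```shell
# Hypothetical helper: report whether each file in a directory is valid bzip2.
# bzip2 -t only tests the file; it does not modify or decompress it.
report_sam_compression() {
    dir="${1:-sams}"              # assumed default location of the SAM files
    for f in "${dir}"/*; do
        # Only look at regular files; this also skips an unexpanded pattern.
        [ -f "${f}" ] || continue
        if bzip2 -t "${f}" 2>/dev/null; then
            echo "${f}: bzip2-compressed"
        else
            echo "${f}: not bzip2-compressed"
        fi
    done
}
```

Running report_sam_compression sams prints one line per file, which makes it easy to see whether the files written by the tutorial loop are really compressed.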