Unexpected difference with very-sensitive-local for environmental sample

Hello,
I am new to MetaPhlAn and am experimenting with its command-line options.
I have an environmental sample (wastewater) available as a FASTQ file, and I am wondering which command-line options you would suggest for MetaPhlAn in this case.

When I first ran MetaPhlAn with --unclassified_estimation, around 80% of the relative abundance was flagged as UNCLASSIFIED.
So I lowered the stat_q parameter and ignored the MAPQ value by adding the arguments --stat_q 0.1 --min_mapq_val -1, as suggested in the forum.
(e.g. Understanding Parameters (stat_q) for Environmental sample,
Which parameters to tweak to improve abundance calculations?)

In another post, I found the suggestion that, for longer reads, one can add a Bowtie2 parameter to run the alignment in local mode, i.e., with the very-sensitive-local preset instead of the default very-sensitive preset, by using --bt2_ps very-sensitive-local. (e.g. MetaPhlan3 --unknown_estimation)

Interestingly, the relative abundances differ hugely between two MetaPhlAn runs that differ only in the Bowtie2 preset: the default very-sensitive mode versus the very-sensitive-local mode.
This confuses me, and I wonder which of the two you would suggest, or rather which one makes more sense.

The raw reads of my samples have variable length. The average raw read length is between 179 bp and 326 bp.
The minimum raw sequence length is 25 bp, and the maximum raw sequence length varies between 707 bp and 837 bp.

Based on these read length stats, would you say it makes sense to use the local BowTie2 alignment preset mode?
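
For reference, a minimal Python sketch of how such read-length statistics can be gathered, including the fraction of reads shorter than the --read_min_len 60 cutoff used in the commands below. It assumes a plain four-line-per-record FASTQ (no wrapped sequences); the function names and the toy records are made up for illustration and are not from the wastewater sample.

```python
import io
import statistics


def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of every record holds the sequence
            yield len(line.strip())


def summarize(lengths, min_len=60):
    """Return basic length statistics and the fraction of reads shorter than min_len."""
    lengths = list(lengths)
    return {
        "n": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
        "frac_below_min_len": sum(n < min_len for n in lengths) / len(lengths),
    }


# Toy records, invented for illustration only.
records = [("r1", "A" * 25), ("r2", "C" * 200), ("r3", "G" * 75)]
toy = io.StringIO(
    "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)
)

stats = summarize(read_lengths(toy))
print(stats)
```

For a real file, the same functions could be called on open("../raw/sample.fastq") instead of the toy stream.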

Below I report the MetaPhlAn profiles obtained with and without the local Bowtie2 mode.
In local mode the UNCLASSIFIED relative abundance is reported as 0.0, whereas it is around 63% in non-local mode. This gives a distorted view of the abundance estimation.
Is there an error in how the relative abundance is computed in local mode?
In local mode I also see Eukaryota, and in general some other Bacteria, that are not present in non-local mode.
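
As a rough cross-check, the arithmetic behind this observation can be sketched in Python using only the counts from the two profiles reported in this post. I do not know exactly how MetaPhlAn derives the UNCLASSIFIED relative abundance, so the read-based percentage computed here is an assumption-laden sanity check and is not expected to match the reported value exactly.

```python
# Counts copied from the two MetaPhlAn profiles in this post.
TOTAL_READS = 3109526  # "#3109526 reads processed" in both headers

runs = {
    "very-sensitive-local": {
        "mapped": 1371133,              # estimated_reads_mapped_to_known_clades
        "unclassified_reads": 1738393,  # read count on the UNCLASSIFIED row
        "reported_unclassified": 0.0,   # relative_abundance on the UNCLASSIFIED row
        "kingdoms": [99.68134, 0.21061, 0.10805],  # Bacteria, Archaea, Eukaryota
    },
    "very-sensitive": {
        "mapped": 996571,
        "unclassified_reads": 2112955,
        "reported_unclassified": 63.27951,
        "kingdoms": [36.65945, 0.06104],  # Bacteria, Archaea
    },
}

read_based = {}
for name, run in runs.items():
    # Mapped and unclassified read counts should add up to the processed reads.
    assert run["mapped"] + run["unclassified_reads"] == TOTAL_READS
    # Naive read-based estimate of the unclassified share (an assumption, see above).
    read_based[name] = 100.0 * run["unclassified_reads"] / TOTAL_READS
    kingdom_sum = sum(run["kingdoms"])
    print(
        f"{name}: read-based unclassified {read_based[name]:.2f}%, "
        f"reported {run['reported_unclassified']}%, "
        f"kingdom-level sum {kingdom_sum:.2f}%"
    )
```

In the local run the kingdom rows alone add up to 100%, whereas in the default run they add up to 100% only together with the UNCLASSIFIED row, even though both runs list a large number of unclassified reads.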

#mpa_vJan21_CHOCOPhlAnSGB_202103
#/Users/bernhard/miniconda3/envs/metaphlan4_py3.9/bin/metaphlan ../raw/sample.fastq --input_type fastq --nproc 10 --bowtie2db ../metaphlan4_db_mac/ --read_min_len 60 --bowtie2out ./sample_bowtie2out_local.txt -o sample_profiled_metagenome_local.txt -t rel_ab_w_read_stats --unclassified_estimation --stat_q 0.1 --min_mapq_val -1 --bt2_ps very-sensitive-local
#3109526 reads processed
#Metaphlan_Analysis
#estimated_reads_mapped_to_known_clades:1371133
#clade_name	clade_taxid	relative_abundance	coverage	estimated_number_of_reads_from_the_clade
UNCLASSIFIED	-1	0.0	-	1738393
k__Bacteria	2	99.68134	0.41586	1331663
k__Archaea	2157	0.21061	0.00088	1769
k__Eukaryota	2759	0.10805	0.00045	37701
k__Bacteria|p__Planctomycetes	2|203682	83.63309	0.34891	1126429
k__Bacteria|p__Proteobacteria	2|1224	12.69694	0.05297	155939
k__Bacteria|p__Bacteroidetes	2|976	1.76021	0.00734	25752
k__Bacteria|p__Ignavibacteriae	2|1134404	0.82266	0.00343	13741
k__Archaea|p__Euryarchaeota	2157|28890	0.21061	0.00088	1769
k__Bacteria|p__Actinobacteria	2|201174	0.16971	0.00071	2452
k__Bacteria|p__Nitrospirae	2|40117	0.13247	0.00055	2283
k__Bacteria|p__Firmicutes	2|1239	0.12792	0.00053	1607
k__Eukaryota|p__Apicomplexa	2759|5794	0.10805	0.00045	37701
k__Bacteria|p__Tenericutes	2|544448	0.10707	0.00045	298
k__Bacteria|p__Spirochaetes	2|203691	0.1046	0.00044	1017
k__Bacteria|p__Chloroflexi	2|200795	0.08364	0.00035	1537
k__Bacteria|p__Verrucomicrobia	2|74201	0.02003	8e-05	406
k__Bacteria|p__Fusobacteria	2|32066	0.01314	5e-05	50
k__Bacteria|p__Chlamydiae	2|204428	0.00831	3e-05	104
k__Bacteria|p__Acidobacteria	2|57723	0.00153	1e-05	48

Below the non-local mode:

#mpa_vJan21_CHOCOPhlAnSGB_202103
#/Users/bernhard/miniconda3/envs/metaphlan4_py3.9/bin/metaphlan ../raw/sample.fastq --input_type fastq --nproc 10 --bowtie2db ../metaphlan4_db_mac/ --read_min_len 60 --bowtie2out ./sample_bowtie2out.txt -o sample_profiled_metagenome.txt -t rel_ab_w_read_stats --unclassified_estimation --stat_q 0.1 --min_mapq_val -1
#3109526 reads processed
#Metaphlan_Analysis
#estimated_reads_mapped_to_known_clades:996571
#clade_name	clade_taxid	relative_abundance	coverage	estimated_number_of_reads_from_the_clade
UNCLASSIFIED	-1	63.27951	-	2112955
k__Bacteria	2	36.65945	0.31195	995533
k__Archaea	2157	0.06104	0.00052	1038
k__Bacteria|p__Planctomycetes	2|203682	34.34729	0.29227	939923
k__Bacteria|p__Proteobacteria	2|1224	1.83997	0.01566	43584
k__Bacteria|p__Bacteroidetes	2|976	0.34498	0.00294	9912
k__Archaea|p__Euryarchaeota	2157|28890	0.06104	0.00052	1038
k__Bacteria|p__Tenericutes	2|544448	0.04194	0.00036	238
k__Bacteria|p__Firmicutes	2|1239	0.03929	0.00033	902
k__Bacteria|p__Spirochaetes	2|203691	0.03652	0.00031	724
k__Bacteria|p__Nitrospirae	2|40117	0.00321	3e-05	120
k__Bacteria|p__Chlamydiae	2|204428	0.00272	2e-05	69
k__Bacteria|p__Fusobacteria	2|32066	0.00194	2e-05	15
k__Bacteria|p__Actinobacteria	2|201174	0.0016	1e-05	46
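
To make the two tables easier to compare, here is a minimal Python sketch that parses rel_ab_w_read_stats rows and reports, per clade, the estimated read counts in both runs. It assumes tab-separated columns in the order shown in the headers above; the parse_profile helper is hypothetical, and only a few phylum rows from this post are embedded to keep it short.

```python
def parse_profile(text):
    """Map clade_name -> estimated read count, skipping '#' header lines."""
    reads = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        reads[fields[0]] = int(fields[4])  # fifth column: estimated reads
    return reads


# Excerpts of the two profiles from this post (phylum level only).
LOCAL = "\n".join([
    "#clade_name\tclade_taxid\trelative_abundance\tcoverage\testimated_number_of_reads_from_the_clade",
    "k__Bacteria|p__Planctomycetes\t2|203682\t83.63309\t0.34891\t1126429",
    "k__Bacteria|p__Proteobacteria\t2|1224\t12.69694\t0.05297\t155939",
    "k__Bacteria|p__Ignavibacteriae\t2|1134404\t0.82266\t0.00343\t13741",
    "k__Bacteria|p__Actinobacteria\t2|201174\t0.16971\t0.00071\t2452",
    "k__Eukaryota|p__Apicomplexa\t2759|5794\t0.10805\t0.00045\t37701",
])
DEFAULT = "\n".join([
    "#clade_name\tclade_taxid\trelative_abundance\tcoverage\testimated_number_of_reads_from_the_clade",
    "k__Bacteria|p__Planctomycetes\t2|203682\t34.34729\t0.29227\t939923",
    "k__Bacteria|p__Proteobacteria\t2|1224\t1.83997\t0.01566\t43584",
    "k__Bacteria|p__Actinobacteria\t2|201174\t0.0016\t1e-05\t46",
])

local = parse_profile(LOCAL)
default = parse_profile(DEFAULT)

# Clades sorted by read count in the local run, with the local/default ratio.
ratios = {}
for clade in sorted(local, key=local.get, reverse=True):
    if clade in default:
        ratios[clade] = local[clade] / default[clade]
        print(f"{clade}\t{local[clade]}\t{default[clade]}\t{ratios[clade]:.1f}x")
    else:
        print(f"{clade}\t{local[clade]}\tabsent in default run")

only_local = sorted(set(local) - set(default))
```

Comparing read counts rather than relative abundances sidesteps the different UNCLASSIFIED handling in the two runs; the same helper could be applied to the full output files.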

One more thing: according to the manual, the --bt2_ps flag is only applied when a FASTA file is provided. Nevertheless, when I supplied a FASTQ file, the local mode was also added when bowtie-align was called. I am not sure whether the manual is out of date or whether it makes any difference that I do not supply a FASTA file. Could you clarify that for me, please?

 --bt2_ps BowTie2 presets
                        Presets options for BowTie2 (applied only when a FASTA file is provided)
                        The choices enabled in MetaPhlAn are:
                         * sensitive
                         * very-sensitive
                         * sensitive-local
                         * very-sensitive-local
                        [default very-sensitive]

If it makes any difference, I am using MetaPhlAn version 4.0.3 (24 Oct 2022).

Which command-line options would you recommend for my environmental sample with its relatively high average read length? Does it make sense to use --stat_q 0.1 --min_mapq_val -1 --bt2_ps very-sensitive-local?

Thank you for your time!
Best regards,
Bernhard

Hi,

I have not used metaphlan in a long while and used it fairly briefly when I did. I think you might have more success if you re-post as a general post than as a direct message? Sorry if I have misinterpreted the notification and this isn’t a direct message. Good luck!

Liam