Large number of unmapped reads with MetaPhlAn v4.0 on a soil community


I sequenced a soil community that I obtained by mixing soil with 0.85% saline and separating the liquid from the soil.

It was sent for metagenomic Illumina sequencing, and I analysed it with the default settings of MetaPhlAn/4.0.6-foss-2022a (but with estimation of unclassified reads added on the end).

For some reason, it is only picking up the archaea (which make up < 1% of the sample); the rest of the reads (>99%) are unmapped. I tried adjusting the requirement for a positive detection to 0.1. It then picked up one bacterial species, but more than 99% of the reads were still unmapped.

The sequencing service I used generated a Krona report, which did have lots of unmapped reads (80%) but still picked up thousands of bacterial species (typical soil organisms). Additionally, I have previously amplicon-sequenced the sample and seen hundreds of genera detected.

I also grew the inoculum in broth and sequenced that, and 40 or so species were detected after growth (so the bacteria are in the inoculum).

Does anyone have any suggestions as to why I have so many unmapped reads and why it's not picking up any bacteria at all, just 0.2% archaea?

Something similar happened to me with my sand samples.
I suggest moving to Kaiju or Kraken. They are not perfect, but they are able to assign a useful percentage of the community.
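
When comparing tools on the same sample, it can help to pull the top-level numbers (unclassified versus each kingdom) out of the MetaPhlAn profile first. Below is a minimal Python sketch of that. It assumes the usual tab-separated profile layout, with `#` comment lines, the clade name in the first column, the relative abundance (in %) in the third column, and an `UNCLASSIFIED` row when unclassified estimation is enabled. The function name `summarize_profile` and the example values are made up for illustration and are not from the run described above.

```python
import io


def summarize_profile(handle):
    """Return {top-level clade: relative abundance in %} from a profile.

    Assumes a MetaPhlAn-style table: '#' comment lines, then tab-separated
    rows with the clade name in column 1 and the abundance in column 3.
    """
    summary = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        clade = fields[0]
        # Lower ranks are written as 'k__X|p__Y|...'; keep only rows with
        # no '|', i.e. kingdoms and the UNCLASSIFIED row.
        if "|" in clade:
            continue
        summary[clade] = float(fields[2])
    return summary


# Illustrative (made-up) profile text, not real output.
example = (
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species\n"
    "UNCLASSIFIED\t-1\t99.8\t\n"
    "k__Archaea\t2157\t0.2\t\n"
    "k__Archaea|p__Thaumarchaeota\t2157|651137\t0.2\t\n"
)

for clade, abundance in summarize_profile(io.StringIO(example)).items():
    print(f"{clade}\t{abundance}")
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper. If the column order in your profile differs (for example, when a different output type was requested), the column index would need adjusting.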