High unknown estimation in soil sample (almost 80%)

When I run the default command from the MetaPhlAn 3 manual, I get a very high unknown estimation, about 80%:
metaphlan SK_1-forward_paired.fq.gz,SK_1-reverse_paired.fq.gz,SK_1-forward_unpaired.fq.gz,SK_1-reverse_unpaired.fq.gz --bowtie2out sample1.bowtie2.bz2 --nproc 5 --bt2_ps very-sensitive-local --add_viruses --unknown_estimation --input_type fastq -o profiled_sample1.txt

Can you suggest how I can reduce the unknown estimation? Also, what is considered a normal unknown estimation for soil samples?


MetaPhlAn does not have a lot of representative genomes for microorganisms associated with soil, so the unknown value can be pretty high. There’s no way to reduce the unknown value apart from adding more markers from soil-associated microbes.
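
If it helps to compare the unknown value across several samples, here is a minimal sketch for pulling it out of a profile. It assumes the output written with --unknown_estimation is tab-separated and contains a row whose first column is exactly UNKNOWN, with the relative abundance in the third column; please check this against your own file. The function name unknown_fraction is just a hypothetical helper, not part of MetaPhlAn.

```shell
# Hypothetical helper: print the abundance reported on the UNKNOWN row
# of a MetaPhlAn profile.
# Assumes a tab-separated file with the clade name in column 1 and the
# relative abundance in column 3.
unknown_fraction() {
    awk -F'\t' '$1 == "UNKNOWN" { print $3 }' "$1"
}
```

For the run above this would be called as `unknown_fraction profiled_sample1.txt`. If it prints nothing, the profile has no UNKNOWN row in that layout and the column assumption does not hold for your version.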