And just to confirm: is a polymorphic site one where the sample differs from the reference marker’s sequence, or one where the base call within the sample is ambiguous, which could arise, for instance, from multiple strains being present in the metagenomic sample?
Hi @raufs
It is the second case. When reconstructing the marker sequences using CMSeq (https://github.com/SegataLab/cmseq), a polymorphic site is called if the frequency of the dominant allele is lower than 80%.
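
To make the 80% rule concrete, here is a minimal Python sketch of that kind of call. It is not CMSeq's code: the function name `is_polymorphic`, the toy pileup columns, and the handling of zero-coverage sites are all assumptions made for illustration, and the real tool may apply additional filters that are not reproduced here.

```python
from collections import Counter

# Threshold taken from the reply above: a dominant allele frequency
# below 80% marks the site as polymorphic.
DOMINANT_FREQ_THRESHOLD = 0.8


def is_polymorphic(bases, threshold=DOMINANT_FREQ_THRESHOLD):
    """Return True if the dominant allele frequency at one site is below threshold.

    `bases` is a string of the bases observed at a single position across
    the reads covering it (a hypothetical input format for this sketch).
    """
    # Count only unambiguous nucleotides; anything else (N, gaps) is ignored.
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        # No usable coverage: this sketch makes no call (an assumption).
        return False
    dominant_freq = counts.most_common(1)[0][1] / total
    return dominant_freq < threshold


# Toy pileup: one string of observed bases per marker position.
pileup = ["AAAAAAAAAA", "AAAAAAACCC", "GGGGGGGGTT", "ACGT"]
calls = [is_polymorphic(column) for column in pileup]
print(calls)  # [False, True, False, True]
```

The second column (70% A) and the fourth (25% each) are called polymorphic. The third sits exactly at 80% and is not, since the rule as stated is "lower than 80%".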