CLR normalization for MetaPhlAn proportions


Looking for opinions on CLR normalization when performing downstream analysis on MetaPhlAn proportions output (0-100%).

Should CLR normalization be applied before:

  1. Alpha diversity index calculations, such as Shannon, Pielou evenness, Inverse Simpson etc.?
  2. PCA and CCA
  3. Differential abundance testing

Can I use Pearson's or Spearman's correlation testing with the proportional data, and does this kind of testing also need CLR normalization?

Also, should one filter taxa with low abundance across samples before performing the aforementioned calculations? For example, at the genus level?

Any help would be appreciated,
All the best!

Hi @Ivory,

As you may know, there are many different types of normalization that can be applied to microbiome data. For MaAsLin2, we found that it performs best on taxonomic data with the default parameters, which are TSS normalization followed by a log transformation. However, we are actively researching other procedures to address potential issues with compositionality.

For other analyses, alpha diversity metrics and PCA can be conducted on either TSS-scaled or CLR-transformed data. The important thing to keep in mind is that TSS and CLR view the data from different angles: TSS treats each taxon as a proportion of the whole, whereas CLR expresses each taxon as the log-ratio of its abundance to the geometric mean of all taxa in the sample.
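
To make the difference between the two views concrete, here is a minimal sketch in plain Python. The sample values, the function names, and the pseudocount used to handle zeros are all assumptions for illustration; they are not taken from MaAsLin2 or MetaPhlAn, and the right zero-handling strategy depends on your data.

```python
import math

def tss(values):
    """Total-sum scaling: divide each value by the sample total."""
    total = sum(values)
    return [v / total for v in values]

def clr(values, pseudocount=1e-6):
    """Centered log-ratio: log of each value minus the mean of the logs.

    Subtracting the mean log is the same as dividing by the geometric mean.
    The pseudocount is an assumed workaround for zeros, since log(0) is undefined.
    """
    logs = [math.log(v + pseudocount) for v in values]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical MetaPhlAn-style percentages for one sample (four taxa).
sample = [60.0, 30.0, 10.0, 0.0]

proportions = tss(sample)      # each taxon as a fraction of the whole
clr_values = clr(proportions)  # each taxon relative to the geometric mean

print(proportions)
print(clr_values)
```

TSS values are non-negative and sum to 1, while CLR values are centered on zero within each sample and can be negative, so a taxon below the sample's geometric mean gets a negative value.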

Hope that is helpful,
Jacob Nearing