Best normalization/transformation strategy in MaAsLin2 for 16S rRNA (Nanopore) data?

Hello,

I am working with a 16S rRNA dataset (raw reads obtained with Emu from Nanopore sequencing) from a longitudinal study with different treatment groups and repeated measures over time (several days per sample).

The goal is to identify differential abundances associated with group, time, and their interactions, using MaAsLin2 with random effects for animal ID.

I have a few methodological questions regarding normalization and transformation:

  1. Which is the best normalization and transformation method? Currently, I am using the default parameters (TSS for normalization, LOG for transformation, and LM as the analysis method).

  2. Are there specific recommendations for 16S rRNA data? For instance, would it be preferable to use CLR (Centered Log-Ratio) transformation instead of TSS+LOG?

I would greatly appreciate any guidance on the most recommended practice for this type of data and experimental design.

Thank you very much!

Carla.

I am also using ONT full-length 16S Emu data. Emu outputs relative abundances rather than raw read counts, but you can get estimated read counts with the specific tag. I’m curious to see the response. I was planning on CLR for beta diversity and the MaAsLin approach for differential abundance analysis.

Sorry I missed the original thread earlier - I follow MaAsLin but not the general analysis and stats forum.

  1. I would use TSS and LOG. That’s what we benchmarked, and if you’re using the median comparison for abundance (which is on by default), you account for compositionality much as you would with CLR while keeping results that are far easier to interpret.
  2. No - I would still use TSS + LOG for the reasons above. 16S can (and arguably should) always be converted from reads to relative abundances.
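
For anyone comparing the two options discussed above, here is a small Python sketch of what TSS + LOG and CLR each do to a count table. It is only an illustration, not MaAsLin code: the toy counts are made up, and both the zero handling for the log transform (half the smallest nonzero relative abundance) and the CLR pseudocount of 0.5 are assumptions chosen for this example.

```python
import math

# Toy count table: rows are samples, columns are taxa (made-up numbers).
counts = [
    [120, 30, 0, 50],
    [400, 10, 90, 0],
    [60, 60, 20, 60],
]


def tss(row):
    """Total sum scaling: divide each count by the sample total."""
    total = sum(row)
    return [c / total for c in row]


def log2_half_min(table):
    """log2 transform; zeros are replaced by half the smallest nonzero value.

    The zero replacement rule is an assumption for this sketch.
    """
    smallest = min(v for row in table for v in row if v > 0)
    return [[math.log2(v if v > 0 else smallest / 2) for v in row] for row in table]


def clr(row, pseudocount=0.5):
    """Centered log-ratio: log of each part minus the sample's mean log.

    The pseudocount (an assumption here) avoids taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in row]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


rel_abund = [tss(row) for row in counts]
tss_log = log2_half_min(rel_abund)
clr_values = [clr(row) for row in counts]

for name, table in (("TSS+LOG", tss_log), ("CLR", clr_values)):
    print(name)
    for row in table:
        print("  " + "  ".join(f"{v:7.3f}" for v in row))
```

The interpretability point shows up directly in the output: a TSS + LOG value is a log2 relative abundance, so raising 2 to that value gives back the proportion of the taxon in the sample. A CLR value is relative to the sample's geometric mean, so each sample's CLR values sum to zero and a single value only has meaning in relation to the other taxa.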