TSS normalization LOG transformation impact on relative abundance data

Baris_Erhan_Ozdinc · August 13, 2024, 8:44am

When performing default TSS normalization and LOG transformation during Maaslin3 analysis, 0 values are coerced into NAs. Does that have an impact on the overall magnitude difference between signals?

Introducing NAs in place of 0s removes the weakest-signal samples from the abundance comparison. Here is a comparison of filtered_data, filtered_data_norm and filtered_data_norm_transformed from a dataset in which one cohort of preterm infants received the probiotic B. longum and another cohort did not (Nguyen et al 2021). Converting 0 values into NAs led to different average magnitudes of B. longum relative abundance signals between the untreated data and the normalized, transformed data. So here is the question: how does Maaslin3 treat 0/NA values when comparing relative abundance with and without normalization and transformation? Thank you for your time.

In case you want to have a deeper look at the comparison, please see normalization_transformation_check. The spreadsheets in the document are as follows:

filtered_data: all data, untreated

filtered_data_longum: probiotic species data, untreated

filtered_data_norm: TSS normalized data

my_transformation: data where TSS normalization was performed by the user rather than by Maaslin3

filtered_data_norm_transformed: LOG transformed normalized data

filtered_data_norm_transformed_longum: LOG transformed normalized probiotic data

filtered_data_norm_transformed_longum_clean: LOG transformed normalized probiotic data, no NAs

Metadata: metadata

The magnitude of the difference in average B. longum signal between the no-probiotic and probiotic cohorts varies between filtered_data_longum and filtered_data_norm_transformed_longum_clean.
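
To make the effect described above concrete, here is a minimal sketch in Python with NumPy. The counts, the group labels, and all variable names are made up for illustration; they are not taken from normalization_transformation_check or from Maaslin3 itself. It applies TSS, takes log2 without a pseudocount so that zeros become NaN, and compares the group difference computed over all samples with the one computed over detected samples only.

```python
import numpy as np

# Hypothetical toy counts: rows are samples, columns are taxa.
# Column 0 stands in for B. longum; the first three samples are "no probiotic".
counts = np.array([
    [0.0, 50.0, 50.0],
    [0.0, 80.0, 20.0],
    [10.0, 45.0, 45.0],
    [40.0, 30.0, 30.0],
    [60.0, 20.0, 20.0],
    [0.0, 70.0, 30.0],
])
probiotic = np.array([False, False, False, True, True, True])

# TSS: divide each sample by its total so every row sums to 1.
tss = counts / counts.sum(axis=1, keepdims=True)

# Log2 without a pseudocount: zeros are recorded as NaN (missing).
with np.errstate(divide="ignore"):
    log_tss = np.where(tss > 0, np.log2(tss), np.nan)

longum_raw = tss[:, 0]
longum_log = log_tss[:, 0]

# Difference in group means on the untransformed scale, zeros included.
diff_all = longum_raw[probiotic].mean() - longum_raw[~probiotic].mean()

# Difference in group means on the log2 scale, NaN (former zeros) skipped.
diff_log_nonzero = (
    np.nanmean(longum_log[probiotic]) - np.nanmean(longum_log[~probiotic])
)

# Number of samples that actually contribute to each log-scale mean.
n_used_no = int(np.sum(~np.isnan(longum_log[~probiotic])))
n_used_yes = int(np.sum(~np.isnan(longum_log[probiotic])))

print("difference, all samples (proportion scale):", diff_all)
print("difference, detected samples only (log2 scale):", diff_log_nonzero)
print("samples used per group (no, yes):", n_used_no, n_used_yes)
```

The two differences are on different scales and are averaged over different sets of samples, so they are not expected to agree in magnitude.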

nearinj · August 14, 2024, 8:56pm

Hi @Baris_Erhan_Ozdinc ,

Thanks for checking out Maaslin3. As you noticed, 0 values are treated differently in maaslin3 than in maaslin2. In maaslin3, 0 values are used in the logistic regression models (prevalence testing) but then discarded in the linear regression models (abundance). This allows maaslin3 to separate the effect of prevalence from that of abundance.

Because of this two-tiered model system, we are able to log-transform the abundance values without a pseudocount, since the 0s are discarded anyway in the abundance component.
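
As a rough illustration of the two-part idea, the sketch below splits a single feature into a prevalence part (present or absent, zeros included) and an abundance part (log2 of the non-zero values only). It is a descriptive summary in Python with NumPy, not the logistic and linear regressions that maaslin3 fits; the function name `two_part_summary` and the toy values are hypothetical.

```python
import numpy as np

def two_part_summary(rel_abundance, group):
    """Summarize one feature per group as prevalence plus non-zero log2 abundance."""
    rel_abundance = np.asarray(rel_abundance, dtype=float)
    group = np.asarray(group, dtype=bool)
    present = rel_abundance > 0  # zeros count here as "absent"

    summary = {}
    for label, mask in (("no", ~group), ("yes", group)):
        # Abundance part: only samples where the feature was detected.
        detected = rel_abundance[mask & present]
        summary[label] = {
            # Prevalence part: fraction of samples in the group with a non-zero value.
            "prevalence": float(present[mask].mean()),
            "n_abundance": int(detected.size),
            "mean_log2": float(np.log2(detected).mean()) if detected.size else float("nan"),
        }
    return summary

# Hypothetical relative abundances for one taxon in six samples.
longum = [0.0, 0.0, 0.1, 0.4, 0.6, 0.0]
probiotic = [False, False, False, True, True, True]

summary = two_part_summary(longum, probiotic)
for label, stats in summary.items():
    print(label, stats)
```

In this toy data the zeros lower the prevalence of each group but have no influence on `mean_log2`, which is why no pseudocount is needed before taking the log.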

You can check out the tutorial for maaslin3 here:

thanks,
Jacob Nearing
