When Maaslin3 performs its default TSS normalization and LOG transformation, 0 values are coerced into NAs. Does that affect the overall magnitude of the difference between signals?
Introducing NAs in place of 0s removes the samples with the weakest signal from the abundance comparison. Here is a comparison of filtered_data, filtered_data_norm and filtered_data_norm_transformed from a dataset in which one cohort of preterm infants received the probiotic *B. longum* and another cohort did not (Nguyen et al. 2021). Converting 0 values into NAs led to different average magnitudes of the *B. longum* relative abundance signal between the conditions without and with normalization and transformation.
So here is the question: how does Maaslin3 treat 0/NA values when comparing relative abundance without and with normalization and transformation? Thank you for your time.
In case you want to have a deeper look at the comparison, please see normalization_transformation_check. The spreadsheets in that document are as follows:
filtered_data: all data, untreated
filtered_data_longum: probiotic species data, untreated
filtered_data_norm: TSS normalized data
my_transformation: data with TSS normalization performed by the user rather than by Maaslin3
filtered_data_norm_transformed: LOG transformed normalized data
filtered_data_norm_transformed_longum: LOG transformed normalized probiotic data
filtered_data_norm_transformed_longum_clean: LOG transformed normalized probiotic data, no NAs
Metadata: metadata
The magnitude of the difference in average B. longum signal between the probiotic and no-probiotic cohorts varies depending on whether it is computed from filtered_data_longum or from filtered_data_norm_transformed_longum_clean.
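To make the concern concrete, here is a minimal Python sketch of the effect I am describing. The counts are made up for illustration (they are not taken from the spreadsheet), the helper names are my own, and this is not Maaslin3's code. It only assumes that a zero becomes a missing value once the log is taken and that missing values are then left out of the average.

```python
import math

def tss(counts):
    """Total sum scaling: divide each count in a sample by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]

def log2_zero_to_nan(x):
    """log2 of a positive value; a zero becomes NaN (standing in for R's NA)."""
    return math.log2(x) if x > 0 else float("nan")

def mean_skip_nan(values):
    """Mean over the non-NaN values only, like mean(x, na.rm = TRUE) in R."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")

# Hypothetical counts per sample: [B. longum, all other taxa combined].
probiotic = [[800, 200], [600, 400], [900, 100]]
control = [[0, 1000], [0, 500], [100, 900], [50, 950]]

def longum_rel_abundance(samples):
    """TSS-normalize each sample and keep the B. longum fraction."""
    return [tss(s)[0] for s in samples]

rel_probiotic = longum_rel_abundance(probiotic)
rel_control = longum_rel_abundance(control)

# Untransformed comparison: zeros stay in and pull the control mean down.
diff_with_zeros = mean_skip_nan(rel_probiotic) - mean_skip_nan(rel_control)

# Log comparison: zeros become NaN, so only 2 of 4 control samples contribute.
log_probiotic = [log2_zero_to_nan(x) for x in rel_probiotic]
log_control = [log2_zero_to_nan(x) for x in rel_control]
diff_log = mean_skip_nan(log_probiotic) - mean_skip_nan(log_control)

print("mean difference, relative abundance (zeros kept):", diff_with_zeros)
print("mean difference, log2 scale (zeros dropped as NaN):", diff_log)
print("control samples used on log scale:",
      sum(not math.isnan(v) for v in log_control), "of", len(log_control))
```

In this toy example the control cohort's average on the log scale is computed from only the samples where B. longum was detected, so the two comparisons are not averaging over the same set of samples. That is the behaviour I would like to understand in Maaslin3.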