Including random effect causes R session to abort

I am using maaslin3 v0.99.1 and can run the analysis when including only Treatment as a fixed effect, but when I include CageID as a random effect, RStudio aborts with “R session aborted: R encountered a fatal error.”

ref<-"Treatment,CD"
fe<-"Treatment"
re<-"CageID"  ##this causes fatal error
re<-NULL  ##this works

fit_data = maaslin3(input_data     = df.mas, 
                    input_metadata   = data, 
                    min_prevalence   = 0.01,
                    normalization    = "NONE",
                    standardize      = FALSE,
                    transform        = "LOG",
                    output           = "test", 
                    fixed_effects    = fe,
                    random_effects   = re,
                    reference        = ref,
                    summary_plot_first_n = 100
                    )

maaslin3.log:
2024-12-18 16:19:02.74499 INFO::Writing function arguments to log file
2024-12-18 16:19:02.747998 DEBUG::Function arguments
2024-12-18 16:19:02.748952 DEBUG::Output folder: test
2024-12-18 16:19:02.750032 DEBUG::Formula:
2024-12-18 16:19:02.751002 DEBUG::Fixed effects: Treatment
2024-12-18 16:19:02.75177 DEBUG::Reference: Treatment,CD
2024-12-18 16:19:02.752415 DEBUG::Random effects: CageID
2024-12-18 16:19:02.753088 DEBUG::Group effects:
2024-12-18 16:19:02.753731 DEBUG::Ordered effects:
2024-12-18 16:19:02.754375 DEBUG::Strata effects:
2024-12-18 16:19:02.755014 DEBUG::Min Abundance: 0.000000
2024-12-18 16:19:02.755664 DEBUG::Min Prevalence: 0.010000
2024-12-18 16:19:02.756399 DEBUG::Zero Threshold: 0.000000
2024-12-18 16:19:02.757051 DEBUG::Min variance: 0.000000
2024-12-18 16:19:02.757669 DEBUG::Max significance: 0.100000
2024-12-18 16:19:02.75826 DEBUG::Normalization: NONE
2024-12-18 16:19:02.758895 DEBUG::Transform: LOG
2024-12-18 16:19:02.759509 DEBUG::Correction method: BH
2024-12-18 16:19:02.760107 DEBUG::Standardize: FALSE
2024-12-18 16:19:02.760718 DEBUG::Abundance median comparison: TRUE
2024-12-18 16:19:02.761307 DEBUG::Prevalence median comparison: FALSE
2024-12-18 16:19:02.761931 DEBUG::Abundance median comparison threshold: 0
2024-12-18 16:19:02.762536 DEBUG::Prevalence median comparison threshold: 0
2024-12-18 16:19:02.763127 DEBUG::Subtract median: FALSE
2024-12-18 16:19:02.763725 DEBUG::Warn prevalence: TRUE
2024-12-18 16:19:02.764314 DEBUG::Augment: TRUE
2024-12-18 16:19:02.764923 DEBUG::Evaluate only:
2024-12-18 16:19:02.765533 DEBUG::Cores: 1
2024-12-18 16:19:02.766124 DEBUG::Balanced Summary plot: FALSE
2024-12-18 16:19:02.766746 INFO::Verifying options selected are valid
2024-12-18 16:19:02.768457 INFO::Determining format of input files
2024-12-18 16:19:02.769899 INFO::Input format is data samples as columns and metadata samples as rows
2024-12-18 16:19:02.771673 DEBUG::Transformed data so samples are rows
2024-12-18 16:19:02.772453 DEBUG::A total of 34 samples were found in both the data and metadata
2024-12-18 16:19:02.773106 DEBUG::Reordering data/metadata to use same sample ordering
2024-12-18 16:19:02.774253 INFO::Formula for random effects: expr ~ (1 | CageID)
2024-12-18 16:19:02.775769 INFO::Formula for fixed effects: expr ~ Treatment
2024-12-18 16:19:02.777551 INFO::Running selected normalization method: NONE
2024-12-18 16:19:02.77962 INFO::Writing normalized data to file test/features/data_norm.tsv
2024-12-18 16:19:02.782992 INFO::Filter data based on min abundance and min prevalence
2024-12-18 16:19:02.78432 INFO::Total samples in data: 34
2024-12-18 16:19:02.785696 INFO::Min samples required with min abundance for a feature not to be filtered: 0.340000
2024-12-18 16:19:02.790075 INFO::Total filtered features: 0
2024-12-18 16:19:02.791567 INFO::Filtered feature names from abundance and prevalence filtering:
2024-12-18 16:19:02.793491 INFO::Total features filtered by non-zero variance filtering: 0
2024-12-18 16:19:02.794836 INFO::Filtered feature names from variance filtering:
2024-12-18 16:19:02.79619 INFO::Writing filtered data to file test/features/filtered_data.tsv
2024-12-18 16:19:02.799489 INFO::Running selected transform method: LOG
2024-12-18 16:19:02.801422 INFO::Writing normalized, filtered, transformed data to file test/features/data_transformed.tsv
2024-12-18 16:19:02.804887 INFO::Bypass z-score application to metadata
2024-12-18 16:19:02.80621 INFO::Running the linear model component
2024-12-18 16:19:03.091519 INFO::Fitting model to feature number 1, p__Bacteroidetes.f__Bacteroidaceae

Hi,

It seems the random intercept is causing your machine to run out of memory. How big is your dataset? Also, have you made sure the random effect is a factor or character and not something else? If those both seem fine, have you tried running the random effect from the tutorial? Does that work for you?
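
A minimal sketch of the factor/character check, assuming a small made-up metadata frame (the `metadata` object and its values are hypothetical stand-ins, not the poster's `data`):

```r
# Hypothetical metadata; CageID is numeric on purpose to show the check.
metadata <- data.frame(
  Treatment = factor(c("CD", "CD", "HFD", "HFD", "CD", "HFD")),
  CageID    = c(1, 1, 2, 2, 3, 3),
  row.names = paste0("sample", 1:6)
)

# Inspect how R currently stores the grouping variable.
print(class(metadata$CageID))

# Convert to a factor if it is neither a factor nor a character vector.
if (!is.factor(metadata$CageID) && !is.character(metadata$CageID)) {
  metadata$CageID <- factor(metadata$CageID)
}

# Count samples per cage to confirm the grouping looks as expected.
print(table(metadata$CageID))
```

The same two checks (`class()` and `table()`) can be run on the real metadata column before passing it as a random effect.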

Will

Hi Will,

I don’t believe it is a memory issue, as I have tried with a smaller dataset and monitored the memory usage (128 GB total) and did not see any significant change, although the crash still occurs. The random effect is a factor in the dataframe. I have also now tried with the tutorial data, and it also causes RStudio to crash when including the random effect as in this section. Any suggestions are appreciated.

Thanks,

Dion

Hi Dion,

If I understand your memory usage, 128 Gb seems really high (MaAsLin 3 should be taking less than ~4 Gb unless you’re parallelizing heavily). Still, if the tutorial data is causing the crash as well, maybe there’s some other issue. The tutorial data works on my computer (8 Gb RAM) as well as Linux, MacOS, and Windows in our latest automatic GitHub actions checks, so I’m pretty sure the tutorial should work.

I wonder if there’s something strange with your R or lme4 installation. If you’re familiar with Conda, can you try creating a new Conda environment, installing R, installing MaAsLin 3, and then running the tutorial code? If you have access to more than one computer (or local vs. remote compute), can you try running the tutorial code on a different system and seeing if that works?
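
One way to test the lme4 installation on its own, outside MaAsLin 3, is to fit a small random-intercept model directly. This is only a sketch: it uses the `sleepstudy` dataset that ships with lme4 rather than any data from this thread, and it assumes lme4 is installed in the active library. If this also aborts the session, the problem is likely in the R/lme4 setup rather than in MaAsLin 3.

```r
# Fit a random-intercept model with lme4 alone, using its bundled example data.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)
  # Print the fixed-effect estimates to confirm the fit completed.
  print(lme4::fixef(fit))
} else {
  message("lme4 is not installed in this library")
}
```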

Will

Hi Will,

I meant that I have 128Gb RAM installed on my system, not usage - sorry for the confusion. I don’t recall the exact usage, but it was a small fraction of the total. I will try creating a new environment and let you know if that fixes it.

Thanks,

Dion

Hi Will,

Setting up a fresh conda environment did indeed work: I can now run with random effects using both the tutorial data and my own datasets. I’m not sure what the cause was, although just before it crashed I did see a warning in the maaslin output that I hadn’t seen before, indicating an incompatibility between lme4 and Matrix:

Warning: ABI version mismatch: lme4 was built with Matrix ABI version 1 Current Matrix ABI version is 2
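
For anyone who sees the same warning and would rather not rebuild the whole environment, a possible alternative is to reinstall lme4 from source so it is compiled against the Matrix version currently installed. This was not tried in this thread, so treat it as an untested sketch. It needs a working compiler toolchain, and the install line is left commented out so the snippet only reports versions by default.

```r
# List the installed versions of the two packages named in the warning,
# without loading either of them.
pkgs <- c("Matrix", "lme4")
inst <- installed.packages()
versions <- inst[rownames(inst) %in% pkgs,
                 c("Package", "Version", "Built"),
                 drop = FALSE]
print(versions)

# Uncomment to rebuild lme4 against the current Matrix (requires build tools).
# install.packages("lme4", type = "source")
```

Restart R after reinstalling so the rebuilt package is the one that gets loaded.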

Thanks again for the help!

Dion

Strange issue, but glad it’s working now!