Maaslin2 output error

Hi Friends,

I have used Maaslin2 to run an association analysis with two input files: a taxa summary file and a metadata file. Before running my actual datasets, I first ran datasets that are known to work with Maaslin2, and I got the heatmap with the associations. But when I run my actual datasets, it gives me an error like the following:

[1] "Warning: Deleting existing log file: .//maaslin2.log
2020-09-04 06:09:11 INFO::Writing function arguments to log file
2020-09-04 06:09:11 INFO::Verifying options selected are valid
2020-09-04 06:09:12 INFO::Determining format of input files
2020-09-04 06:09:12 INFO::Input format is data samples as rows and metadata samples as rows
2020-09-04 06:09:12 INFO::Formula for fixed effects: expr ~ Bacteria1 + Bacteria2 + Bacteria3 + Bacteria4 + Bacteria5 + Bacteria6 + Bacteria7 + Bacteria8 + Bacteria9 + Bacteria10 + Bacteria11 + Bacteria12 + Bacteria13 + Bacteria14 + Bacteria15 + Bacteria16 + Bacteria17 + Bacteria18 + Bacteria19 + Bacteria20 + Bacteria21 + Bacteria22 + Bacteria23 + Bacteria24 + Bacteria25 + Bacteria26 + Bacteria27 + Bacteria28 + Bacteria29 + Bacteria30 + Bacteria31 + Bacteria32 + Bacteria33 + Bacteria34 + Bacteria35 + Bacteria36 + Bacteria37 + Bacteria38 + Bacteria39 + Bacteria40 + Bacteria41 + Bacteria42 + Bacteria43 + Bacteria44 + Bacteria45 + Bacteria46 + Bacteria47 + Bacteria48 + Bacteria49 + Bacteria50 + Bacteria51 + Bacteria52 + Bacteria53 + Bacteria54 + Bacteria55 + Bacteria56 + Bacteria57 + Bacteria58 + Bacteria59 + Bacteria60 + Bacteria61 + Bacteria62 + Bacteria63 + Bacteria64
2020-09-04 06:09:12 INFO::Running selected normalization method: TSS
2020-09-04 06:09:12 INFO::Filter data based on min abundance and min prevalence
2020-09-04 06:09:12 INFO::Total samples in data: 30
2020-09-04 06:09:12 INFO::Min samples required with min abundance for a feature not to be filtered: 3.000000
2020-09-04 06:09:12 INFO::Total filtered features: 0
2020-09-04 06:09:12 INFO::Filtered feature names:
2020-09-04 06:09:12 INFO::Applying z-score to standardize continuous metadata
2020-09-04 06:09:12 INFO::Running selected transform method: LOG
2020-09-04 06:09:12 INFO::Running selected analysis method: LM
| | 0 % ~calculating 2020-09-04 06:09:12 INFO::Fitting model to feature number 1, X1
Error in glm.fit(x = structure(numeric(0), .Dim = c(0L, 65L), .Dimnames = list( :
object 'fit' not found
In addition: Warning messages:
1: In glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, :
no observations informative at iteration 1
2: glm.fit: algorithm did not converge
3: In glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, :
no observations informative at iteration 1
4: glm.fit: algorithm did not converge
2020-09-04 06:09:12 WARNING::Fitting problem for feature 1 returning NA
|+ | 2 % ~01s 2020-09-04 06:09:12 INFO::Fitting model to feature number 2, X2
Error in glm.fit(x = structure(numeric(0), .Dim = c(0L, 65L), .Dimnames = list( :
object 'fit' not found
In addition: Warning messages:
1: In glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, :
no observations informative at iteration 1
2: glm.fit: algorithm did not converge
3: In glm.fit(x = numeric(0), y = numeric(0), weights = NULL, start = NULL, :
no observations informative at iteration 1
4: glm.fit: algorithm did not converge

I checked my files and they are tab-separated. This is really frustrating. It would be a great help if someone could suggest how to rectify the issue.

Thanks

Arun

Hi @Adharna - looking at your fixed-effects description, you have expr ~ Bacteria1 + Bacteria2 + ..., which is an unusual way to run MaAsLin 2, as the formula is usually flipped (e.g. expr ~ Metadata1 + Metadata2 + ..., where expr = microbiome feature abundance). Can you share a minimal reproducible example and the associated data so we can debug the error on our end?

Hi @himel.mallick, thank you for your reply. I have attached my datasets. Please have a look and let me know if there is an error on my side.

Thanks
Arun

x_taxa1.tsv (16.1 KB) y_meta1.tsv (9.4 KB)

Could you please share your MaAsLin 2 script that led to the error?

Looking at your metadata file, there seem to be almost as many metadata variables as taxa, and many of them are all-zero columns. I am not sure MaAsLin 2 is an appropriate tool for more than a few covariates (e.g. >10), as we don't have feature selection functionality implemented as of yet.

For a better diagnosis of your problem, could you please provide more context about your research question, along with a brief description of the datasets and what exactly you are trying to achieve?

Many thanks,
Himel

The script is as follows:

```r
library(Maaslin2)
input_metadata <- read.table("x_taxa1.tsv", sep = "\t", row.names = 1, header = TRUE)
input_taxadata <- read.table("y_meta1.tsv", sep = "\t", row.names = 1, header = TRUE)
fit_Maaslin2 <- Maaslin2(input_data = input_taxadata, input_metadata = input_metadata, output = "./")
```

Actually, I am trying to test for associations between the pathways I obtained from a PICRUSt analysis and the gut microbiome.

Thanks
Arun

Thanks for sharing the code. It looks like your input_data and input_metadata are flipped; they should be exactly the opposite. In other words, x_taxa1.tsv should be supplied as input_data, and vice versa.
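For reference, a corrected version of the script above might look like the following (a sketch assuming the same file names and working directory; the only change is that x_taxa1.tsv now feeds input_data and y_meta1.tsv feeds input_metadata):

```r
library(Maaslin2)

# Taxa/feature table (samples x features) -> input_data
input_taxadata <- read.table("x_taxa1.tsv", sep = "\t",
                             row.names = 1, header = TRUE)

# Sample metadata (samples x covariates) -> input_metadata
input_metadata <- read.table("y_meta1.tsv", sep = "\t",
                             row.names = 1, header = TRUE)

fit_Maaslin2 <- Maaslin2(input_data     = input_taxadata,
                         input_metadata = input_metadata,
                         output         = "./")
```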

Hi @himel.mallick, actually I already tried it as you suggested, but the error is the same. I could not find the real issue.

Thanks

As I mentioned earlier, if you have that many metadata variables to associate, MaAsLin 2 is not the right tool for your purpose; it is designed for only a few carefully curated metadata. Your metadata file contains columns with only zeroes, which should be removed. Further, if the columns are highly correlated, the default linear model will fail to produce reliable results. So please make sure you carefully curate and retain only the most important metadata (possibly fewer than 10 or so) before running MaAsLin 2. Unfortunately, curating metadata for reliable results is beyond the scope of this tool; you must do it independently using other tools or analyses.
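As a minimal sketch of that kind of pre-filtering in base R (the column names, values, and the 0.9 correlation cutoff are illustrative assumptions, not MaAsLin 2 defaults):

```r
# Toy metadata illustrating the two problems above: an all-zero
# column and a near-duplicate (highly correlated) column pair.
meta <- data.frame(
  age      = c(34, 51, 47, 29, 62),
  bmi      = c(27.5, 22.1, 30.2, 25.0, 21.8),
  bmi_copy = c(27.5, 22.1, 30.3, 25.0, 21.8),  # near-duplicate of bmi
  unused   = c(0, 0, 0, 0, 0)                  # all-zero column
)

# 1) Remove all-zero columns.
meta <- meta[, colSums(meta != 0) > 0, drop = FALSE]

# 2) Remove one column from each highly correlated pair
#    (|r| > 0.9 is an arbitrary cutoff chosen for this example).
cm <- abs(cor(meta))
cm[upper.tri(cm, diag = TRUE)] <- 0   # keep lower triangle only
meta <- meta[, rowSums(cm > 0.9) == 0, drop = FALSE]

colnames(meta)  # "age" "bmi"
```

Zeroing the upper triangle (and diagonal) of the correlation matrix means each correlated pair is counted once, so exactly one member of the pair is dropped.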

Thanks @himel.mallick for your suggestion.

Arun