Error running MaAsLin3 for genus level data only

Hi there!

I am getting an error when running MaAsLin3 in R, but only for genus-level data. The exact same code with the same metadata runs without any issue on family- and phylum-level counts.

Error in `[.data.frame`(new_normalized_data, rownames(transformed_data),  : 
  undefined columns selected

I have double-checked that the row names in the metadata and the genus taxa table match exactly, and I have removed any special characters from the genus names.

Code I’m trying to run:

genus_out <- maaslin3(input_data = gen_table,
                       input_metadata = metadata_mixed3,
                       output = '~/QIIME2/Xenopus/Maaslin_mixed3_genus_output.txt',
                       formula = 'group(Species) + reads_cleaned + (1|Tank)',
                       normalization = 'TSS',
                       transform = 'LOG',
                       augment = TRUE,
                       standardize = TRUE,
                       max_significance = 0.05,
                       min_prevalence = 0.05,
                       median_comparison_abundance = TRUE,
                       median_comparison_prevalence = FALSE,
                       max_pngs = 10,
                       cores = 1)

Similar code for family level, which runs perfectly:

family_out <- maaslin3(input_data = fam_table,
                      input_metadata = metadata_mixed3,
                      output = '~/QIIME2/Xenopus/Maaslin_mixed3_family_output.txt',
                      formula = 'group(Species) + reads_cleaned + (1|Tank)',
                      normalization = 'TSS',
                      transform = 'LOG',
                      augment = TRUE,
                      standardize = TRUE,
                      max_significance = 0.05,
                      min_prevalence = 0.05,
                      median_comparison_abundance = TRUE,
                      median_comparison_prevalence = FALSE,
                      max_pngs = 10,
                      cores = 1)

Head of the data I’m trying to run:

head(metadata_mixed3)
          Tank Species_Treatment Treatment  Mass Body.Length Development Species overall_treatment detailed_treatment SampleID reads_cleaned
IST113 XLXBXT1        XL_Mixed_3   Mixed_3 0.092        9.60          53      XL             Mixed            Mixed_3   IST113         47066
IST114 XLXBXT1        XT_Mixed_3   Mixed_3 0.046        7.43          48      XT             Mixed            Mixed_3   IST114         22594
IST116 XLXBXT1        XT_Mixed_3   Mixed_3 0.041        7.08          48      XT             Mixed            Mixed_3   IST116         32423
IST117 XLXBXT1        XB_Mixed_3   Mixed_3 0.039        7.59          49      XB             Mixed            Mixed_3   IST117         31246
IST118 XLXBXT1        XB_Mixed_3   Mixed_3 0.041        8.24          49      XB             Mixed            Mixed_3   IST118         38745
IST139 XLXBXT2        XL_Mixed_3   Mixed_3 0.110        9.84          53      XL             Mixed            Mixed_3   IST139         14768
head(gen_table)
       Pseudoxanthobacter 67_14 WPS_2 Bacteroides Chitinivorax Parabacteroides Anaerovorax Acholeplasma Butyricimonas Anaerococcus Erysipelatoclostridium
IST113                  0     0     0        7279            0            2701          89           51             0            0                      0
IST114                  0     0     0        3715            0               0          41            0             0            0                      0
IST116                  0     0     0        6752            0               0         116            0             0            0                      0
IST117                  0     0     0        1306            0             899          56            0             0            0                      0
IST118                  0     0     0        5999            0             438          40            0             0            0                      0
IST139                  0     0     0        2561            0            1265          65            0            10            0                      0

Thanks so much in advance!

Hi,

Nothing seems obviously wrong in the data you posted. Would you be able to send small chunks of the metadata, the family/phylum table that works, and the genus table that doesn’t (willnickols@g.harvard.edu) so I can debug it on my side and figure out what’s wrong? That’ll probably be faster than guessing things over the forum.
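
For anyone preparing a similar reproducible example, a small chunk of a table can be cut and saved with base R alone. This is only a sketch: `save_chunk` is a hypothetical helper, the toy table stands in for `gen_table`, and a small subset may not reproduce the error if the offending column falls outside it.

```r
# Hypothetical helper: keep the first rows/columns of a samples-by-features
# table and write them as tab-separated text with row names preserved.
save_chunk <- function(tbl, path, n_rows = 20, n_cols = 50) {
  chunk <- tbl[seq_len(min(n_rows, nrow(tbl))),
               seq_len(min(n_cols, ncol(tbl))),
               drop = FALSE]
  # col.names = NA writes an empty header cell above the row names
  write.table(chunk, file = path, sep = "\t", quote = FALSE, col.names = NA)
  invisible(chunk)
}

# Toy stand-in for gen_table (values copied from the post above)
toy_gen <- data.frame(Bacteroides = c(7279, 3715, 6752),
                      Anaerovorax = c(89, 41, 116),
                      row.names = c("IST113", "IST114", "IST116"))

chunk_path <- file.path(tempdir(), "genus_chunk.tsv")
save_chunk(toy_gen, chunk_path, n_rows = 2)

# Read it back the same way to confirm the names survive the round trip
chunk_back <- read.delim(chunk_path, row.names = 1, check.names = FALSE)
```

Reading with `check.names = FALSE` keeps the feature names exactly as written, so any unusual names in the real table are not silently altered before the file is shared.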

In the meantime, is there a reason you’re using the group() strategy in the formula? We haven’t found many cases where this is actually preferable over just including the variable directly, and there might be some edge case with group() that’s causing issues.
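
One check that might help narrow things down, offered as a guess and not a confirmed diagnosis: the genus table above contains a column named `67_14`, which starts with a digit and is therefore not a syntactic R name. If some step renames such columns (for example to `X67_14`), a later lookup by the original name could plausibly fail with "undefined columns selected". Duplicated genus names could cause similar trouble. The sketch below flags both cases using base R only; `check_feature_names` is a hypothetical helper and the toy table stands in for `gen_table`.

```r
# Hypothetical helper: report feature (column) names that are duplicated
# or that make.names() would change.
check_feature_names <- function(tbl) {
  nm <- colnames(tbl)
  data.frame(
    name       = nm,
    syntactic  = nm == make.names(nm),
    duplicated = duplicated(nm) | duplicated(nm, fromLast = TRUE),
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for gen_table, using three column names from the post above
toy_gen <- data.frame(Bacteroides = c(7279, 3715),
                      `67_14`     = c(0, 0),
                      WPS_2       = c(0, 0),
                      row.names   = c("IST113", "IST114"),
                      check.names = FALSE)

report  <- check_feature_names(toy_gen)
flagged <- report[!report$syntactic | report$duplicated, ]
print(flagged)
```

If the real table shows flagged names, renaming those columns (for instance with a letter prefix) before the run would test this guess. Separately, including the variable directly instead of using group() would presumably mean a formula such as 'Species + reads_cleaned + (1|Tank)'.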

Will