Incorrect name/value in output

My metadata contains a variable called “Txt_Dose” with the levels “VEH_10”, “VEH_12”, “VEH_14”, “VEH_16”, “MOR_10”, “MOR_12”, “MOR_14”, and “MOR_16”. When I run the following analysis, my name and value outputs come out as Txt_Dose2, Txt_Dose3, etc. Why is it not using the levels of my factor?

fit_dam4_10 = Maaslin2(
  input_data = taxonomy_4_dams,
  input_metadata = metadata_mom_2,
  min_prevalence = 0.1,
  min_abundance = 0.0001,
  max_significance = 0.05,
  normalization = "TSS",
  output = "v4dams_output_10",
  fixed_effects = c("Txt_Dose"),
  reference = c("Txt_Dose,VEH_10")
)

Hi @hharder,

I have never seen this type of issue from MaAsLin before. Would you mind sending some additional information to help us troubleshoot this behavior?

  1. R version
  2. MaAsLin 2 version
  3. Minimally reproducible data (either your data attached here or emailed to me or dummy data that reproduces this behavior)
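
The first two items can be collected from an R session with base functions. This is only a minimal sketch; it assumes the package is installed under the name "Maaslin2", and the variable names `r_version` and `pkg` are purely illustrative.

```r
# Report the R version string so it can be pasted into a reply.
r_version <- R.version.string
print(r_version)

# Assumption: the package is installed under the name "Maaslin2".
pkg <- "Maaslin2"

# Only query the package version if the package is actually installed.
if (requireNamespace(pkg, quietly = TRUE)) {
  print(utils::packageVersion(pkg))
} else {
  message(pkg, " is not installed in this library")
}
```

Running `sessionInfo()` as well gives a fuller picture of the attached packages if more detail is needed.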

Best,
Kelsey

I’m running R 4.2.1 and Maaslin2 1.10.0.
taxonomy_4_dams.csv (198.5 KB)
metadata_4_dam.csv (3.0 KB)

I’ve attached my data, and below is the full code.

fit_dam4_10 = Maaslin2(
  input_data = taxonomy_4_dams,
  input_metadata = metadata_mom_2,
  min_prevalence = 0.1,
  min_abundance = 0.0001,
  max_significance = 0.05,
  normalization = "TSS", 
  output = "v4dams_output_10",
  fixed_effects = c("Txt_Dose"),
  reference = c("Txt_Dose,VEH_10")
)
fit_dam4_12 = Maaslin2(
  input_data = taxonomy_4_dams,
  input_metadata = metadata_4_dam,
  min_prevalence = 0.1,
  min_abundance = 0.0001,
  max_significance = 0.05,
  normalization = "TSS", 
  output = "v4dams_output_12",
  fixed_effects = c("Txt_Dose"),
  reference = c("Txt_Dose,VEH_12")
)
fit_dam4_14 = Maaslin2(
  input_data = taxonomy_4_dams,
  input_metadata = metadata_4_dam,
  min_prevalence = 0.1,
  min_abundance = 0.0001,
  max_significance = 0.05,
  normalization = "TSS", 
  output = "v4dams_output_14",
  fixed_effects = c("Txt_Dose"),
  reference = c("Txt_Dose,VEH_14")
)
fit_dam4_16 = Maaslin2(
  input_data = taxonomy_4_dams,
  input_metadata = metadata_4_dam,
  min_prevalence = 0.1,
  min_abundance = 0.0001,
  max_significance = 0.05,
  normalization = "TSS", 
  output = "v4dams_output_16",
  fixed_effects = c("Txt_Dose"),
  reference = c("Txt_Dose,VEH_16")
)
dam4_10_results <- fit_dam4_10$results
dam4_12_results <- fit_dam4_12$results
dam4_14_results <- fit_dam4_14$results
dam4_16_results <- fit_dam4_16$results
dam4_10_results$ref <- c(rep("VEH_10", times = 2051))
dam4_12_results$ref <- c(rep("VEH_12", times = 2051))
dam4_14_results$ref <- c(rep("VEH_14", times = 2051))
dam4_16_results$ref <- c(rep("VEH_16", times = 2051))
dam4_results_combined <- rbind(dam4_10_results, dam4_12_results, dam4_14_results, dam4_16_results)
dam4_results_combined <- dam4_results_combined %>%
  dplyr::select(feature, ref, value, coef, stderr, pval)
dam4_results_combined$qval <- p.adjust(dam4_results_combined$pval, method = "BH")
write.table(dam4_results_combined, file = "C:\\Users\\hharder1\\Documents\\dam4_results_combined.csv", sep=",", row.names = FALSE)
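
As a side note on the labelling step above, the hard-coded row count (2051) is not strictly needed, because assigning a single string to a data frame column recycles it to every row. The sketch below uses small made-up data frames (`res_10`, `res_12`, and the taxon names are hypothetical stand-ins for the real `fit$results` tables), so it only illustrates the pattern.

```r
# Hypothetical stand-ins for fit$results from two runs.
res_10 <- data.frame(feature = c("taxonA", "taxonB"), pval = c(0.01, 0.20))
res_12 <- data.frame(feature = c("taxonA", "taxonB"), pval = c(0.03, 0.50))

# Assigning a single string recycles it to every row, so no row count is needed.
res_10$ref <- "VEH_10"
res_12$ref <- "VEH_12"

# Stack the per-reference tables into one.
combined <- rbind(res_10, res_12)

# Benjamini-Hochberg adjustment across all pooled p-values, as in the code above.
combined$qval <- p.adjust(combined$pval, method = "BH")
print(combined)
```

This keeps working if the number of rows in the results changes between runs.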

Here is a screenshot of the problem:

The value column should list the levels of Txt_Dose from the metadata file, but instead it just returns 2-7. If I were 100% sure which level was represented by each value, that would work, but I’m not sure.
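
One way to at least inspect the order in which R stores the levels is to rebuild the factor with the chosen reference moved to the front. This is a sketch on dummy metadata (only the column name and level names come from this thread), and whether the numeric suffixes in the output actually follow these positions is an assumption that has not been confirmed here.

```r
# Dummy metadata standing in for the real file; only the column name and
# level names are taken from the post above.
metadata <- data.frame(
  Txt_Dose = c("VEH_10", "VEH_12", "VEH_14", "VEH_16",
               "MOR_10", "MOR_12", "MOR_14", "MOR_16"),
  stringsAsFactors = FALSE
)

# Convert to a factor (levels sort alphabetically by default) and move the
# chosen reference level to the front.
dose <- relevel(factor(metadata$Txt_Dose), ref = "VEH_10")

# Position of each level in the factor; position 1 is the reference.
level_index <- data.frame(index = seq_along(levels(dose)),
                          level = levels(dose))
print(level_index)
```

Comparing this table against coefficients for a feature with a known direction of effect would be one way to sanity-check the mapping.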

I have done some more testing on this problem, and still have been unable to solve it. It seems to be something about the data itself, as when I rerun the analysis using only “Txt” as a fixed effect rather than “Txt_Dose”, I get the same problem - output returns values of 1 instead of “MOR”. I’m not getting any errors or any strange responses in the log file.

I’ve just managed to fix it: apparently it was the underscore in the column name “Txt_Dose” that was messing it up. After renaming the column to “TxtDose”, it started working. MaAsLin doesn’t seem to mind that the levels have underscores in them, though.
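
For anyone else hitting this, the column rename described above can be done in base R before calling Maaslin2. This is a sketch on a dummy metadata frame (the sample names are made up), not the exact code I ran.

```r
# Dummy metadata; sample names are hypothetical, the column and level names
# match the ones discussed in this thread.
metadata <- data.frame(
  Txt_Dose = c("VEH_10", "MOR_10"),
  row.names = c("sample1", "sample2")
)

# Rename the column in place, leaving the level values untouched.
names(metadata)[names(metadata) == "Txt_Dose"] <- "TxtDose"
print(names(metadata))

# The Maaslin2 call then has to use the new name in both arguments, e.g.
#   fixed_effects = c("TxtDose"),
#   reference = c("TxtDose,VEH_10")
```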

Hi @hharder,

@andrewGhazi from our group tested the data you uploaded and could not replicate this issue, because he ran MaAsLin with the GitHub version instead of the older Bioconductor release. I misread your original post; this is something we have seen before. It has to do with an R bug in the way that we were handling the reference in older code versions (which is unfortunately still the Bioconductor version; we are working on pushing the newer code now).

If you re-install MaAsLin from GitHub*, it should solve this bug. Apologies for the confusion!
Code:

install.packages("devtools")
library("devtools")
install_github("biobakery/Maaslin2")

*We do not normally recommend installing MaAsLin in this fashion.

Best,
Kelsey