Xtfrm error with Maaslin2 default example in R

Maaslin2 1.13.0

I was running the ready-made example from the help page of the Maaslin2 function:

input_data <- system.file(
             'extdata','HMP2_taxonomy.tsv', package="Maaslin2")

input_metadata <-system.file(
             'extdata','HMP2_metadata.tsv', package="Maaslin2")

fit_data <- Maaslin2(
             input_data, input_metadata,'demo_output', transform = "AST",
             fixed_effects = c('diagnosis', 'dysbiosisnonIBD','dysbiosisUC','dysbiosisCD', 'antibiotics', 'age'),
             random_effects = c('site', 'subject'),
             normalization = 'NONE',
             reference = 'diagnosis,nonIBD',
             standardize = FALSE)

This leads to the following error:

....
2023-04-28 10:41:54.896843 INFO::Writing heatmap of significant results to file: demo_output/heatmap.pdf
Error in xtfrm.data.frame(x) : cannot xtfrm data frames
In addition: Warning messages:
1: Model failed to converge with 1 negative eigenvalue: -5.6e+00 
2: Model failed to converge with 1 negative eigenvalue: -1.1e+01 
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.00291214 (tol = 0.002, component 1)
4: Model failed to converge with 1 negative eigenvalue: -2.1e+02 
5: Model failed to converge with 1 negative eigenvalue: -2.2e+02 

Information about my R session:

> sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /home/xxx/bin/R-4.3.0/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Mariehamn
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] doRNG_1.8.6                     rngtools_1.5.2                 
 [3] foreach_1.5.2                   ANCOMBC_2.2.0                  
 [5] lubridate_1.9.2                 forcats_1.0.0                  
 [7] stringr_1.5.0                   dplyr_1.1.2                    
 [9] purrr_1.0.1                     readr_2.1.4                    
[11] tidyr_1.3.0                     tibble_3.2.1                   
[13] ggplot2_3.4.2                   tidyverse_2.0.0                
[15] knitr_1.42                      MicrobiomeStat_1.1             
[17] Maaslin2_1.13.0                 ALDEx2_1.32.0                  
[19] zCompositions_1.4.0-1           truncnorm_1.0-9                
[21] NADA_1.6-1.1                    survival_3.5-5                 
[23] MASS_7.3-59                     tidySummarizedExperiment_1.10.0
[25] patchwork_1.1.2.9000            mia_1.8.0                      
[27] MultiAssayExperiment_1.26.0     TreeSummarizedExperiment_2.8.0 
[29] Biostrings_2.68.0               XVector_0.40.0                 
[31] SingleCellExperiment_1.22.0     SummarizedExperiment_1.30.0    
[33] Biobase_2.60.0                  GenomicRanges_1.52.0           
[35] GenomeInfoDb_1.36.0             IRanges_2.34.0                 
[37] S4Vectors_0.38.0                BiocGenerics_0.46.0            
[39] MatrixGenerics_1.12.0           matrixStats_0.63.0             
[41] BiocStyle_2.28.0                rebook_1.9.0                   

loaded via a namespace (and not attached):
  [1] bitops_1.0-7                DirichletMultinomial_1.42.0
  [3] doParallel_1.0.17           httr_1.4.5                 
  [5] numDeriv_2016.8-1.1         backports_1.4.1            
  [7] tools_4.3.0                 utf8_1.2.3                 
  [9] R6_2.5.1                    vegan_2.6-4                
 [11] lazyeval_0.2.2              mgcv_1.8-42                
 [13] rhdf5filters_1.12.0         permute_0.9-7              
 [15] withr_2.5.0                 gridExtra_2.3              
 [17] cli_3.6.1.9000              logging_0.10-108           
 [19] biglm_0.9-2.1               sandwich_3.0-2             
 [21] mvtnorm_1.1-3               robustbase_0.95-1          
 [23] pbapply_1.7-0               proxy_0.4-27               
 [25] yulab.utils_0.0.6           foreign_0.8-84             
 [27] scater_1.28.0               decontam_1.20.0            
 [29] readxl_1.4.2                rstudioapi_0.14            
 [31] RSQLite_2.3.1               generics_0.1.3             
 [33] Matrix_1.5-4                biomformat_1.28.0          
 [35] ggbeeswarm_0.7.1            fansi_1.0.4                
 [37] DescTools_0.99.48           DECIPHER_2.28.0            
 [39] lifecycle_1.0.3             multcomp_1.4-23            
 [41] yaml_2.3.7                  rhdf5_2.44.0               
 [43] grid_4.3.0                  blob_1.2.4                 
 [45] crayon_1.5.2                dir.expiry_1.8.0           
 [47] lattice_0.21-8              beachmat_2.16.0            
 [49] CodeDepends_0.6.5           pillar_1.9.0               
 [51] optparse_1.7.3              statip_0.2.3               
 [53] boot_1.3-28.1               gld_2.6.6                  
 [55] estimability_1.4.1          codetools_0.2-19           
 [57] glue_1.6.2                  data.table_1.14.8          
 [59] Rdpack_2.4                  vctrs_0.6.2                
 [61] treeio_1.24.0               cellranger_1.1.0           
 [63] gtable_0.3.3                cachem_1.0.7               
 [65] xfun_0.39                   rbibutils_2.2.13           
 [67] Rfast_2.0.7                 coda_0.19-4                
 [69] pcaPP_2.0-3                 modeest_2.4.0              
 [71] timeDate_4022.108           iterators_1.0.14           
 [73] statmod_1.5.0               gmp_0.7-1                  
 [75] TH.data_1.1-2               ellipsis_0.3.2             
 [77] nlme_3.1-162                phyloseq_1.44.0            
 [79] bit64_4.0.5                 filelock_1.0.2             
 [81] fBasics_4022.94             irlba_2.3.5.1              
 [83] vipor_0.4.5                 rpart_4.1.19               
 [85] colorspace_2.1-0            DBI_1.1.3                  
 [87] Hmisc_5.0-1                 nnet_7.3-18                
 [89] ade4_1.7-22                 Exact_3.2                  
 [91] tidyselect_1.2.0            emmeans_1.8.5              
 [93] timeSeries_4021.105         bit_4.0.5                  
 [95] compiler_4.3.0              graph_1.78.0               
 [97] htmlTable_2.4.1             BiocNeighbors_1.18.0       
 [99] expm_0.999-7                DelayedArray_0.25.0        
[101] plotly_4.10.1               checkmate_2.2.0            
[103] scales_1.2.1                DEoptimR_1.0-12            
[105] spatial_7.3-16              digest_0.6.31              
[107] minqa_1.2.5                 rmarkdown_2.21.3           
[109] base64enc_0.1-3             htmltools_0.5.5            
[111] pkgconfig_2.0.3             lme4_1.1-33                
[113] sparseMatrixStats_1.12.0    lpsymphony_1.28.0          
[115] stabledist_0.7-1            fastmap_1.1.1              
[117] rlang_1.1.0                 htmlwidgets_1.6.2          
[119] DelayedMatrixStats_1.22.0   energy_1.7-11              
[121] zoo_1.8-12                  jsonlite_1.8.4             
[123] BiocParallel_1.34.0         BiocSingular_1.16.0        
[125] RCurl_1.98-1.12             magrittr_2.0.3             
[127] Formula_1.2-5               scuttle_1.10.0             
[129] GenomeInfoDbData_1.2.10     Rhdf5lib_1.22.0            
[131] munsell_0.5.0               Rcpp_1.0.10                
[133] ape_5.7-1                   viridis_0.6.2              
[135] RcppZiggurat_0.1.6          CVXR_1.0-11                
[137] stringi_1.7.12              rootSolve_1.8.2.3          
[139] stable_1.1.6                zlibbioc_1.46.0            
[141] plyr_1.8.8                  parallel_4.3.0             
[143] ggrepel_0.9.3               lmom_2.9                   
[145] splines_4.3.0               hash_2.2.6.2               
[147] multtest_2.56.0             hms_1.1.3                  
[149] igraph_1.4.2                reshape2_1.4.4             
[151] ScaledMatrix_1.7.1          rmutil_1.1.10              
[153] XML_3.99-0.14               evaluate_0.20              
[155] BiocManager_1.30.20         nloptr_2.0.3               
[157] tzdb_0.3.0                  getopt_1.20.3              
[159] clue_0.3-64                 rsvd_1.0.5                 
[161] xtable_1.8-4                Rmpfr_0.9-2                
[163] e1071_1.7-13                tidytree_0.4.2             
[165] viridisLite_0.4.1           class_7.3-21               
[167] gsl_2.1-8                   lmerTest_3.1-3             
[169] memoise_2.0.1               beeswarm_0.4.0             
[171] cluster_2.1.4               timechange_0.2.0   

This should be fixed in the development version. Try installing it with remotes::install_github("biobakery/Maaslin2")

Yes it works. Thank you!

Hello,

I am still having this same issue during the same step as mentioned above (as the program is trying to write the heatmaps to file):

Error in xtfrm.data.frame(x) : cannot xtfrm data frames

R version: 4.3.0
Maaslin2 version: 1.7.3 downloaded from your github recommendation above

Hi there,

Could you post your R session info, along with the command you are trying to run and the command you used to install the updated Maaslin2?

Thanks,
Jacob

Hi Jacob,

The command I used to install:
remotes::install_github("biobakery/Maaslin2")

session info:

sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

time zone: America/Toronto
tzcode source: system (glibc)

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] Maaslin2_1.7.3 EnhancedVolcano_1.18.0
[3] BiocManager_1.30.20 doRNG_1.8.6
[5] rngtools_1.5.2 foreach_1.5.2
[7] mia_1.8.0 MultiAssayExperiment_1.26.0
[9] TreeSummarizedExperiment_2.8.0 Biostrings_2.68.0
[11] XVector_0.40.0 SingleCellExperiment_1.22.0
[13] SummarizedExperiment_1.30.1 Biobase_2.60.0
[15] GenomicRanges_1.52.0 GenomeInfoDb_1.36.0
[17] IRanges_2.34.0 S4Vectors_0.38.1
[19] BiocGenerics_0.46.0 MatrixGenerics_1.12.0
[21] matrixStats_0.63.0 microbiomeMarker_1.6.0
[23] ALDEx2_1.32.0 zCompositions_1.4.0-1
[25] truncnorm_1.0-9 NADA_1.6-1.1
[27] survival_3.5-5 MASS_7.3-59
[29] ANCOMBC_2.2.0 microViz_0.10.8
[31] car_3.1-2 carData_3.0-5
[33] kableExtra_1.3.4 knitr_1.40
[35] DT_0.27 lubridate_1.9.2
[37] forcats_1.0.0 stringr_1.5.0
[39] dplyr_1.1.2 purrr_1.0.1
[41] readr_2.1.4 tidyr_1.3.0
[43] tibble_3.2.1 tidyverse_2.0.0
[45] gridExtra_2.3 ape_5.7-1
[47] reshape2_1.4.4 scales_1.2.1
[49] plyr_1.8.8 ggrepel_0.9.3
[51] microbiome_1.22.0 pairwiseAdonis_0.4.1
[53] cluster_2.1.4 data.table_1.14.8
[55] biomformat_1.28.0 phyloseq_1.44.0
[57] vegan_2.6-4 lattice_0.21-8
[59] permute_0.9-7 ggplot2_3.4.2

loaded via a namespace (and not attached):
[1] gld_2.6.6 nnet_7.3-18
[3] TH.data_1.1-2 vctrs_0.6.2
[5] energy_1.7-11 digest_0.6.29
[7] png_0.1-8 shape_1.4.6
[9] proxy_0.4-27 pcaPP_2.0-3
[11] Exact_3.2 registry_0.5-1
[13] withr_2.5.0 xfun_0.39
[15] ggfun_0.0.9 memoise_2.0.1
[17] commonmark_1.9.0 ggbeeswarm_0.7.2
[19] emmeans_1.8.5 gmp_0.6-6
[21] systemfonts_1.0.4 gtools_3.9.4
[23] tidytree_0.4.2 zoo_1.8-12
[25] GlobalOptions_0.1.2 pbapply_1.7-0
[27] logging_0.10-108 DEoptimR_1.0-13
[29] prettyunits_1.1.1 Formula_1.2-5
[31] httr_1.4.6 hash_2.2.6.2
[33] rhdf5filters_1.12.1 ps_1.7.1
[35] rhdf5_2.44.0 rstudioapi_0.14
[37] generics_0.1.3 processx_3.7.0
[39] base64enc_0.1-3 curl_4.3.3
[41] zlibbioc_1.46.0 ScaledMatrix_1.8.1
[43] ca_0.71.1 RcppZiggurat_0.1.6
[45] GenomeInfoDbData_1.2.10 xtable_1.8-4
[47] ade4_1.7-22 doParallel_1.0.17
[49] evaluate_0.17 S4Arrays_1.0.1
[51] Rfast_2.0.7 hms_1.1.3
[53] glmnet_4.1-7 irlba_2.3.5.1
[55] colorspace_2.1-0 getopt_1.20.3
[57] metagMisc_0.5.0 readxl_1.4.2
[59] magrittr_2.0.3 viridis_0.6.3
[61] ggtree_3.8.0 robustbase_0.95-1
[63] DECIPHER_2.28.0 cplm_0.7-11
[65] scuttle_1.10.1 class_7.3-21
[67] Hmisc_5.1-0 pillar_1.9.0
[69] nlme_3.1-162 iterators_1.0.14
[71] decontam_1.20.0 plotROC_2.3.0
[73] caTools_1.18.2 compiler_4.3.0
[75] beachmat_2.16.0 stringi_1.7.8
[77] TSP_1.2-4 DescTools_0.99.48
[79] minqa_1.2.5 crayon_1.5.2
[81] abind_1.4-5 scater_1.28.0
[83] gridGraphics_0.5-1 ggtext_0.1.2
[85] locfit_1.5-9.7 bit_4.0.4
[87] biglm_0.9-2.1 rootSolve_1.8.2.3
[89] sandwich_3.0-2 codetools_0.2-19
[91] multcomp_1.4-23 BiocSingular_1.16.0
[93] crosstalk_1.2.0 bslib_0.4.0
[95] e1071_1.7-13 lmom_2.9
[97] GetoptLong_1.0.5 multtest_2.56.0
[99] splines_4.3.0 metagenomeSeq_1.42.0
[101] markdown_1.6 circlize_0.4.15
[103] Rcpp_1.0.9 sparseMatrixStats_1.12.0
[105] cellranger_1.1.0 gridtext_0.1.5
[107] blob_1.2.4 utf8_1.2.2
[109] clue_0.3-64 lme4_1.1-33
[111] checkmate_2.2.0 DelayedMatrixStats_1.22.0
[113] Rdpack_2.4 pkgbuild_1.3.1
[115] expm_0.999-7 gsl_2.1-8
[117] ggplotify_0.1.0 estimability_1.4.1
[119] Matrix_1.5-1 statmod_1.5.0
[121] callr_3.7.3 tzdb_0.3.0
[123] svglite_2.1.1 pkgconfig_2.0.3
[125] tools_4.3.0 cachem_1.0.6
[127] tweedie_2.3.5 rbibutils_2.2.13
[129] RSQLite_2.3.1 viridisLite_0.4.2
[131] rvest_1.0.3 DBI_1.1.3
[133] numDeriv_2016.8-1.1 fastmap_1.1.0
[135] rmarkdown_2.17 grid_4.3.0
[137] sass_0.4.2 patchwork_1.1.2
[139] coda_0.19-4 rpart_4.1.19
[141] farver_2.1.1 mgcv_1.8-42
[143] yaml_2.3.5 foreign_0.8-82
[145] cli_3.6.1 webshot_0.5.4
[147] lifecycle_1.0.3 mvtnorm_1.1-3
[149] backports_1.4.1 BiocParallel_1.34.1
[151] timechange_0.2.0 gtable_0.3.3
[153] rjson_0.2.21 limma_3.56.1
[155] CVXR_1.0-10 jsonlite_1.8.4
[157] seriation_1.4.2 bitops_1.0-7
[159] bit64_4.0.5 Rtsne_0.16
[161] yulab.utils_0.0.6 BiocNeighbors_1.18.0
[163] jquerylib_0.1.4 highr_0.9
[165] lazyeval_0.2.2 htmltools_0.5.3
[167] glue_1.6.2 optparse_1.7.3
[169] Wrench_1.18.0 RCurl_1.98-1.12
[171] rprojroot_2.0.3 treeio_1.24.0
[173] boot_1.3-28 igraph_1.4.2
[175] R6_2.5.1 DESeq2_1.40.1
[177] gplots_3.1.3 Rmpfr_0.8-9
[179] labeling_0.4.2 Rhdf5lib_1.22.0
[181] aplot_0.1.10 nloptr_2.0.3
[183] DirichletMultinomial_1.42.0 DelayedArray_0.26.2
[185] tidyselect_1.2.0 vipor_0.4.5
[187] htmlTable_2.4.1 xml2_1.3.3
[189] KernSmooth_2.23-20 rsvd_1.0.5
[191] munsell_0.5.0 htmlwidgets_1.5.4
[193] ComplexHeatmap_2.16.0 RColorBrewer_1.1-3
[195] rlang_1.1.1 remotes_2.4.2
[197] lmerTest_3.1-3 lpsymphony_1.28.0
[199] Cairo_1.6-0 fansi_1.0.3
[201] beeswarm_0.4.0

Command:
fit_data_lm <- Maaslin2(
  df_input_data,
  df_input_metadata,
  output = "/home/ROutput/Maaslin_LM_default",
  min_abundance = 0.0,
  min_prevalence = 0.1,
  min_variance = 0.0,
  normalization = "TSS",
  transform = "LOG",
  analysis_method = "LM",
  max_significance = 0.1,
  random_effects = NULL,
  fixed_effects = c('SampleType'),
  correction = "BH",
  standardize = TRUE,
  cores = 1,
  plot_heatmap = TRUE,
  plot_scatter = TRUE,
  heatmap_first_n = 50,
  reference = c("SampleType,Mouse")
)

I should also note: this was all working perfectly prior to my update to Ubuntu 22.04.2 LTS and R 4.3.0, with my previously downloaded version of Maaslin2, installed using the command:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("Maaslin2")

Thanks!
Julia

Hi Julia,

Thanks for the update. Sorry I didn't get back to you right away. Can you try re-running your install command with:

remotes::install_github(repo="biobakery/Maaslin2", force=TRUE)

Thanks,
Jacob Nearing

Hi Jacob,

I re-ran the install with your command:
remotes::install_github(repo="biobakery/Maaslin2", force=TRUE)

Unfortunately the error persists.

Thanks,
Julia

Hi Julia,

Sorry to hear that didn't fix your issue. Can you try running these three commands:

remove.packages("Maaslin2")
purge("Maaslin2")
remotes::install_github(repo="biobakery/Maaslin2", force=TRUE)

Once this is done, can you then go ahead and restart your R session and computer?

If this still doesn't work, can you then post the results of:

sessioninfo::session_info(pkgs = "Maaslin2", dependencies = FALSE)

Thanks,
Jacob Nearing
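
For anyone following along, a quick way to check which build of a package R actually loads after a reinstall is to read its DESCRIPTION metadata. The sketch below is only an illustration: `describe_install` is a hypothetical helper name, and it assumes the package was installed with remotes, which records the `RemoteRepo` and `RemoteSha` fields. For packages installed from CRAN or Bioconductor those fields are simply missing and come back as NA.

```r
# Hypothetical helper: report version, source repo, commit and library path
# of an installed package, using only base/utils functions.
describe_install <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  # Return a DESCRIPTION field as text, or NA when the field is absent.
  field <- function(name) {
    value <- desc[[name]]
    if (is.null(value)) NA_character_ else as.character(value)
  }
  data.frame(
    package     = pkg,
    version     = as.character(utils::packageVersion(pkg)),
    remote_repo = field("RemoteRepo"),  # written by remotes::install_github()
    remote_sha  = field("RemoteSha"),   # commit the install was built from
    library     = dirname(system.file(package = pkg)),
    stringsAsFactors = FALSE
  )
}

# Demonstrated on a base package; use describe_install("Maaslin2") in practice.
info <- describe_install("stats")
print(info)
```

If the reported version or library path is not the one you expect, an older copy in another library is probably still being picked up.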

Hi Jacob,

Great! The update worked. Thanks for all your help!

Julia

Hi Jacob

I'm having the same issue as above.

> fit_data <- Maaslin2(input_data = as.data.frame(otu_table(cervic_physeq_agg)), 
+                                  input_metadata = as.data.frame(sample_data(cervic_physeq_agg)),
+                                  output = "maaslin2_taxa_output_20231001", 
+                                  fixed_effects = c("cervic_substudy"), min_prevalence = 0.10, 
+                                  normalization = "TSS", transform="LOG", analysis_method = "LM", standardize = FALSE, 
+                                  reference = 'cervic_substudy,`control`', plot_heatmap = TRUE)

2023-10-01 21:03:28.909931 INFO::Writing function arguments to log file
2023-10-01 21:03:28.916066 INFO::Verifying options selected are valid
2023-10-01 21:03:28.917001 INFO::Determining format of input files
2023-10-01 21:03:28.917885 INFO::Input format is data samples as columns and metadata samples as rows
2023-10-01 21:03:28.927558 INFO::Formula for fixed effects: expr ~  cervic_substudy

Error in xtfrm.data.frame(x) : cannot xtfrm data frames

I have followed the advice above but the error still persists. Any ideas how I can overcome the issue? Session info below.

Thanks for your advice,
Erica

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Melbourne
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DT_0.29           microbiome_1.22.0 lubridate_1.9.3   forcats_1.0.0     stringr_1.5.0     dplyr_1.1.3       purrr_1.0.2      
 [8] readr_2.1.4       tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.3     tidyverse_2.0.0   qwraps2_0.5.2     TMB_1.9.6        
[15] Maaslin2_1.15.1   phyloseq_1.44.0  

loaded via a namespace (and not attached):
 [1] DBI_1.1.3               bitops_1.0-7            remotes_2.4.2.1         permute_0.9-7           rlang_1.1.1            
 [6] magrittr_2.0.3          ade4_1.7-22             compiler_4.3.1          mgcv_1.9-0              callr_3.7.3            
[11] vctrs_0.6.3             reshape2_1.4.4          fastmap_1.1.1           pkgconfig_2.0.3         crayon_1.5.2           
[16] XVector_0.40.0          utf8_1.2.3              tzdb_0.4.0              ps_1.7.5                zlibbioc_1.46.0        
[21] GenomeInfoDb_1.36.3     jsonlite_1.8.7          biomformat_1.28.0       rhdf5filters_1.12.1     Rhdf5lib_1.22.1        
[26] parallel_4.3.1          prettyunits_1.2.0       cluster_2.1.4           R6_2.5.1                biglm_0.9-2.1          
[31] stringi_1.7.12          Rcpp_1.0.11             iterators_1.0.14        IRanges_2.34.1          timechange_0.2.0       
[36] Matrix_1.5-4            splines_4.3.1           igraph_1.5.1            tidyselect_1.2.0        rstudioapi_0.15.0      
[41] vegan_2.6-4             codetools_0.2-19        curl_5.0.2              processx_3.8.2          pkgbuild_1.4.2         
[46] lattice_0.21-8          plyr_1.8.8              Biobase_2.60.0          withr_2.5.1             Rtsne_0.16             
[51] desc_1.4.2              survival_3.5-7          getopt_1.20.4           Biostrings_2.68.1       pillar_1.9.0           
[56] BiocManager_1.30.22     foreach_1.5.2           stats4_4.3.1            pcaPP_2.0-3             generics_0.1.3         
[61] rprojroot_2.0.3         RCurl_1.98-1.12         S4Vectors_0.38.2        hms_1.1.3               munsell_0.5.0          
[66] scales_1.2.1            glue_1.6.2              tools_4.3.1             robustbase_0.99-0       data.table_1.14.8      
[71] mvtnorm_1.2-3           rhdf5_2.44.0            grid_4.3.1              optparse_1.7.3          ape_5.7-1              
[76] colorspace_2.1-0        nlme_3.1-163            GenomeInfoDbData_1.2.10 cli_3.6.1               fansi_1.0.4            
[81] gtable_0.3.4            DEoptimR_1.1-2          logging_0.10-108        hash_2.2.6.3            digest_0.6.33          
[86] BiocGenerics_0.46.0     htmlwidgets_1.6.2       htmltools_0.5.6         multtest_2.56.0         lifecycle_1.0.3        
[91] MASS_7.3-60

Hello,

Could you run this command so I can see the exact place Maaslin2 was pulled from during your install?

sessioninfo::session_info(pkgs = "Maaslin2", dependencies = FALSE)

Cheers,
Jacob Nearing

Hi Jacob,

Thanks for getting back to me.

Please see requested output below.

Thanks again for your help.

Best regards
Erica

> sessioninfo::session_info(pkgs = "Maaslin2", dependencies = FALSE)
─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Ventura 13.5.1
 system   x86_64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Australia/Melbourne
 date     2023-10-03
 rstudio  2023.09.0+463 Desert Sunflower (desktop)
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)

─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package  * version date (UTC) lib source
 Maaslin2 * 1.15.1  2023-10-01 [1] Github (biobakery/Maaslin2@550f3d1)

 [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library

Thanks for that!

Can you also post the result of:

print(Maaslin2:::maaslin2_heatmap)

Just trying to track down if this is the same bug as others had encountered

Cheers,
Jacob Nearing
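
A complementary way to see where an error like this is raised is to record the call stack at the moment it is signalled. This is a minimal base-R sketch, not something from the Maaslin2 package: `capture_error_stack`, `inner` and `outer` are made-up names used purely for illustration.

```r
# Hypothetical helper: evaluate an expression and, if it fails, return the
# error message together with the calls that were active when it was raised.
capture_error_stack <- function(expr) {
  calls <- character(0)
  msg <- NULL
  tryCatch(
    withCallingHandlers(
      expr,
      # Runs while the failing call is still on the stack.
      error = function(e) {
        calls <<- vapply(sys.calls(), function(cl) deparse(cl)[1], character(1))
      }
    ),
    # Runs afterwards and stops the error from propagating.
    error = function(e) msg <<- conditionMessage(e)
  )
  list(message = msg, calls = calls)
}

# Toy example standing in for a failing Maaslin2() call.
inner <- function() stop("boom")
outer <- function() inner()
res <- capture_error_stack(outer())
print(res$message)
print(res$calls)
```

Wrapping the real call as `capture_error_stack(Maaslin2(...))` should show which internal function ends up calling `xtfrm()` on a data frame. In an interactive session, calling `traceback()` right after the error gives similar information.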

Thanks! Please see below.
Best regards

Erica

> print(Maaslin2:::maaslin2_heatmap)
function (output_results, title = NA, cell_value = "qval", data_label = "data", 
    metadata_label = "metadata", border_color = "grey93", color = colorRampPalette(c("darkblue", 
        "grey90", "darkred")), col_rotate = 90, first_n = 50) 
{
    df <- read.table(output_results, header = TRUE, sep = "\t", 
        fill = TRUE, comment.char = "", check.names = FALSE)
    title_additional <- ""
    title_additional <- ""
    if (!is.na(first_n) & first_n > 0 & first_n < dim(df)[1]) {
        if (cell_value == "coef") {
            df <- df[order(-abs(df[[cell_value]])), ]
        }
        else {
            df <- df[order(df[[cell_value]]), ]
        }
        df_sub <- df[1:first_n, ]
        for (first_n_index in seq(first_n, dim(df)[1])) {
            if (length(unique(df_sub$feature)) == first_n) {
                break
            }
            df_sub <- df[1:first_n_index, ]
        }
        df <- df[which(df$feature %in% df_sub$feature), ]
        title_additional <- paste("Top", first_n, sep = " ")
    }
    if (dim(df)[1] < 2) {
        print("There are no associations to plot!")
        return(NULL)
    }
    metadata <- df$metadata
    data <- df$feature
    dfvalue <- df$value
    value <- NA
    if (cell_value == "pval") {
        value <- -log(df$pval) * sign(df$coef)
        value <- pmax(-20, pmin(20, value))
        if (is.null(title)) 
            title <- "(-log(pval)*sign(coeff))"
    }
    else if (cell_value == "qval") {
        value <- -log(df$qval) * sign(df$coef)
        value <- pmax(-20, pmin(20, value))
        if (is.null(title)) 
            title <- "(-log(qval)*sign(coeff))"
    }
    else if (cell_value == "coef") {
        value <- df$coef
        if (is.null(title)) 
            title <- "(coeff)"
    }
    if (title_additional != "") {
        title <- paste(title_additional, "features with significant associations", 
            title, sep = " ")
    }
    else {
        title <- paste("Significant associations", title, sep = " ")
    }
    verbose_metadata <- c()
    metadata_multi_level <- c()
    for (i in unique(metadata)) {
        levels <- unique(df$value[df$metadata == i])
        if (length(levels) > 1) {
            metadata_multi_level <- c(metadata_multi_level, i)
            for (j in levels) {
                verbose_metadata <- c(verbose_metadata, paste(i, 
                  j))
            }
        }
        else {
            verbose_metadata <- c(verbose_metadata, i)
        }
    }
    n <- length(unique(data))
    m <- length(unique(verbose_metadata))
    if (n < 2) {
        print(paste("There is not enough features in the associations", 
            "to create a heatmap plot.", "Please review the associations in text output file."))
        return(NULL)
    }
    if (m < 2) {
        print(paste("There is not enough metadata in the associations", 
            "to create a heatmap plot.", "Please review the associations in text output file."))
        return(NULL)
    }
    a = matrix(0, nrow = n, ncol = m)
    a <- as.data.frame(a)
    rownames(a) <- unique(data)
    colnames(a) <- unique(verbose_metadata)
    for (i in seq_len(dim(df)[1])) {
        current_metadata <- metadata[i]
        if (current_metadata %in% metadata_multi_level) {
            current_metadata <- paste(metadata[i], dfvalue[i])
        }
        if (abs(a[as.character(data[i]), as.character(current_metadata)]) > 
            abs(value[i])) 
            next
        a[as.character(data[i]), as.character(current_metadata)] <- value[i]
    }
    max_value <- ceiling(max(a))
    min_value <- ceiling(min(a))
    range_value <- max(c(abs(max_value), abs(min_value)))
    breaks <- seq(-1 * range_value, range_value, by = 1)
    p <- NULL
    tryCatch({
        p <- pheatmap::pheatmap(a, cellwidth = 5, cellheight = 5, 
            main = title, fontsize = 6, kmeans_k = NA, border = TRUE, 
            show_rownames = TRUE, show_colnames = TRUE, scale = "none", 
            cluster_rows = FALSE, cluster_cols = TRUE, clustering_distance_rows = "euclidean", 
            clustering_distance_cols = "euclidean", legend = TRUE, 
            border_color = border_color, color = color(range_value * 
                2), breaks = breaks, treeheight_row = 0, treeheight_col = 0, 
            display_numbers = matrix(ifelse(a > 0, "+", ifelse(a < 
                0, "-", "")), nrow(a)), silent = TRUE)
    }, error = function(err) {
        logging::logerror("Unable to plot heatmap")
        logging::logerror(err)
    })
    return(p)
}
<bytecode: 0x7faf7dcae1c0>
<environment: namespace:Maaslin2>

Hi there @ericap

The good news is that the issue you are facing is not the same bug as others had reported in this thread. The bad news is there seems to be something else going on here...

Looking at your Maaslin2 call is there a reason why you put backticks around the term control? Perhaps coding the variable cervic_substudy as characters (rather than factors) might resolve this issue.

Would it be possible to send us some subset of the data that reproduces this bug so we can figure out where the problematic part of our codebase is in this case?

Cheers,
Jacob Nearing

I had the same error but was able to bypass it by writing input_metadata and input_data to tab-delimited files and passing the file paths as the input parameters in the Maaslin2 call, something like this:

  fit_data = Maaslin2(
    input_data = "C:/Users/xxx/Box/input_Genus.txt",
    input_metadata = "C:/Users/xxx/Box/input_meta.txt",
    output = "Genus_out",
    min_prevalence = 0.2,
    max_significance = 0.05,
    fixed_effects = c("Age", "Sex", "Platform"),
    transform = "NONE",
    normalization = "TMM",
    analysis_method = "NEGBIN")
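
For completeness, the file-writing step of this workaround could look roughly like the sketch below. Everything here is illustrative: `write_maaslin_input` is a hypothetical helper, the toy tables and file names are invented, and the layout (sample IDs in a first column named `ID`, samples as rows) is an assumption about what works as input rather than a documented requirement.

```r
# Hypothetical helper: write a table as tab-delimited text with the row names
# stored in an explicit first column, so they survive the round trip.
write_maaslin_input <- function(x, path, id_col = "ID") {
  out <- data.frame(rownames(x), x, check.names = FALSE, stringsAsFactors = FALSE)
  names(out)[1] <- id_col
  write.table(out, file = path, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(path)
}

# Toy feature table and metadata (made-up values), samples as rows.
features <- data.frame(
  taxon_a = c(0.6, 0.2, 0.5),
  taxon_b = c(0.4, 0.8, 0.5),
  row.names = c("S1", "S2", "S3")
)
metadata <- data.frame(
  Age = c(31, 45, 28),
  Sex = c("F", "M", "F"),
  row.names = c("S1", "S2", "S3")
)

data_path <- write_maaslin_input(features, file.path(tempdir(), "input_features.tsv"))
meta_path <- write_maaslin_input(metadata, file.path(tempdir(), "input_meta.tsv"))

# The two paths can then be passed in place of the in-memory objects, e.g.
# Maaslin2(input_data = data_path, input_metadata = meta_path,
#          output = "out", fixed_effects = c("Age", "Sex"))
```

Going through plain text files drops any special classes or attributes the in-memory objects carried, which may be why this route avoids the error.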

Hi @nearinj.

Apologies for my delay in getting back to you, I got wrapped up in some other work.

The previous code I posted worked fine prior to updating to R 4.3.1, but replacing as.data.frame(otu_table(cervic_physeq_filtered)) with data.frame(otu_table(cervic_physeq_filtered)) fixed the error.

The below works:

fit_data_taxa_output <- Maaslin2(input_data = data.frame(otu_table(cervic_physeq_filtered)), 
                                 input_metadata = data.frame(sample_data(cervic_physeq_filtered)),
                                 output = "maaslin2_taxa_output_0.1_20231103", 
                                 fixed_effects = c("cervic_substudy", "sequence_run"), min_prevalence = 0.10, 
                                 normalization = "TSS", transform="LOG", analysis_method = "LM", standardize = FALSE, 
                                 reference = 'cervic_substudy,control;sequence_run,run1', plot_heatmap = TRUE)

@Keren's solution of writing the data to tab delimited files also worked.

Thanks a lot for your help troubleshooting this.

Erica
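
A small guard along these lines can make the same fix explicit before calling Maaslin2. It is a sketch under an assumption the thread does not confirm, namely that the inputs need to be plain base data frames with no extra classes attached. `is_plain_df` and `as_plain_df` are hypothetical helper names, and the `wrapped` object is only a stand-in for a phyloseq-style object. Note that `check.names = FALSE` keeps column names unchanged, which differs from the `data.frame()` default used above.

```r
# TRUE only when the object's class is exactly "data.frame".
is_plain_df <- function(x) identical(class(x), "data.frame")

# Rebuild the object as a plain base data frame when it is anything else.
as_plain_df <- function(x) {
  if (is_plain_df(x)) {
    return(x)
  }
  data.frame(x, check.names = FALSE, stringsAsFactors = FALSE)
}

# Stand-in for an object that inherits from data.frame but carries an
# additional class (made-up class name "wrapper").
wrapped <- structure(
  data.frame(abundance = c(1, 2), row.names = c("S1", "S2")),
  class = c("wrapper", "data.frame")
)

print(is_plain_df(wrapped))  # the extra class is detected
plain <- as_plain_df(wrapped)
print(class(plain))
```

With phyloseq objects the equivalent would be something like `as_plain_df(sample_data(physeq))`, where `physeq` is a placeholder for your own object.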
