feature_abd is proportions
Found 8 batches
Adjusting for 1 covariate(s) or covariate(s) level(s)
Pseudo count is not specified and set to half of minimal non-zero value: 9.81e-07
Adjusting for (after filtering) 4663 features
Standardizing data across features
Estimating batch difference parameters and EB priors
Performing shrinkage adjustments on batch difference parameters
Performing batch corrections
Error in if (any(features < 0)) stop("Feature table must be non-negative for normalization!") :
missing value where TRUE/FALSE needed
3: normalize_features(feature_abd_adj, "TSS")
2: diagnostic_adjust_batch(feature_abd = feature_abd, feature_abd_adj = feature_abd_adj,
var_batch = var_batch, gamma_hat = params_fit$gamma_hat,
gamma_star = params_shrinked$gamma_star, output = control$diagnostic_plot)
1: adjust_batch(feature_abd = abundance_matrix, batch = "study",
covariates = "age", data = metadata_df, verbose = T))
I checked the input for negative values with any(abundance_matrix < 0) and found none.
I’m not sure where the negative values would be introduced. I haven’t seen this error with any other dataset.
I haven’t been able to dig into your example data to diagnose this fully yet, but one quick thing to point out: the error is not coming from negative values. It’s coming from any(features < 0) not evaluating to TRUE or FALSE. I presume it’s evaluating to NA, probably because features contains NA. Maybe check for that?
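To illustrate the point above, here is a minimal sketch on a toy matrix (hypothetical data, not the poster's table). It shows how any() returns NA when the table holds a missing value and no negative entry, how if (NA) then fails, and two base R checks that sidestep the ambiguity.

```r
# Toy feature table with one missing entry (hypothetical data).
features <- matrix(c(0.2, 0.3, NA, 0.5), nrow = 2)

# No entry is known to be negative, so any() cannot decide and returns NA.
neg_check <- any(features < 0)
print(neg_check)

# Passing NA to if() raises an error, which is captured here as a string.
err <- tryCatch(
  if (neg_check) stop("Feature table must be non-negative for normalization!"),
  error = function(e) conditionMessage(e)
)
print(err)

# More direct checks: test for missing values explicitly,
# or ignore them when looking for negatives.
has_na <- anyNA(features)
neg_ignoring_na <- any(features < 0, na.rm = TRUE)
print(c(has_na = has_na, neg_ignoring_na = neg_ignoring_na))
```

Running anyNA() on both the input and the adjusted table would show whether the missing value was already present or was introduced during adjustment.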
Okay. I don’t think I’ll be able to step through this in the near future given the size of your data, but the other thing to point out is that the error occurs in normalize_features() when it is called on the adjusted data, feature_abd_adj. That object is produced by the internal function back_transform_abd(). The extreme sparsity of your data may be causing overflow or underflow in back_transform_abd().
You may want to debug(adjust_batch) and step through the program and see what feature_abd_adj looks like right before the call to diagnostic_adjust_batch().
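Once the debugger is stopped at that point, a small helper like the one below could be applied to feature_abd_adj to list any problematic cells. This is only a sketch: find_non_finite is a hypothetical name, not part of MMUPHin, and the matrix here is toy data.

```r
# Hypothetical helper: report row, column and value of every
# non-finite entry (NA, NaN, Inf, -Inf) in a numeric matrix.
find_non_finite <- function(mat) {
  idx <- which(!is.finite(mat), arr.ind = TRUE)
  data.frame(
    row = unname(idx[, "row"]),
    col = unname(idx[, "col"]),
    value = mat[idx],
    row.names = NULL
  )
}

# Toy matrix standing in for feature_abd_adj.
toy <- matrix(c(0.1, 0.2, NaN, 0.4, Inf, 0.6), nrow = 2)
bad <- find_non_finite(toy)
print(bad)
```

The row and column indices can then be used to look up the same cell in the original abundance matrix and in the intermediate adjusted data.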
There’s also a chance the issue is coming from the lack of rownames (taxa identifiers) in your abundance matrix. Try to match the format of your data with that shown in the tutorial.
I checked again and found exactly one NA in feature_abd_adj. It sits at the position of the maximum value of adj_data (1419.805), which overflows to Inf in the first line of back_transform_abd(), adj_data <- 2^adj_data. The original abundance at this position is 0.0002204464, which is not the smallest abundance, so I don’t fully understand how such a large value arises.
The lack of rownames is not the issue here; I added rownames before running the function.
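The overflow itself is easy to reproduce in base R. The sketch below uses the value reported above. The final division is an assumption about how an Inf could later turn into a missing value (for example, dividing by a total that is also Inf); it is not taken from the MMUPHin source.

```r
# Doubles cannot represent 2^x once x reaches this exponent.
max_exp <- .Machine$double.max.exp
print(max_exp)

adj_value <- 1419.805   # maximum of adj_data reported in the thread
back <- 2^adj_value     # exceeds the double range and becomes Inf
print(back)

# Assumption: a later step divides the value by a sum that is also Inf.
# Inf / Inf is NaN, and is.na() treats NaN as missing.
ratio <- back / sum(c(back, 0.5))
print(is.na(ratio))
```

Any adjusted value on the log2 scale at or above the printed exponent limit would overflow in the same way, so comparing max(adj_data) against it is a quick sanity check.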