Hi! I’m trying to simulate microbial data with my own specified correlation matrix. I was thinking that I could simulate data using the “Stool” template, fit the data, overwrite the appropriate matrix, and then use the fitted data to resimulate some data, now with my specified correlation matrix. My first question is, will this work or is there an easier way to do what I’m hoping to accomplish? If so, what’s the appropriate matrix I need to overwrite? From what I understand, the correlation between microbes is controlled by Omega. However, Omega isn’t bound between 1 and 1 so I’m guessing it’s a covariance matrix. My guess would be that Corr_star is the one I’m looking for. But what’s the difference between sigma, Sigma, Omega, and Corr_star?
Hi  thanks for the interests!

You shouldn’t need to fit the data again. Just access the
Stool
fit object, which is stored internally in SparseDOSSA. You can do this with e.g.Stool_mod < SparseDOSSA2:::Stool
. 
You can then simply modify with your own correlation matrix. The one you’re looking for is
Sigma
(and correspondingly,Omega
).Sigma
is the correlation matrix andOmega
is its inverse (i.e. the precision matrix).sigma
is perfeature standard error.Corr_star
is something SparseDOSSA uses internally when estimatingSigma
andOmega
. So you’d do something like
a.Stool_mod$EM_fit$fit$Sigma < My_own_corr
b.Stool_mod$EM_fit$fit$Omega < solve(My_own_corr)

You can then simulate data with
SparseDOSSA2(template = Stool_mod, new_features = FALSE, ...)
. Thenew_features = FALSE
setting is important to ensure the new correlation matrix is propagated.
Thank you so much for the quick and clear response! That makes a lot of sense now.
As for #3, is there a way to specify the number of features without specifying new_features = TRUE? It seems the default is 332 if new_features = FALSE but I’d like 1,000 features instead.
This can be done but needs to call SparseDOSSA’s internal functions. To generate “new” features that “look like” the true 332 stool features, SparseDOSSA fits a threedimensional density estimator on the pi0, mu, and sigma values of each feature. This fit is stored internally in the F_fit
component.
To use this for your case, I’d simulate 1,000 features, and use their pi0, mu, and sigma to replace those currently stored in Stool
, just as the replacement for Sigma
and Omega
above.
 First, simulate 1,000 new futures:
Stool <
Stool_mod <
SparseDOSSA2:::Stool
feature_param_template < cbind("pi0" = Stool$EM_fit$fit$pi0,
"mu" = Stool$EM_fit$fit$mu,
"sigma" = Stool$EM_fit$fit$sigma)
feature_param <
SparseDOSSA2:::generate_featureParam(
F_fit = Stool$F_fit,
param_original = feature_param_template,
n_feature = 1000)
 Use the simulated
pi0
,mu
,sigma
to replace those stored inStool_mod
:
Stool_mod$EM_fit$fit$pi0 < feature_param[, "pi0"]
Stool_mod$EM_fit$fit$mu < feature_param[, "mu"]
Stool_mod$EM_fit$fit$sigma < feature_param[, "sigma"]
Stool_mod$EM_fit$fit$Sigma < My_own_corr # note the dimensionality must agree here
Stool_mod$EM_fit$fit$Omega < solve(My_own_corr)
Stool_mod
can then be used for simulation just as above. I’ll consider adding an exported function for this in the next update.
Thank you! I was able to get it running very easily. Much appreciated!