Hello! I trained a MelonnPan model on my own paired data and then ran the MelonnPan-Predict workflow, which produced the following file containing the relative abundances of the predicted metabolites.
Rela_ab_metab_melonnpan_predict.csv (42.2 KB)
However, for some metabolites the predicted relative abundance is identical across different samples, and some metabolites have relative abundance values of exactly 0 or 1.
What might be the reason for this result?
(I trained the model using both TSS-normalized and log-transformed metabolite abundances, and I get the same results in both cases.)
Hi Paramartha – thanks for sending the input files offline.
I was able to run the training module and verified that some of the predictions are indeed constant when the corresponding model size is 1, which is expected when none of the gene families are predictive (i.e., the corresponding weights are zero). You can always discard those compounds with model size <2 (as done in the attached code). This will help you tease out valid predictions where at least one metagenomic feature is predictive.
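As a rough, after-the-fact check for the constant predictions described above, one could scan the predictions table and drop any metabolite whose value does not vary across samples. The sketch below is plain Python and not part of MelonnPan; the column name `ID`, the metabolite names, and the numbers are made up for illustration, and this is a different (post hoc) filter from the model-size filter in the attached code.

```python
import csv
import io

def drop_constant_columns(rows, id_column="ID", tol=1e-12):
    """Return (kept_columns, filtered_rows), dropping columns whose values do not vary."""
    columns = [c for c in rows[0] if c != id_column]
    kept = []
    for col in columns:
        values = [float(r[col]) for r in rows]
        # A column is "constant" if its range across samples is (numerically) zero.
        if max(values) - min(values) > tol:
            kept.append(col)
    filtered = [{id_column: r[id_column], **{c: r[c] for c in kept}} for r in rows]
    return kept, filtered

# Toy predictions table (hypothetical layout and values, for illustration only).
toy_csv = """ID,metab_A,metab_B,metab_C
S1,0.10,0.25,0.40
S2,0.10,0.31,0.35
S3,0.10,0.22,0.52
"""
rows = list(csv.DictReader(io.StringIO(toy_csv)))
kept, filtered = drop_constant_columns(rows)
print(kept)  # ['metab_B', 'metab_C']
```

Here `metab_A` is dropped because it is 0.10 in every sample, which is the pattern one would expect from a model with no predictive gene families.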
I did not see any predictions of exactly 0 or 1 during training. Is this something you only saw using the MelonnPan-Predict workflow?
One piece of advice: looking at your input files, I would suggest converting the complex GO terms to something simpler. Various functions in R convert non-standard characters in different ways, and that might lead to a mismatch in names while running MelonnPan. For example, "GO:0000014: [MF] single-stranded DNA endodeoxyribonuclease activity" can be simply "GO:0000014" or "GO_0000014". This is just an example. The idea is to keep the feature names as simple as possible so that they remain the same throughout the process.
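The renaming suggested above could be done before the table is ever read into R. The following is a minimal sketch in Python, assuming each feature name contains a standard seven-digit GO identifier; the helper name `simplify_go_name` is hypothetical and not part of MelonnPan.

```python
import re

# GO identifiers consist of the prefix "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:(\d{7})")

def simplify_go_name(name, sep="_"):
    """Reduce a verbose GO feature name to its bare identifier, e.g. GO_0000014."""
    match = GO_ID.search(name)
    if match is None:
        # Leave names without a recognizable GO identifier untouched.
        return name
    return f"GO{sep}{match.group(1)}"

verbose = "GO:0000014: [MF] single-stranded DNA endodeoxyribonuclease activity"
print(simplify_go_name(verbose))  # GO_0000014
```

Using an underscore rather than a colon avoids a character that some name-cleaning routines rewrite. The same renaming would need to be applied to both the training and the prediction inputs so the feature names still match.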
Would you please let me know if you see the same behavior after you re-train your model? Make sure you re-install the latest version of the software from GitHub.
Many thanks,
Himel
Test_MelonnPan.txt (2.3 KB)
Hi @paramartha_banerjee - following up on your previous message, this binary (0/1) phenomenon is related to the back-transformation that is required when the AST transformation is applied to the metabolites during training.
Consequently, the AST setting must match between the two steps: if AST is turned off during training, it should be turned off during prediction as well, and vice versa. This prevents an unnecessary back-transformation when none is needed.
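To illustrate why the two settings need to agree, here is a small numerical sketch. It assumes AST refers to the arcsine square-root transformation, asin(sqrt(x)), whose inverse is sin(y)^2; the function names are hypothetical and this is not MelonnPan's internal code.

```python
import math

def ast(x):
    """Arcsine square-root transform of a proportion in [0, 1] (assumed meaning of AST)."""
    return math.asin(math.sqrt(x))

def inverse_ast(y):
    """Back-transformation: maps an AST-scale value back to a proportion."""
    return math.sin(y) ** 2

p = 0.2  # a toy relative abundance

# Matched settings: transform, then back-transform, recovers the original value.
round_trip = inverse_ast(ast(p))

# Mismatched settings: back-transforming a value that was never AST-transformed
# distorts it instead of recovering it.
mismatched = inverse_ast(p)

print(round(round_trip, 6))  # 0.2
print(mismatched)
```

The mismatched value is far from the original 0.2, which shows how a back-transformation applied to a model trained without AST can produce misleading predictions.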
I just added this functionality so that this can be turned on and off using a simple flag. Please let me know if you still see the issue after you re-train your model using the updated code. Again, make sure you re-install the latest version of the software from GitHub.
Many thanks,
Himel