Filtering of gene family and metabolite abundance files

Hello! I have successfully obtained the gene family abundance file, which contains 5346 features.
Is it necessary to filter out low-abundance and low-prevalence features?
Also, I have log-transformed the metabolite peak areas file as mentioned in the thread.
What prevalence and abundance cutoffs should we use on this log-transformed metabolite table for MelonnPan training?

Many thanks

Hi @paramartha_banerjee, filtering out low-abundance and low-prevalence features is recommended to reduce the computational burden. You can also apply variance filtering to both the metabolites and the gene families for the same purpose. The cutoffs are subjective and problem-specific, but something like the nearZeroVar function in the caret R package is a good start.
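
To make the idea concrete, here is a minimal sketch of prevalence, abundance, and variance filtering in plain Python. The function name, the feature names, and all three cutoffs are hypothetical placeholders rather than recommended values, and the variance step is a simple threshold, not a reimplementation of caret's nearZeroVar. For a log-transformed metabolite table, the abundance cutoff would need to be expressed on the same log scale.

```python
from statistics import pvariance


def filter_features(table, min_abundance, min_prevalence, min_variance=0.0):
    """Keep features that are prevalent enough and not (near-)constant.

    table: dict mapping feature name -> list of per-sample values.
    All cutoffs are illustrative assumptions; tune them for your data.
    """
    kept = {}
    for feature, values in table.items():
        # Prevalence: fraction of samples where the feature exceeds the abundance cutoff.
        prevalence = sum(v > min_abundance for v in values) / len(values)
        if prevalence < min_prevalence:
            continue
        # Variance filter: drop features whose values barely change across samples.
        if pvariance(values) <= min_variance:
            continue
        kept[feature] = values
    return kept


# Hypothetical toy table: 3 features measured in 4 samples.
table = {
    "feature_A": [0.01, 0.02, 0.0, 0.03],   # prevalent and variable
    "feature_B": [0.0, 0.0, 0.0, 0.002],    # present in only 1 of 4 samples
    "feature_C": [0.01, 0.01, 0.01, 0.01],  # constant across samples
}

kept = filter_features(table, min_abundance=1e-4, min_prevalence=0.5)
print(sorted(kept))  # ['feature_A']
```

Applying the same filter separately to the gene family table and the metabolite table, each with its own cutoffs, keeps the two scales from being mixed.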

Hope this helps!