HUMAnN and Megan comparison

Dear all,

I’d like to compare outputs from HUMAnN and Diamond/Megan. The two tools share three annotation systems: EC (Enzyme Commission numbers), EggNOG and GO (Gene Ontology). My idea is to rank the features in each output by abundance (e.g. this is a k-dominance plot generated for the HUMAnN EggNOG annotation):

I would then keep only the top ranks, up to a specific cumulative abundance, in the outputs from both tools and perform a regression analysis on them.
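To make the cutoff idea concrete, here is a minimal Python sketch of ranking features by abundance and keeping the top ranks until a cumulative relative abundance is reached. The function name, the 0.9 cutoff and the toy profile are assumptions made up for illustration; they are not part of HUMAnN or Megan.

```python
def top_ranks_by_cumulative_abundance(profile, cutoff=0.9):
    """Rank features by abundance and keep the top ones until their
    cumulative relative abundance reaches `cutoff`.

    `profile` maps feature ID -> abundance (any non-negative unit).
    Returns a list of (feature, relative_abundance, cumulative) tuples.
    The function name and default cutoff are hypothetical.
    """
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile has no abundance to rank")
    # Highest abundance first, i.e. rank 1 is the most abundant feature.
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for feature, value in ranked:
        rel = value / total
        cumulative += rel
        kept.append((feature, rel, cumulative))
        if cumulative >= cutoff:
            break
    return kept


# Toy profile with made-up feature IDs and values.
toy = {"featA": 50, "featB": 30, "featC": 15, "featD": 5}
for feature, rel, cum in top_ranks_by_cumulative_abundance(toy, cutoff=0.9):
    print(feature, round(rel, 3), round(cum, 3))
```

The same function could be applied to each tool's output separately, and the regression then run on the features retained in both.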

Do you have any suggestions regarding this? Do you think this is a good way to do it?

BUT there is also one thing I’m still thinking about: HUMAnN and Megan use different normalization techniques. HUMAnN can use either CPM or relative abundance, and Megan can only use a subsampling approach to normalize to the smallest given count. Am I correct?

So, do you have any idea how to compare them? I was thinking about importing unnormalized data and performing some common normalization in R. OR is it possible to import unnormalized data from Megan into HUMAnN and perform the normalization step there?

Thank you in advance!

I can’t comment on the specifics of how MEGAN functions / how to use it optimally for comparison. I can say that of the three functional systems you listed, ECs tend to be the simplest and most stable for comparison between methods, so I would recommend working with those. For comparing quantitative profiles, you could try something like rank correlation between EC profiles from the two methods, which would hopefully capture agreement independent of normalization method (although it might overemphasize disagreement between rare ECs). For an abundance-weighted comparison, my tendency is to sum-normalize both profiles and compare them using Bray-Curtis score. I know this approach works well for HUMAnN but I’m not sure if it would be a fair comparison for MEGAN.
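As a rough illustration of the two comparisons mentioned above, here is a minimal pure-Python sketch that sum-normalizes two EC profiles, computes Bray-Curtis dissimilarity between them, and computes a Spearman rank correlation. All function names and the toy values are hypothetical. Treating an EC that is missing from one profile as zero abundance is an assumption of this sketch, not something either tool prescribes.

```python
def sum_normalize(profile):
    """Scale a {feature: abundance} profile so its values sum to 1."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile has no abundance to normalize")
    return {k: v / total for k, v in profile.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity: 0 = identical, 1 = no shared abundance.
    Features absent from one profile count as 0 (assumption)."""
    keys = set(p) | set(q)
    num = sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
    den = sum(p.get(k, 0.0) + q.get(k, 0.0) for k in keys)
    return num / den


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(p, q):
    """Spearman correlation: Pearson correlation of the ranks, computed
    over the union of features. Returns NaN if either profile is constant."""
    keys = sorted(set(p) | set(q))
    rx = average_ranks([p.get(k, 0.0) for k in keys])
    ry = average_ranks([q.get(k, 0.0) for k in keys])
    n = len(keys)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


# Toy EC profiles in different units (made-up numbers).
humann_ec = {"1.1.1.1": 100.0, "2.7.7.7": 50.0, "3.1.3.1": 10.0}
megan_ec = {"1.1.1.1": 2000.0, "2.7.7.7": 1000.0, "3.1.3.1": 200.0}

print(spearman(humann_ec, megan_ec))
print(bray_curtis(sum_normalize(humann_ec), sum_normalize(megan_ec)))
```

Spearman depends only on the ordering of the ECs, so it is unaffected by any normalization that rescales a whole profile, while Bray-Curtis is dominated by the abundant ECs.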