HUMAnN Is Ignoring Viruses

I can chime in on this one - there are a few things going on:

  1. We’re in the process of re-evaluating methods for viral profiling, given that viruses don’t conform to the same principles that MetaPhlAn uses for profiling cellular microbes (e.g. averaging signals from hundreds of marker genes). An approximate method (inherited from MetaPhlAn 2) was retained in MetaPhlAn 3 as an “expert mode.”

  2. Partly as a consequence of point 1, and partly due to the general “weirdness” of defining a pangenome for viral species (which don’t have quite the same flexible “bag of genes” biology as cellular microbes), we opted not to include approximated pangenomes for them in HUMAnN 3.

There are a couple of workarounds:

  1. If MetaPhlAn 3 detects a virus, you can treat the entire viral genome (including all of its proteins) as having been detected.

  2. Since HUMAnN 3 will map viral reads to proteins during the translated search phase, you could use the infer_taxonomy script to work out which unclassified UniRef90s are likely viral in origin.
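
For the first workaround, here is a rough sketch of how you might pull the detected viral species out of a MetaPhlAn profile. It is not part of MetaPhlAn or HUMAnN. It assumes a tab-separated profile where the clade name is in the first column, viral clades start with k__Viruses, and the relative abundance is in the third column (adjust abundance_col if your profile is laid out differently). The viral_species function and the toy profile are made up for illustration.

```python
def viral_species(profile_text, abundance_col=2, min_abundance=0.0):
    """Return {clade_name: relative_abundance} for species-level viral rows.

    Assumes a tab-separated MetaPhlAn-style profile: '#' lines are headers,
    column 0 is the clade name, and abundance_col holds the abundance.
    """
    hits = {}
    for line in profile_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        clade = fields[0]
        if not clade.startswith("k__Viruses"):
            continue  # not a viral clade
        if not clade.split("|")[-1].startswith("s__"):
            continue  # keep species-level rows only
        abundance = float(fields[abundance_col])
        if abundance >= min_abundance:
            hits[clade] = abundance
    return hits


# Toy profile with invented clades and abundances, for illustration only.
TOY_PROFILE = "\n".join([
    "# toy header line",
    "#clade_name\tNCBI_tax_id\trelative_abundance\tadditional_species",
    "k__Bacteria\t2\t90.0\t",
    "k__Bacteria|s__Toy_bacterium\t2|0\t90.0\t",
    "k__Viruses\t10239\t10.0\t",
    "k__Viruses|s__Toy_virus_A\t10239|0\t10.0\t",
])

if __name__ == "__main__":
    for clade, abundance in viral_species(TOY_PROFILE).items():
        print(clade, abundance)
```

Each species this returns can then be treated as "whole genome detected", e.g. by looking up its reference proteins yourself.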

Hope this helps!