I am trying to run humann3 (v. 3.7) on a collection of 537 good-quality (>50% complete, <5% redundant) MAGs assembled from human gut samples. The MAGs were assembled by metabat3. However, the results look very odd. 84% of the MAGs (451) return completely empty results:
Pathway 171007M.6_Abundance
UNMAPPED 0.0000000000
UNINTEGRATED 0.0000000000
The few that return any results at all (86) have very few lines:
Pathway 171003M.19_Abundance
UNMAPPED 0.0000000000
UNINTEGRATED 38.2009166740
UNINTEGRATED|unclassified 38.2009166740
PWY-7238: sucrose biosynthesis II 0.7363770250
PWY-7238: sucrose biosynthesis II|unclassified 0.7363770250
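To tally outputs like these across all MAGs, each pathway-abundance table could be classified with a short script. This is only a minimal sketch and not part of humann: the function name `classify_pathabundance` is hypothetical, and it assumes each table has a header line followed by rows whose last whitespace-separated field is the abundance.

```python
def classify_pathabundance(text):
    """Return 'empty' if no row has a nonzero abundance, else 'has_results'."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace: feature name, then abundance.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        try:
            rows.append((name, float(value)))
        except ValueError:
            # Header line: its last field is the sample column name, so skip it.
            continue
    return "has_results" if any(v > 0 for _, v in rows) else "empty"


# The two examples from this post (the second one abbreviated).
empty_example = """Pathway 171007M.6_Abundance
UNMAPPED 0.0000000000
UNINTEGRATED 0.0000000000
"""
sparse_example = """Pathway 171003M.19_Abundance
UNMAPPED 0.0000000000
UNINTEGRATED 38.2009166740
PWY-7238: sucrose biosynthesis II 0.7363770250
"""

print(classify_pathabundance(empty_example))
print(classify_pathabundance(sparse_example))
```

Reading each output file and passing its contents to this function would give the counts of empty and non-empty MAGs.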
The command I am running for each MAG file (FASTA) is:
humann --input MAG_name.fa --output OUT_DIR
The Slurm output file contains many warnings like this:
Total species selected from prescreen: 0
Selected species explain 0.00% of predicted community composition
No species were selected from the prescreen.
Because of this the custom ChocoPhlAn database is empty.
This will result in zero species-specific gene families and pathways.
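To check which MAGs hit this warning, the species count could be extracted from each Slurm log. This is a sketch that assumes the log contains a line of exactly the form shown above; the helper name `prescreen_species_count` is hypothetical and not part of humann.

```python
import re

# Matches the prescreen summary line as it appears in the log excerpt above.
PRESCREEN_RE = re.compile(r"Total species selected from prescreen:\s*(\d+)")


def prescreen_species_count(log_text):
    """Return the number of species selected by the prescreen, or None if absent."""
    match = PRESCREEN_RE.search(log_text)
    return int(match.group(1)) if match else None


log_example = """Total species selected from prescreen: 0
Selected species explain 0.00% of predicted community composition
"""

print(prescreen_species_count(log_example))
```

A result of 0 for a given MAG would correspond to the empty custom ChocoPhlAn database described in the warning.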
I believe I have downloaded and am using the full ChocoPhlAn and UniRef databases.
What am I doing wrong here? Does humann have to be run at the read level?