Hi - new HUMAnN3 user here. Thank you for making such good documentation. I read through the Nat Methods and eLife papers first to get a better fundamental understanding of how things work.
Very sorry if I’m not understanding correctly, but isn’t ChocoPhlAn actually a database of panexomes, not pangenomes, since all the sequences are only the CDS portions of genomes?
I’m not familiar with panexome as a term, though I see what you are getting at. The formal definition of a pangenome focuses on gene content:
That said, our methods do focus on the protein-coding portion of the pangenome (putting less focus on e.g. ribosomal and other RNA genes).
Thank you Eric, that makes sense, and yes, pangenome is appropriate and the convention. Appreciated.