Constructing custom databases de novo

Hi Guys,

I’m currently working on human gut metagenomic data and want to build custom MetaPhlAn and HUMAnN databases from public data such as the Unified Human Gastrointestinal Genome (UHGG) collection.

I have seen the docs on GitHub and many valuable discussions such as customizing-chochophlan-panproteome-and-metaphlan-marker-gene-databases-with-new-taxa/2814 and github-issue-103, but most of them start from the existing databases. I was wondering whether there is any tutorial or suggestion for creating custom databases de novo.

Besides, it’s hard to keep these databases “up-to-date” given the rapid increase in available bacterial genomes, and some researchers may be interested only in a specific environment (such as the human or mouse gut). Tools for constructing MetaPhlAn and HUMAnN databases de novo would therefore be very useful.

Best regards,
Ming

Hi @songmingl
Unfortunately, there is no code or tutorial available for generating a custom MetaPhlAn database. However, the method is well described in the latest manuscript, in case you want to try implementing it: Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3 | eLife

Thanks for your reply! I will try this later.