Pangenome generation

I cannot find a guide to pangenome generation for PanPhlAn 3. I’m especially interested in using Roary. Is there a resource available?




Indeed, this functionality (custom pangenomes) was available in PanPhlAn 1.3 using USEARCH clustering. However, it now has some disadvantages: it relies on a rather old version of USEARCH and, on top of that, it clusters sequences into label-free clusters. By that, I mean that clusters will be named “Gene family 1” or “Gene family 2”, whereas using the pangenome database gives you UniRef90 IDs and a mapping file from these IDs to other databases such as GO, KEGG, Pfam, eggNOG…

As a medium-term project, we aim to add functionality for adding custom user-provided genomes to the PanPhlAn-provided pangenome database, by clustering each coding sequence with the known ones and assigning new clusters where needed.
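To make the idea concrete, here is a minimal sketch of that "match against known families, otherwise open a new one" logic. It is only an illustration and not the PanPhlAn implementation: the function names, the `NEW_n` naming, the placeholder family ID, and the 0.9 threshold are all assumptions, and `difflib` string similarity stands in for a real alignment-based identity.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; a stand-in for a real alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_to_families(known, new_seqs, threshold=0.9):
    """Assign each new sequence to the best-matching family, or open a new one.

    known:    dict of family ID -> representative sequence (hypothetical layout)
    new_seqs: dict of sequence name -> sequence
    """
    families = dict(known)  # copy so the caller's dict is not modified
    assignment = {}
    next_new = 1
    for name, seq in new_seqs.items():
        best_id, best_score = None, 0.0
        for fam_id, rep in families.items():
            score = identity(seq, rep)
            if score > best_score:
                best_id, best_score = fam_id, score
        if best_id is not None and best_score >= threshold:
            assignment[name] = best_id
        else:
            # No family is close enough: the sequence founds a new family.
            new_id = f"NEW_{next_new}"
            next_new += 1
            families[new_id] = seq
            assignment[name] = new_id
    return assignment, families


# "UniRef90_A" is a made-up placeholder ID, not a real UniRef90 entry.
known = {"UniRef90_A": "MKTAYIAKQRQISFVKSHFSRQ"}
new_seqs = {
    "g1": "MKTAYIAKQRQISFVKSHFSRQ",
    "g2": "WWWWWWWWWW",
    "g3": "WWWWWWWWWW",
}
assignment, families = assign_to_families(known, new_seqs)
```

Sequences that match a known family keep its existing label (and therefore its annotations), while unmatched sequences get a fresh label that later sequences can also join.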

OK, thanks. I look forward to more details!

Best wishes,


Hello Leonard,

Any update on this? I’ll definitely be looking to use custom pangenomes, and it would be ideal to be able to use the up-to-date PanPhlAn.

Many thanks,


Hello Andrew,

We’ve been working on this part of the code but reorganized things a bit, because the software covers more than PanPhlAn pangenome generation. At the moment the code is ready (but the repository is still private). We still need a bit of time to write minimal documentation. Thanks for reminding us that there is demand for this outside of our lab :grin: We’ll try to make things available next week. I’ll keep you updated.


The repo is ready:
This is an independent tool that requires more dependencies than PanPhlAn and can take quite long to run if you have a lot of genomes; however, it’s worth it :grin: the results are really nice.

With all dependencies installed correctly and the databases downloaded (they are rather big, so be patient with the download), this should do the job.

Let me know if you run into trouble while running it, or if you have questions on the overall workflow.

Best wishes and have fun with the tool

That’s great, thanks for letting us know. I look forward to giving it a go!

Best wishes,


Hi, I tried SegataLab/PanPhlAn_pangenome_exporter from GitHub.
However, it failed.
I also tried to download directly from the link it provides, and it returned a 404 error. (Attached figures)

Without the database files, how can I generate the database myself? I found that HUMAnN 3 has similar DIAMOND indexes, but I could not find UniRef90to50_201906.tsv.bz2 anywhere.



Hi, this should be fixed now. Let me know if something’s still wrong.