Pangenome generation

AndrewM · July 13, 2020, 1:49pm

I cannot find a guide to pangenome generation for PanPhlAn 3. I’m especially interested in using Roary. Is there a resource available?

Thanks,

Andrew

leonard.dubois · July 13, 2020, 2:58pm

Hello,

Indeed, this functionality (custom pangenomes) was available in PanPhlAn 1.3 using USEARCH clustering. However, it now has some disadvantages: it relies on a rather old version of USEARCH and, on top of that, it clusters sequences into label-free clusters. By that, I mean that clusters will be named “Gene family 1” or “Gene family 2”, whereas using the pangenome database gives you UniRef90 IDs and a mapping file from these IDs to other databases such as GO, KEGG, Pfam, eggNOG…

As a medium-term project, we aim to add functionality for adding custom user-provided genomes to the PanPhlAn-provided pangenome database, by clustering each coding sequence with the known ones and assigning new ones if needed.
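To make the idea of "cluster each coding sequence with the known ones and assign new ones if needed" concrete, here is a toy sketch. It is not the PanPhlAn implementation: the function names, the `NEW_` labels, the 0.9 threshold, and the use of `difflib` string similarity in place of a real sequence aligner are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude similarity in [0, 1]; a stand-in for a real alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_families(known, new_seqs, threshold=0.9):
    """Assign each new sequence to the best-matching family, or open a new one.

    known:    dict mapping family ID -> representative sequence
    new_seqs: dict mapping sequence ID -> sequence
    Returns (assignment, families), where families includes any new entries.
    """
    families = dict(known)  # copy so the caller's dict is not modified
    assignment = {}
    next_new = 1
    for seq_id, seq in new_seqs.items():
        best_id, best_score = None, 0.0
        for fam_id, representative in families.items():
            score = identity(seq, representative)
            if score > best_score:
                best_id, best_score = fam_id, score
        if best_id is not None and best_score >= threshold:
            assignment[seq_id] = best_id
        else:
            # No known family is close enough: create a hypothetical new label.
            fam_id = f"NEW_{next_new}"
            next_new += 1
            families[fam_id] = seq
            assignment[seq_id] = fam_id
    return assignment, families


# Made-up example data: one known family and three user-provided sequences.
known = {"UniRef90_A": "MKTAYIAKQRQISFVKSHFSRQ"}
new_seqs = {
    "g1": "MKTAYIAKQRQISFVKSHFSRQ",
    "g2": "GGGGGGGGGGPPPPPPPPPP",
    "g3": "GGGGGGGGGGPPPPPPPPPP",
}
assignment, families = assign_families(known, new_seqs)
print(assignment)
```

Sequences that match a known family keep its labelled ID (and therefore its annotations), while unmatched sequences fall back to unlabelled families, which is the trade-off described above.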

AndrewM · July 15, 2020, 8:37am

OK, thanks. I look forward to more details!

Best wishes,

Andrew

AndrewM · October 2, 2020, 11:24am

Hello Leonard,

Any update on this? I’ll definitely be seeking to use custom pangenomes, and it would be ideal to be able to use the up-to-date PanPhlAn.

Many thanks,

Andrew

leonard.dubois · October 2, 2020, 2:18pm

Hello Andrew,

We’ve been working on this part of the code but reorganized things a bit because the software covers more than PanPhlAn pangenome generation. At the moment the code is ready (but the repository is still private). We still need a bit of time to write minimal documentation. Thanks for reminding us that there is demand for this outside of our lab. We’ll try to make things available next week. I’ll keep you updated.

leonard.dubois · October 9, 2020, 9:35am

Hello,

The repo is ready: https://github.com/SegataLab/PanPhlAn_pangenome_exporter
This is an independent tool that requires more dependencies than PanPhlAn and can take quite long to run if you have a lot of genomes. However, it’s worth it: the results are really nice.

With all dependencies installed correctly and the databases downloaded (they are rather big, so be patient with the download), this should do the job.

Let me know if you run into trouble while running it, or if you have questions about the overall workflow.

Best wishes and have fun with the tool

AndrewM · October 10, 2020, 4:50pm

That’s great, thanks for letting us know. I look forward to giving it a go!

Best wishes,

Andrew

yuxiangtan · April 11, 2022, 9:29am

Hi, I tried SegataLab/PanPhlAn_pangenome_exporter from GitHub.
However, download_databases.py failed.
I then tried to download directly from the link it uses, and it returned a 404 error. (Attached figures)

Without the database files, how can I generate the database myself? I found that HUMAnN 3 has similar DIAMOND indexes, but UniRef90to50_201906.tsv.bz2 was not found anywhere.

Best,

Yuxiang

leonard.dubois · April 12, 2022, 8:22am

Hi, it should be fixed. Let me know if something’s still wrong.

Minuzzi · January 17, 2024, 4:29pm

Hello Leonard,
Thank you for this script. I am trying to use it following the instructions on the GitHub page, but I’m getting the following error message:

Wed Jan 17 17:19:12 2024 Writing PanPhlAn tsv...Traceback (most recent call last):
  File "PanPhlAn_pangenome_exporter/panphlan_exporter.py", line 521, in <module>
    panphlan_exporter(args.input, args.tmp, args.output, args.clade_name, args.nprocs, args.db_path)
  File "PanPhlAn_pangenome_exporter/panphlan_exporter.py", line 502, in panphlan_exporter
    write_panphlan_tsv(inputdir, tmp_dir, ppa_outdir, clade_name, contigs_names_dict, contigs_names_dict_prokka, extend_pangenome)
TypeError: __init__() got an unexpected keyword argument 'strand'

How can I fix that? Thank you.
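An error of the form "got an unexpected keyword argument" usually means that the class being constructed does not accept that keyword in the installed version of whatever library defines it. The thread does not confirm which library is involved here. One plausible candidate, stated as an assumption, is Biopython, since as far as I know recent releases no longer accept `strand` in the `SeqFeature` constructor. The sketch below uses only the standard library to check whether a constructor accepts a given keyword; `OldFeature` and `NewFeature` are hypothetical stand-ins, not real library classes.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with the keyword `name`."""
    params = inspect.signature(callable_obj).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer version of a library class.
class OldFeature:
    def __init__(self, location=None, type="", strand=None):
        self.location, self.type, self.strand = location, type, strand


class NewFeature:
    def __init__(self, location=None, type=""):
        self.location, self.type = location, type


print(accepts_keyword(OldFeature, "strand"))
print(accepts_keyword(NewFeature, "strand"))
```

Running the same check against the real class in your environment (for example `accepts_keyword(SeqFeature, "strand")` after `from Bio.SeqFeature import SeqFeature`, if Biopython is indeed the dependency involved) would show whether the installed version is compatible with what the script expects. If it is not, installing an older version of that dependency in a separate environment is one thing to try.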
