QUESTION: StrainPhlAn working with a genome?

Good Morning,
I hope this is the right place for posting my query.

I am trying to find core sequences that are unique to a species, to be used later for designing qPCR primers.
Can I use StrainPhlAn with ~10 genomes of a bacterial species (for example, Alistipes finegoldii) as FASTA input and then get the unique species marker genes from them?
I see from the tutorial that StrainPhlAn is made to run on metagenomic sequences, but I was wondering whether it can also work with a set of genomes instead.

Thanks a lot for your help.

Cheers
Lorenzo

Hi Lorenzo,
If you want to get a set of unique species marker genes, you can retrieve it directly using the extract_markers.py script.
You can run it in the following way:

extract_markers.py -c *species_name* -o *output_folder*

E.g:

extract_markers.py -c s__Alistipes_finegoldii -o clade_markers
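
Once the script has finished, it can be useful to check how many markers were extracted and how long they are before moving on to primer design. Below is a minimal Python sketch for that, under the assumption that the output folder contains a plain FASTA file with one record per marker (check the folder for the actual file name). The function names `read_fasta` and `summarize_markers` are made up for this example and are not part of MetaPhlAn or StrainPhlAn.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    # Emit the last record in the file.
    if header is not None:
        yield header, "".join(chunks)


def summarize_markers(path):
    """Return the number of marker records and their min/max lengths."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    return {
        "n_markers": len(lengths),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
    }

# Hypothetical usage; replace the path with the file found in your output folder:
# print(summarize_markers("clade_markers/<your_marker_file>.fna"))
```

An empty file gives `n_markers` equal to 0, which is a quick way to spot a species for which no markers were extracted.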

Thank you, your help was very useful, but I have another question, if I may.
My version is CHOCOPhlAn_201901. What should I do if I cannot find any markers for another species? Is there a newer update?

I see that some species are merged together, like Enterobacter_cloacae_complex, and I have no problem with that. But I cannot find any markers for Bacillus licheniformis or Clostridium tertium, even though there are plenty of annotated genomes for them. Do you know what the best thing to do is?

Cheers
Lorenzo

Hi Lorenzo,
The latest CHOCOPhlAn version is 201901, so you are already using the latest one. A species can lack markers in the MetaPhlAn database for different reasons: for example, we were not able to find at least 10 species-unique markers, or no annotated genome of that species was present in the UniProt Proteomes portal at the time the database was created.
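
If you want to check several species at once, a rough Python sketch like the one below can count the markers per clade in the database. It rests on assumptions you should verify against your own copy: that the database .pkl file is a bz2-compressed pickle, and that it holds a "markers" dictionary whose entries each carry a "clade" field. The function names are invented for this example.

```python
import bz2
import pickle
from collections import Counter


def load_db(pkl_path):
    """Load the database, ASSUMING it is a bz2-compressed pickle."""
    with bz2.BZ2File(pkl_path, "rb") as handle:
        return pickle.load(handle)


def markers_per_clade(db):
    """Count markers per clade, ASSUMING db["markers"][name]["clade"] exists."""
    return Counter(info["clade"] for info in db["markers"].values())


def report(db, clades):
    """Return {clade: number of markers}, with 0 for clades that have none."""
    counts = markers_per_clade(db)
    return {clade: counts.get(clade, 0) for clade in clades}

# Hypothetical usage; the path is a placeholder for your own database file:
# db = load_db("<path_to_your_database>.pkl")
# print(report(db, ["s__Alistipes_finegoldii", "s__Bacillus_licheniformis"]))
```

A count of 0 means the clade name has no markers in the database under that exact spelling, so it is worth checking for merged or renamed species before concluding that markers are missing.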
I will point @fbeghini here; he was the person in charge of building this database and can probably help you better than I can with this. In the meantime, you can take a look at this post opened recently in the MetaPhlAn subforum: How to create marker sequences from a genome to add to metaphlan database?
Best,
Aitor