I want to customize the MetaPhlAn 3.0 database. The fungus I need does not seem to be in the database, so I added marker sequences, extracted from a Microsporidia genome and stored in the file new_marker.fasta, to the marker set. I then ran the following commands but got an error:
import pickle
import bz2
db = pickle.load(bz2.open(‘metaphlan_databases/mpa_v30_CHOCOPhlAn_201901.pkl’, ‘r’))
db[‘taxonomy’][‘k__Fungi|p__Fungi_incertae_sedis|c__Microsporidia|o__Apansporoblastina|f__Unikaryonidae|g__Encephalitozoon|s__Encephalitozoon_intestinalis’]=(||||||2591,2216898)
File “”, line 1
db[‘taxonomy’][‘k__Fungi|p__Fungi_incertae_sedis|c__Microsporidia|o__Apansporoblastina|f__Unikaryonidae|g__Encephalitozoon|s__Encephalitozoon_intestinalis|t__GCA_000146465.1’]=(’||||||2591’,2216898)
^
SyntaxError: invalid syntax
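For reference, here is a minimal sketch of the same edit written with plain ASCII quotes. One plausible cause of the SyntaxError above is the typographic quotes (‘ ’) in the pasted command, and the tuple without quotes around ||||||2591 would also fail to parse, though this is only a guess from the text shown. To keep the sketch runnable anywhere, it works on a toy stand-in file (`toy_db.pkl`, a hypothetical name) rather than the real `mpa_v30_CHOCOPhlAn_201901.pkl`. The taxonomy string and the tuple are copied from the traceback line.

```python
import bz2
import pickle
import tempfile
from pathlib import Path

# Toy stand-in for the real database pickle (assumption: only the
# 'taxonomy' key is modelled here; the real file contains more).
toy_db = {"taxonomy": {}}

workdir = Path(tempfile.mkdtemp())
db_path = workdir / "toy_db.pkl"  # hypothetical file name
with bz2.open(db_path, "wb") as fh:
    pickle.dump(toy_db, fh)

# Load the bz2-compressed pickle, as in the original command.
with bz2.open(db_path, "rb") as fh:
    db = pickle.load(fh)

# Plain ASCII quotes around both the key and the first tuple element.
taxonomy = (
    "k__Fungi|p__Fungi_incertae_sedis|c__Microsporidia|"
    "o__Apansporoblastina|f__Unikaryonidae|g__Encephalitozoon|"
    "s__Encephalitozoon_intestinalis|t__GCA_000146465.1"
)
db["taxonomy"][taxonomy] = ("||||||2591", 2216898)

# Write the modified database back as a bz2-compressed pickle.
with bz2.open(db_path, "wb") as fh:
    pickle.dump(db, fh, pickle.HIGHEST_PROTOCOL)
```

With the real database, `db_path` would point at the downloaded .pkl file instead, and the step that creates the toy file would be dropped.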
Hi fbeghini, I have another question, this time about choosing marker genes. My research focuses on fungi. How should I choose marker genes for them? It seems that core genes need to be extracted, but how does MetaPhlAn distinguish different species within the same genus? I tried using the small subunit ribosomal RNA genes, and the labeled species did seem to be classified. But such a direct approach makes me doubt the accuracy of the results. Is it reasonable to use small subunit ribosomal RNA genes directly?
Any gene that meets the requirements (it is core to the species and unique to it) can be a marker gene. Using 18S is not advised, since multiple copies can be present and it is conserved among different species. Internally, MetaPhlAn distinguishes between species by looking at which clade each marker is assigned to (the ['clade'] and ['taxonomy'] entries in the pkl db).
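To make the coreness and uniqueness criteria concrete, here is a toy sketch. It is not MetaPhlAn's actual marker selection pipeline. All genome, species, and gene names are made up, and the strict rule used here (present in every genome of the species, absent from every other genome) is an assumption chosen for simplicity. It shows why a gene shared across species, such as 18S, drops out, and how each surviving gene is tied to a single clade.

```python
# Toy gene presence table: genome -> set of gene families it carries.
# All names below are hypothetical and only serve the illustration.
genomes = {
    "genomeA1": {"geneX", "geneY", "rRNA_18S"},
    "genomeA2": {"geneX", "rRNA_18S"},
    "genomeB1": {"geneZ", "geneY", "rRNA_18S"},
}
species_of = {
    "genomeA1": "s__Species_A",
    "genomeA2": "s__Species_A",
    "genomeB1": "s__Species_B",
}


def candidate_markers(species):
    """Return genes found in every genome of `species` and in no other genome."""
    inside = [g for name, g in genomes.items() if species_of[name] == species]
    outside = [g for name, g in genomes.items() if species_of[name] != species]
    core = set.intersection(*inside)         # coreness: in all genomes of the species
    seen_elsewhere = set().union(*outside)   # genes seen in any other species
    return core - seen_elsewhere             # uniqueness: drop shared genes


# Assign each surviving gene to exactly one clade, mirroring the idea
# of a per-marker 'clade' entry.
markers = {}
for species in sorted(set(species_of.values())):
    for gene in sorted(candidate_markers(species)):
        markers[gene] = {"clade": species}

print(markers)
```

In this toy data rRNA_18S is present in every genome of both species, so it is core but not unique and is never selected. geneY is shared between the two species and is dropped for the same reason.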