Request for guidance in tracking sequences analyzed by MetaPhlAn 2.0

Crist_John_Pastor · July 1, 2022, 8:52am

Hello Sir and Ma’am, I hope this message finds you well. I am currently processing metagenomic samples. When we analyzed our Illumina metagenomic sequences, the BIOM file generated by MetaPhlAn 2.0 listed many microorganisms, including one of the main targets of the research, Influenza A virus, as shown below.

{"id": "8797", "metadata": {"taxonomy": ["k__Viruses", "p__Viruses_noname", "c__Viruses_noname", "o__Viruses_noname", "f__Orthomyxoviridae", "g__Influenzavirus_A", "s__Influenza_A_virus", "t__PRJNA15622"]}}

{"id": "18976", "metadata": {"taxonomy": ["k__Viruses", "p__Viruses_noname", "c__Viruses_noname", "o__Viruses_noname", "f__Orthomyxoviridae", "g__Influenzavirus_A", "s__Influenza_A_virus", "t__PRJNA14892"]}}

As this report matters greatly to those concerned, how can I extract the nucleotide sequences that MetaPhlAn 2.0 detected from our metagenomic sequencing data? Do 8797 and 18976 correspond to the 8797th and 18976th sequences in our metagenomic sequencing data? Lastly, is it possible to download the reference marker sequences that MetaPhlAn used to identify the organisms in our samples?

Thank you very much!

Best regards,

Crist John M. Pastor

aitor.blancomiguez · July 18, 2022, 12:37pm

Hi @Crist_John_Pastor
To extract the reads mapping against your target species, you need to generate the SAM file while running MetaPhlAn with the option --samout. Then filter the SAM file, keeping only the alignments of reads that map against your species' markers (you can check the IDs of the markers of your species here: http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v20_m200_marker_info.txt.bz2), and extract the reads from the filtered SAM file using samtools. For the marker sequences, you can download the FASTA file from here: http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v20_m200.tar
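
In case it helps, here is a minimal Python sketch of the filtering step described above, as a possible alternative to doing the extraction with samtools. It is not an official MetaPhlAn utility. It assumes that each line of the marker info file is a marker ID, a tab, and an annotation that mentions the clade name (for example s__Influenza_A_virus), so please check that against the real file before relying on it. The function names are hypothetical and only for illustration.

```python
# Sketch only: select reads from a SAM file that align to the markers of one clade.
# Assumption: marker info lines look like "<marker_id>\t<annotation mentioning the clade>".

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def markers_for_clade(marker_info_lines, clade):
    """Return the set of marker IDs whose annotation text contains `clade`."""
    ids = set()
    for line in marker_info_lines:
        marker_id, _, annotation = line.rstrip("\n").partition("\t")
        if clade in annotation:
            ids.add(marker_id)
    return ids


def reads_on_markers(sam_lines, marker_ids):
    """Yield (read_name, sequence, quality) for alignments hitting `marker_ids`."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:  # read is unmapped
            continue
        if fields[2] not in marker_ids:  # column 3 is the reference (marker) name
            continue
        seq, qual = fields[9], fields[10]
        if flag & 0x10:
            # Reverse-strand alignments are stored reverse-complemented in SAM;
            # restore the original read orientation.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        yield fields[0], seq, qual


def to_fastq(records):
    """Format (name, sequence, quality) tuples as FASTQ text."""
    return "".join(f"@{name}\n{seq}\n+\n{qual}\n" for name, seq, qual in records)
```

To use it, you would open the decompressed (or bz2.open-ed) marker info file and the file written by --samout, pass the open file objects to the two functions, and write the result of to_fastq to disk. Matching the clade as plain text in the annotation is a deliberate simplification; a stricter version would parse the annotation and compare the clade field exactly.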
