Taxonomy for markers (-t marker_counts)

c.e.chong · August 21, 2020, 1:51pm

Hi,

I have run MetaPhlAn 3 with -t marker_counts as I wanted to output read counts for each marker to do further statistical analysis such as phyloseq and deseq2.

This has given me an output of what I assume is each marker and the read count. Is there anyway to get the taxonomy of each marker?

I am also getting an error when trying to merge the tables together. Does merge_metaphlan_tables.py only work with the rel_ab output?

 Traceback (most recent call last):
  File "/pub38/cchong/miniconda2/envs/metaphlan3/bin/merge_metaphlan_tables.py", line 10, in <module>
    sys.exit(main())
  File "/pub38/cchong/miniconda2/envs/metaphlan3/lib/python3.7/site-packages/metaphlan/utils/merge_metaphlan_tables.py", line 78, in main
    merge(args.aistms, sys.stdout)
  File "/pub38/cchong/miniconda2/envs/metaphlan3/lib/python3.7/site-packages/metaphlan/utils/merge_metaphlan_tables.py", line 46, in merge
    names = names,
UnboundLocalError: local variable 'names' referenced before assignment

Thanks in advance!

fbeghini · August 24, 2020, 7:33am

Directly no, but you can use the mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2 file in order to get the 'taxonomy' field associated to each marker.

Yes, only with rel_ab and rel_ab_w_read_stats

david-castillo · November 11, 2020, 10:01am

Hello Charlotte and Francesco,
I came across a similar problem: trying to translate markers when running metaphlan with the -t marker_counts option.
As suggested by Francesco I used the mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2 for this. I created a small script and managed to obtain the format I wanted, these are the first lines (the second field is the read count (adding all the read count markers that contributed to a bug)):

k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Lachnospiraceae|g__Lachnospiraceae_unclassified|s__Eubacterium_rectale	58749
k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Burkholderiales|f__Burkholderiaceae|g__Cupriavidus|s__Cupriavidus_metallidurans	1
k__Bacteria|p__Synergistetes|c__Synergistia|o__Synergistales|f__Synergistaceae|g__Pyramidobacter|s__Pyramidobacter_piscolens	78
k__Bacteria|p__Verrucomicrobia|c__Verrucomicrobiae|o__Verrucomicrobiales|f__Akkermansiaceae|g__Akkermansia|s__Akkermansia_muciniphila	2712
k__Bacteria|p__Fusobacteria|c__Fusobacteriia|o__Fusobacteriales|f__Fusobacteriaceae|g__Fusobacterium|s__Fusobacterium_equinum	4
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Xanthomonadales|f__Xanthomonadaceae|g__Lysobacter|s__Lysobacter_enzymogenes	1
k__Bacteria|p__Fusobacteria|c__Fusobacteriia|o__Fusobacteriales|f__Leptotrichiaceae|g__Leptotrichia|s__Leptotrichia_sp_oral_taxon_225	1
k__Bacteria|p__Proteobacteria|c__Deltaproteobacteria|o__Desulfovibrionales|f__Desulfovibrionaceae|g__Bilophila|s__Bilophila_wadsworthia	1111
k__Bacteria|p__Proteobacteria|c__Deltaproteobacteria|o__Desulfovibrionales|f__Desulfovibrionaceae|g__Desulfovibrionaceae_unclassified|s__Desulfovibrionaceae_bacterium	3

Same thing as what Charlotte mentions, I want to use this table for further analysis, I’m thinking DESeq2 to perform a differential abundance analysis. However, I’m concerned I’m not taking into consideration thresholds such as percentage of markers with zero counts to discard an specie, among others.
Therefore, my question is: Is it valid to use the read counts table with the taxonomy info for differential abundance with DESeq2? or I’m beeing too naive?
Regards,
David

fbeghini · November 20, 2020, 11:08am

Yes! After transformation you should consider the different read depth by adding an offset term in your model.

There’s also an example of the count conversion in the curatedMetagenomicData vignette here

Emiliana_Romagnoli · October 6, 2021, 7:06pm

Hello David,

Can you share your script?

Emiliana

johannesbjork · October 8, 2021, 11:41am

But this is assuming that the read depth per sample is known. In many cases, this is not the case. There are several threads where people are asking about how to output counts rather than the normalised relative abundances, but none provide a clear response. Relative abundances are suboptimal as information is lost in the normalisation, and it is also not suitable for common analyses such as Bray-Curtis.

Topic		Replies	Views
-t marker_counts output with taxonomy results MetaPhlAn	3	487	September 28, 2020
Obtain taxonomy for marker_ab_table MetaPhlAn	0	288	January 15, 2022
Metaphlan marker level analysis MetaPhlAn	6	908	July 22, 2020
Metaphlan with -t marker counts, not producing counts MetaPhlAn	2	153	March 26, 2024
Merge_metaphlan_tables.py with rel_ab_w_read_stats MetaPhlAn	3	1380	December 7, 2022

Taxonomy for markers (-t marker_counts)

Related topics