Running calculate_unifrac.R ignores first sample

david-castillo · January 22, 2021, 11:25pm

I was running the calculate_unifrac.R script (MetaPhlAn/calculate_unifrac.R at master · biobakery/MetaPhlAn · GitHub) and noticed that it ignored the first sample in the merged MetaPhlAn input file.
After reviewing the code, the problem seems to come from this line (line 44):
mpa_table <- mpa_table[grep('s__',mpa_table[,1]),-2]
After replacing that line with the following one, the problem was solved:
mpa_table <- mpa_table[grep('s__',mpa_table[,1]),]
Just letting you know.
Best regards
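
For readers following along, here is a minimal sketch of what the `-2` column index does in that subsetting step. The toy tables, their column names (`clade_name`, `NCBI_tax_id`, `sample1`, `sample2`) and the helper `keep_species_drop_col2` are made up for illustration and are not taken from the script. The point is only that a negative column index in R drops the column at that position, so if the second column of the input is already a sample, that sample is removed.

```r
# Toy tables; column names and values are assumptions for illustration only.
with_taxid <- data.frame(
  clade_name  = c("k__Bacteria", "k__Bacteria|p__X|s__A", "k__Bacteria|p__X|s__B"),
  NCBI_tax_id = c("2", "2|10|100", "2|10|101"),
  sample1     = c(100, 60, 40),
  sample2     = c(100, 30, 70),
  stringsAsFactors = FALSE
)
# Same table without the taxonomy ID column.
without_taxid <- with_taxid[, c("clade_name", "sample1", "sample2")]

# Hypothetical helper mirroring the subsetting quoted above:
# keep species-level rows, drop whatever sits in column 2.
keep_species_drop_col2 <- function(tab) tab[grep("s__", tab[, 1]), -2]

res_with    <- keep_species_drop_col2(with_taxid)
res_without <- keep_species_drop_col2(without_taxid)

print(colnames(res_with))     # "clade_name" "sample1" "sample2"
print(colnames(res_without))  # "clade_name" "sample2"
```

In the second case the first sample disappears, which would match the behaviour described in this post if the merged file had no taxonomy ID column.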

fbeghini · January 29, 2021, 4:14pm

That’s weird. The first two columns of the merge_metaphlan_tables.py output are the taxonomy string and the NCBI taxonomy ID of each clade, so -2 should discard both these columns.

Are you using the latest version of merge_metaphlan_tables.py?
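
One way to check which layout a merged file has is to inspect its second column before dropping anything. The sketch below is only a rough heuristic under stated assumptions: the helper `drop_taxid_if_present` is hypothetical, the `read.table` arguments and the toy file contents are assumed rather than copied from the script, and it treats a non-numeric second column as a taxonomy ID column.

```r
# Hypothetical helper: drop column 2 only when it is not numeric,
# i.e. when it looks like taxonomy IDs rather than abundances.
drop_taxid_if_present <- function(tab) {
  if (!is.numeric(tab[, 2])) tab <- tab[, -2]
  tab
}

# Small stand-in for a merged table WITHOUT a taxonomy ID column,
# written to a temporary file (contents are made up).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "# toy merged table",
  "clade_name\tsample1\tsample2",
  "k__Bacteria\t100\t100",
  "k__Bacteria|p__X|s__A\t60\t30",
  "k__Bacteria|p__X|s__B\t40\t70"
), tmp)

tab <- read.table(tmp, comment.char = "#", sep = "\t",
                  header = TRUE, check.names = FALSE)

# Keep species-level rows, then drop column 2 only if it is not a sample.
species <- drop_taxid_if_present(tab[grep("s__", tab[, 1]), ])
print(colnames(species))  # "clade_name" "sample1" "sample2"
```

The heuristic would fail if a taxonomy ID column happened to be read as numeric, so checking the header names of the actual file by eye is the safer route.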
