I was running the calculate_unifrac.R script (from the master branch of the biobakery/MetaPhlAn repository on GitHub) and noticed that it ignored the first sample in the merged MetaPhlAn input file.
After reviewing the code, the problem seems to come from this line (line 44):
mpa_table <- mpa_table[grep('s__',mpa_table[,1]),-2]
After replacing it with the following line, the problem was solved:
mpa_table <- mpa_table[grep('s__',mpa_table[,1]),]
Just letting you know.
Best regards
That’s weird. The first two columns of the merge_metaphlan_tables.py output are the taxonomy string and the NCBI taxonomy ID of each clade, so the -2 should only drop the taxonomy ID column, not a sample column.
Are you using the latest version of merge_metaphlan_tables.py?
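
To make the mismatch concrete, here is a minimal R sketch of what the -2 column index does on two toy tables: one with a taxonomy ID in the second column, and one without it (which is what the symptom above would suggest, though that is only an assumption about the reporter's file). The column names, clade names, IDs, and abundances are invented for illustration, and drop_taxid_if_present is a hypothetical helper, not part of calculate_unifrac.R.

```r
# Toy merged table WITH a taxonomy ID column (all values are made up).
with_taxid <- data.frame(
  clade_name  = c("k__Bacteria",
                  "k__Bacteria|s__Species_A",
                  "k__Bacteria|s__Species_B"),
  NCBI_tax_id = c("2", "2|111", "2|222"),
  sample1     = c(100, 60, 40),
  sample2     = c(100, 30, 70),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Same data WITHOUT the taxonomy ID column (assumed layout of an older merge).
without_taxid <- with_taxid[, -2]

# The filtering step quoted from the script: keep species rows, drop column 2.
filter_species <- function(tab) tab[grep("s__", tab[, 1]), -2]

colnames(filter_species(with_taxid))
# "clade_name" "sample1" "sample2"   (only the taxonomy ID is removed)

colnames(filter_species(without_taxid))
# "clade_name" "sample2"             (sample1 is silently lost)

# Hypothetical defensive variant: drop column 2 only when it is not numeric,
# i.e. when it looks like a taxonomy ID rather than an abundance column.
drop_taxid_if_present <- function(tab) {
  species <- tab[grep("s__", tab[, 1]), , drop = FALSE]
  if (ncol(species) >= 2 && !is.numeric(species[[2]])) {
    species <- species[, -2, drop = FALSE]
  }
  species
}

colnames(drop_taxid_if_present(with_taxid))
colnames(drop_taxid_if_present(without_taxid))
```

With the defensive variant, both toy tables keep sample1 and sample2. Checking the column type is only a heuristic; re-running the merge with a current merge_metaphlan_tables.py is probably the cleaner fix.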