How to create absolute abundance tables at family, genus, class and phylum levels

Dear developer,

I am stuck extracting the absolute abundance of species, genus, family, class and phylum from the big merged table. Here are the details.

  1. I ran MetaPhlAn 4.0.5 with the -t rel_ab_w_read_stats option to get the absolute read abundance for my 200 samples.

  2. I merged all the tables into one bigger table with the merge_metaphlan_tables_abs.py script that the developers provided. The new big table is called merged_abs.txt.

  3. Next, I extracted only the species abundance as below:
    grep -E "s__|clade" merged_abs.txt | grep -v "t__" | sed 's/^.*s__//g' > merged_all_species.txt
    That looked fine.

  4. However, when I tried to extract genus, family, class and phylum abundances from the big table, I only replaced the corresponding letter in grep -E "s__|clade" and kept the rest of the command the same. So for genus abundance I used grep -E "g__|clade", for phylum I used grep -E "p__|clade", and so on…

The species abundance table looks fine, but all the other tables contain mixed-up taxa. For example, I still get species rows even after running the command at the class level only.

Could you please help me solve this issue? I know I can run MetaPhlAn at, let's say, class level by providing the 'c' flag, but then it gives me absolute abundance, whereas I need absolute read abundance.

Many thanks in advance!

Hi @mars
You can follow this rule: select the level you want, then exclude the rows of the next lower level.

  • For getting only the species abundance: grep -E "s__|clade" merged_abs.txt | grep -v "t__" | …
  • For getting only the genus abundance: grep -E "g__|clade" merged_abs.txt | grep -v "s__" | …
  • For getting only the family abundance: grep -E "f__|clade" merged_abs.txt | grep -v "g__" | …
    And so on
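
For the remaining levels, the same rule can be wrapped in a small loop. This is only a sketch under a few assumptions: the function names extract_level and extract_all_levels and the output file names are made up for illustration, the header line is assumed to contain the word "clade" as in the commands above, and the sed step is adapted from the original species command so that it strips everything up to the prefix of the level being extracted.

```shell
# Sketch only: extract_level and extract_all_levels are hypothetical helper names.

# Print the header plus the rows of exactly one taxonomic level.
# $1 = merged table, $2 = prefix of the wanted level (e.g. g),
# $3 = prefix of the next lower level (e.g. s), whose rows are dropped.
extract_level() {
    grep -E "${2}__|clade" "$1" | grep -v "${3}__" | sed "s/^.*${2}__//"
}

# Write one table per level, from phylum down to species.
# The output names merged_level_<prefix>.txt are an assumption.
extract_all_levels() {
    local pair level lower
    for pair in p:c c:o o:f f:g g:s s:t; do
        level=${pair%%:*}    # wanted level, e.g. g
        lower=${pair##*:}    # next lower level, e.g. s
        extract_level "$1" "$level" "$lower" > "merged_level_${level}.txt"
    done
}
```

Calling extract_all_levels merged_abs.txt would then write merged_level_p.txt, merged_level_c.txt and so on into the current directory. Note that the sed pattern has to change together with the grep pattern: leaving it at s__ for the genus table would not trim the lineage prefix, because genus rows contain no s__.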

Should the above commands also work for extracting the relative abundance tables at genus, family or species level?

Exactly, that should do the job.