Output as counts not proportions

Which parameter should I use to get the output as it stands just before normalization for library size? I would like counts rather than relative abundances, but with all of the other necessary normalization applied (e.g., normalizing the total number of reads in each clade by the nucleotide length of its markers). I plan to use compositional data analysis and apply a log-ratio transformation to the output, and my method requires counts, not proportions.

Would that be the “coverage” column in the output of -t rel_ab_w_read_stats?

MetaPhlAn can provide normalized RPK with -t marker_counts --nreads <<metagenome size>>, but this does not include normalization by genome size, since the output is per marker and not per species (species are processed after the marker presence step).

The ‘coverage’ column from rel_ab_w_read_stats is the estimated coverage of the species present in the metagenome.

How can I get the output as species counts and not proportions? I think that would be the local abundance of each clade before it is divided by the sum of all the local abundances.

You cannot. As the program is currently written, you can either get the count for each marker or use the estimated read counts in the estimated_number_of_reads_from_the_clade column from rel_ab_w_read_stats.
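For anyone who wants to go from that column to log-ratios, here is a minimal sketch, not an official MetaPhlAn utility. It assumes the rel_ab_w_read_stats output is tab-separated, that its header is a comment line starting with # that names the columns clade_name and estimated_number_of_reads_from_the_clade, and that species rows are the ones whose last clade level starts with s__. The function names, the example table, and the 0.5 pseudocount are made up for illustration; check them against your own output file and choose a zero-handling strategy that suits your analysis.

```python
import csv
import math

# Column named in the answer above; the rest of the layout is an assumption.
COUNT_COL = "estimated_number_of_reads_from_the_clade"


def species_counts(lines):
    """Return {species: estimated read count} from rel_ab_w_read_stats lines."""
    header = None
    counts = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#"):
            # Assumed: the comment line that names the count column is the header.
            if COUNT_COL in row:
                header = [name.lstrip("#") for name in row]
            continue
        if header is None:
            raise ValueError("no header line naming " + COUNT_COL)
        record = dict(zip(header, row))
        leaf = record["clade_name"].split("|")[-1]
        if leaf.startswith("s__"):  # keep species-level rows only
            counts[leaf] = float(record[COUNT_COL])
    return counts


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount is an arbitrary choice."""
    if not counts:
        return {}
    logs = {k: math.log(v + pseudocount) for k, v in counts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {k: v - mean_log for k, v in logs.items()}


# Made-up table for illustration only, not real MetaPhlAn output.
EXAMPLE_LINES = [
    "#clade_name\tclade_taxid\trelative_abundance\tcoverage\t" + COUNT_COL,
    "k__Bacteria\t2\t100.0\t1.0\t1000",
    "k__Bacteria|p__X|s__A\t2|1|11\t60.0\t0.6\t600",
    "k__Bacteria|p__X|s__B\t2|1|12\t40.0\t0.4\t400",
]

if __name__ == "__main__":
    for species, value in clr(species_counts(EXAMPLE_LINES)).items():
        print(species, round(value, 3))
```

To run it on a real file, pass an open file handle instead of the example list, e.g. species_counts(open("profile.txt")), where profile.txt is a placeholder for your own output file. Keep in mind that these values are estimates derived by the program, not raw read counts.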
