Question about PICRUSt

From: 毛维奥 <mmmaoweiao@126.com>

Hello, I'm a user of PICRUSt and I have a question about data processing. I've read your paper, "Predictive Functional Profiling of Microbial Communities Using 16S rRNA Marker Gene Sequences", but I can't understand the following paragraph.

‘Because 16S rRNA copy number varies greatly among different bacteria and archaea, the user’s table of OTUs is normalized by dividing the abundance of each organism by its predicted 16S copy number. The 16S rRNA copy numbers for each organism are themselves inferred as a quantitative trait by ancestral state reconstruction during the genome prediction step. Normalized OTU abundances are then multiplied by the set of gene family abundances calculated for each taxon during the gene content inference step. The final output from metagenome prediction is thus an annotated table of predicted gene family counts for each sample, where gene families can be orthologous groups or other identifiers such as KOs, COGs or Pfams. The resulting tables are directly comparable to those generated by metagenome annotation pipelines such as HUMAnN or MG-RAST.’

Could you please tell me how to standardize the PICRUSt output so that I get relative abundances that can be compared among different samples?

Looking forward to your reply.

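If the samples (columns) of the output table are not already sum-normalized to a constant value (e.g. 1), you can apply that operation yourself: divide each value by the total of its column. This converts the raw abundances to relative abundances and removes any latent effect of sequencing depth, so the samples can be compared with each other.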
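
As a rough illustration of that column-wise sum normalization, here is a minimal Python sketch. It is not part of PICRUSt: the function name `to_relative_abundance`, the toy table, and the KO identifiers and counts in it are made up for the example, and it assumes the predicted table has already been loaded with gene families as rows and samples as columns.

```python
def to_relative_abundance(counts):
    """Sum-normalize each sample (column) so that it totals 1.

    counts: dict mapping a gene family ID to a list of per-sample
    abundances, with the samples in the same order in every list.
    """
    n_samples = len(next(iter(counts.values())))
    # Total predicted abundance per sample (column sums).
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    # Divide every entry by its column total; an all-zero sample stays at 0.
    return {
        gene: [row[j] / totals[j] if totals[j] else 0.0 for j in range(n_samples)]
        for gene, row in counts.items()
    }


# Hypothetical toy table: 3 gene families (rows) x 2 samples (columns).
predicted = {
    "K00001": [10.0, 200.0],
    "K00002": [30.0, 200.0],
    "K00003": [60.0, 600.0],
}

relative = to_relative_abundance(predicted)
for gene, values in relative.items():
    print(gene, values)
```

After this step each sample sums to 1, so a value such as 0.1 for a gene family reads as 10% of that sample's predicted gene content, regardless of how deeply the sample was sequenced.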