Question About HUMAnN Output Filtering for SGB-Level Contributions

Hi,

I have a question about how HUMAnN outputs its results.
For each EC/pathway, the output includes a row for the total (relative) abundance of the feature in the community per sample, along with rows showing the contribution of different SGBs to that feature.

However, I’ve noticed that the number of taxa contributing to each feature varies. Many taxa have a value of zero in most samples.

My question is:
Is the output filtered to include only SGBs that have at least one non-zero value across all samples in the run?
What exactly is the filtering process HUMAnN uses for reporting SGB-level contributions?

Thanks!

I’m not sure I fully understand the basis for the question. If you profiled a bunch of samples, and SGB X contributed to EC Y in sample 1 but not in sample 2, then scripts for merging the profiles will often impute a 0 for Y|X in sample 2. HUMAnN itself is only ever concerned with one sample at a time, so you would only see 0s in a HUMAnN output as a result of choices made when merging the per-sample outputs.
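To make the merging point concrete, here is a minimal sketch of an outer join over per-sample tables. It is not HUMAnN's actual merging code: the `merge_profiles` function, the feature names, and the abundance values are all made up for illustration, and it only assumes that a merge step fills in 0 for any feature missing from a sample.

```python
def merge_profiles(profiles, fill=0.0):
    """Outer-join per-sample feature tables (hypothetical helper).

    `profiles` maps sample name -> {feature: abundance}. Any feature that
    is absent from a sample gets `fill` in the merged table.
    """
    # Collect every feature seen in any sample, keeping first-seen order.
    features = []
    seen = set()
    for table in profiles.values():
        for feature in table:
            if feature not in seen:
                seen.add(feature)
                features.append(feature)
    # Build one row per feature with one value per sample.
    return {
        feature: {
            sample: table.get(feature, fill)
            for sample, table in profiles.items()
        }
        for feature in features
    }


# Made-up per-sample outputs: SGB_X contributes to EC_Y only in sample1.
sample1 = {"EC_Y": 10.0, "EC_Y|SGB_X": 6.0, "EC_Y|SGB_Z": 4.0}
sample2 = {"EC_Y": 5.0, "EC_Y|SGB_Z": 5.0}

merged = merge_profiles({"sample1": sample1, "sample2": sample2})
for feature, row in merged.items():
    print(feature, row)
```

In this sketch the row `EC_Y|SGB_X` never existed in the sample2 table; the 0 only appears after merging. It also means that, under this kind of merge, every stratified row in the merged table is non-zero in at least one sample, simply because a row can only enter the table by being reported for some sample.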
