I’m trying out biobakery (latest Docker image).
I would like to understand how the read count information in output/kneaddata/merged/kneaddata_read_count_table.tsv and in output/humann/counts/humann_read_and_species_count_table.tsv compare to each other.
Example from one of my samples, in kneaddata_read_count_table.tsv:
|decontaminated hg37dec_v0.1 pair1|228858|
|decontaminated hg37dec_v0.1 pair2|228858|
|decontaminated hg37dec_v0.1 orphan1|7064|
|decontaminated hg37dec_v0.1 orphan2|8104|
and in humann_read_and_species_count_table.tsv:
If I add up final pair1 + final pair2 + final orphan1 + final orphan2, I get 472884, which is higher than what humann reports. If I exclude the orphans, I get 457716, which is lower than what humann reports.
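To make the arithmetic explicit, here is a minimal sketch of the two sums, assuming the four kneaddata counts listed above are the ones being added. The dictionary keys are just labels chosen for this example, not column names from the file.

```python
# Read counts copied from the kneaddata rows listed above (one sample).
counts = {
    "pair1": 228858,
    "pair2": 228858,
    "orphan1": 7064,
    "orphan2": 8104,
}

# Paired reads only (pair1 + pair2).
paired_only = counts["pair1"] + counts["pair2"]

# Paired reads plus both orphan files.
with_orphans = paired_only + counts["orphan1"] + counts["orphan2"]

print(paired_only)   # 457716
print(with_orphans)  # 472884
```

The humann count falls between these two values.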
Any hint of where this could stem from?