Identical values for different RXN numbers (within the same gene family)

Hi Biobakery staff,

The genefamilies output file from humann (v3.7) has identical values for different RXN numbers that were assigned to the same gene family, as shown here:

Shall I delete the replicas?
Could you give some suggestions?

Can you say more about how you created this file? I agree that it looks odd with all the identical very tiny values.

Hi Franzosa,

Thank you for your reply.
I followed the tutorial to get this file, and the steps were as follows:

  1. humann_join_tables
  2. humann_regroup_table --input *_genefamilies_relab.tsv --output *_genefamilies_relab_rxn.tsv --groups uniref90_rxn
  3. humann_rename_table --input genefamilies_relab_rxn.tsv --output named_genefamilies_relab_rxn.tsv --names metacyc-rxn
  4. humann_split_stratified_table

My labmate also has the same issue.
Looking forward to hearing from you.

Best wishes,
Xia

A few thoughts. 1) There might be a loss of precision happening somewhere. I tend to work with gene families in CPMs rather than relative abundance since the numbers are bigger. At one point I think there was a bug in one of the scripts that truncated values to a small number of decimals by default, so gene abundances expressed as relative abundance could be forced to the same very small values. 2) It looks like many of those RXNs legitimately map to the same EC number. That probably means they are associated with the same sets of sequences, in which case their abundances would all be a function of the total abundance of those sequences, and identical values would be expected.
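
To make both points concrete, here is a minimal Python sketch. It is not HUMAnN code: the gene family IDs, RXN IDs, abundances, the mapping, and the 6-decimal output format are all made up for illustration, and the helper functions are hypothetical stand-ins. Note that Python's fixed-decimal formatting rounds rather than truncates, but the effect on very small numbers is the same. It shows (1) distinct relative abundances collapsing to one tiny value when written with few decimals, while the same values in CPM stay distinct, and (2) two reactions that map to the same set of gene families getting identical sums.

```python
from collections import defaultdict


def to_cpm(relab):
    """Scale relative abundances (fractions of 1) to copies per million."""
    return {name: value * 1e6 for name, value in relab.items()}


def write_fixed(values, decimals):
    """Format each value with a fixed number of decimals (assumed output format)."""
    return {name: f"{value:.{decimals}f}" for name, value in values.items()}


def regroup_sum(abundances, groups):
    """Sum member abundances per group (simplified stand-in for regrouping)."""
    totals = defaultdict(float)
    for group, members in groups.items():
        for member in members:
            totals[group] += abundances.get(member, 0.0)
    return dict(totals)


# Made-up relative abundances for three hypothetical gene families.
relab = {"UniRef90_A": 1.2e-6, "UniRef90_B": 1.4e-6, "UniRef90_C": 0.8e-6}

# Hypothetical mapping: RXN-1 and RXN-2 share exactly the same gene families.
groups = {
    "RXN-1": ["UniRef90_A", "UniRef90_B"],
    "RXN-2": ["UniRef90_A", "UniRef90_B"],
    "RXN-3": ["UniRef90_C"],
}

# Point 1: with 6 decimals, every relative abundance is written as the same
# tiny value, while the CPM versions remain distinguishable.
relab_text = write_fixed(relab, 6)
cpm_text = write_fixed(to_cpm(relab), 6)
print("relab:", relab_text)
print("cpm:  ", cpm_text)

# Point 2: reactions backed by the same gene families get the same total,
# regardless of any precision issue.
rxn = regroup_sum(relab, groups)
print("rxn:  ", rxn)
```

If the identical values in your table persist after renormalizing to CPM before regrouping, the second explanation (shared underlying gene families) is the more likely one, and the rows are not errors that need deleting.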