Identical values for different RXN numbers (with the same gene family)

Hi Biobakery staff,

The output file (genefamilies) of humann (v3.7) has identical values for different RXN numbers that were assigned to the same gene family, as shown here:

Shall I delete the replicas?
Could you give some suggestions?

Can you say more about how you created this file? I agree that it looks odd with all the identical very tiny values.

Hi Franzosa,

Thank you for your reply.
I followed the tutorial to get this file, and the steps were as follows:

  1. humann_join_tables
  2. humann_regroup_table --input *_genefamilies_relab.tsv --output *_genefamilies_relab_rxn.tsv --groups uniref90_rxn
  3. humann_rename_table --input genefamilies_relab_rxn.tsv --output named_genefamilies_relab_rxn.tsv --names metacyc-rxn
  4. humann_split_stratified_table

My labmate also has the same issue.
Looking forward to hearing from you.

Best wishes,
Xia

A few thoughts.

1) There might be a loss of precision happening somewhere. I tend to work with gene families in CPMs rather than relative abundance since the numbers are bigger. At one point I think there was a bug in one of the scripts that truncated at a small number of decimals by default, so if genes were expressed in units of relative abundance, distinct values might be forced to the same very small value.

2) It looks like many of those RXNs legitimately map to the same EC number. That probably means they are associated with the same sets of sequences, in which case their abundances would all be a function of the total abundance of those sequences.
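To make both points concrete, here is a small toy sketch in Python. It is not HUMAnN's code: the gene family names, RXN IDs, abundance values, the 5-decimal rounding, and the sum-based regrouping are all assumptions chosen only for illustration.

```python
# Toy illustration only: every identifier and number below is made up.

# Hypothetical relative abundances (fractions of the total) for three gene families.
relab = {
    "UniRef90_A": 0.0000141,
    "UniRef90_B": 0.0000138,
    "UniRef90_C": 0.0000144,
}

# CPM (copies per million) is relative abundance scaled by one million.
cpm = {gene: value * 1_000_000 for gene, value in relab.items()}


def round_table(table, decimals):
    """Round every value, mimicking a script that writes only a few decimals."""
    return {key: round(value, decimals) for key, value in table.items()}


# Point 1: at 5 decimals the relative abundances collapse to a single value,
# while the same data expressed in CPM stays distinguishable.
relab_rounded = round_table(relab, 5)
cpm_rounded = round_table(cpm, 5)
distinct_relab = len(set(relab_rounded.values()))
distinct_cpm = len(set(cpm_rounded.values()))

# Point 2: two hypothetical RXNs that map to exactly the same gene families.
rxn_to_genes = {
    "RXN-0001": ["UniRef90_A", "UniRef90_B"],
    "RXN-0002": ["UniRef90_A", "UniRef90_B"],
    "RXN-0003": ["UniRef90_C"],
}


def regroup_by_sum(table, groups):
    """Sum member abundances per group (a simplified stand-in for regrouping)."""
    return {
        group: sum(table[member] for member in members)
        for group, members in groups.items()
    }


rxn_cpm = regroup_by_sum(cpm, rxn_to_genes)

print("distinct rounded values (relab, cpm):", distinct_relab, distinct_cpm)
print("regrouped CPM per RXN:", rxn_cpm)
```

In this sketch, RXN-0001 and RXN-0002 end up with the same abundance simply because they share the same members, which is expected behaviour rather than an error. The rounding collapse, by contrast, goes away once the values are expressed in CPM.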

Hi Xia!

Did you end up merging the counts together for this? I am running into the same issue. I have many different RXN #'s that share the same EC number.

I wasn’t sure if I need to rerun everything from the beginning to regenerate the data and see if the issue happens again.

Thanks!
Emily

Hi Emily,

Thank you for asking.
No, I did not merge the counts. I used the KEGG/EC/GO annotations instead, which did not have this issue.

Xia

Hi Xia,
Thanks for your answer! I’m a bit new to this, so I’m sorry if this is a silly question.

When you say that you used the KEGG/EC/GO annotation, does that mean you used a different program? Or did you only use the gene annotations that did not have an RXN number?

Thank you!
Emily

Hi Emily,

Sorry for the late response. I used the humann_regroup_table function for the annotation. You will find more details on the HUMAnN GitHub page (biobakery/humann).
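As a rough sketch of what that can look like, the Python snippet below only builds the command lines for regrouping the same gene families table into KEGG, EC and GO groups. It does not run them. The input and output file names are hypothetical, and the group names (uniref90_ko, uniref90_level4ec, uniref90_go) are assumed to be available in your installation; they normally require the utility mapping database to be downloaded first, so please check `humann_regroup_table --help`.

```python
import shlex

# Assumed group names; confirm them with `humann_regroup_table --help`.
GROUPS = {
    "kegg": "uniref90_ko",
    "ec": "uniref90_level4ec",
    "go": "uniref90_go",
}


def regroup_command(input_path, output_path, groups):
    """Build (but do not run) a humann_regroup_table command as an argv list."""
    return [
        "humann_regroup_table",
        "--input", input_path,
        "--output", output_path,
        "--groups", groups,
    ]


# Hypothetical file names, used only for illustration.
input_table = "genefamilies_relab.tsv"
commands = {
    label: regroup_command(input_table, f"genefamilies_relab_{label}.tsv", groups)
    for label, groups in GROUPS.items()
}

# Print shell-ready command lines that can be copied into a terminal.
for argv in commands.values():
    print(shlex.join(argv))
```

So it is the same program as in the RXN workflow; only the `--groups` value changes.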

Hope this helps.
