Hi, it looks like the humann4 command that creates the genefamilies tables has a glitch: it occasionally appends a gene name to a value and then adds a duplicate line, as in the example below. (See “0.04051809530_A0A0U0N762” in the penultimate line.) Is this a known problem? I found it in this one genefamilies file but not in others.
The command I ran was this:
humann --input singlesamples/magic-2712/magic-2712_concatenated.fastq --output humann_tmp --output-basename magic-2712 --threads 16 --taxonomic-profile metaphlan_tmp/magic-2712.txt --metaphlan-options "-t rel_ab_w_read_stats --index mpa_vOct22_CHOCOPhlAnSGB_202403"
See the tail of the output genefamilies file:
tail -n 5 magic-2712_2_genefamilies.tsv
UniRef90_A0A2X1KUD0 0.0509370340
UniRef90_A0A2X1KUD0|unclassified 0.0509370340
UniRef90_A0A0U0N762 0.0405180953
UniRef90_A0A0U0N762|unclassified 0.04051809530_A0A0U0N762 0.0405180749
UniRef90_A0A0U0N762|unclassified 0.0405180749
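In case it helps anyone check their own files for the same symptom, here is a minimal sketch of a scan I would use. It assumes each data row is exactly one feature name, a tab, and one numeric abundance, and that header lines start with "#". The function names are my own and are not part of HUMAnN.

```python
import re
import sys

# Assumption: an abundance is a plain decimal, optionally in scientific notation.
NUMBER = re.compile(r"^[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?$")


def find_malformed_rows(lines):
    """Return (line_number, text) for rows that are not 'feature<TAB>number'."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text or text.startswith("#"):
            continue  # skip blank lines and header lines
        fields = text.split("\t")
        if len(fields) != 2 or not NUMBER.match(fields[1]):
            bad.append((number, text))
    return bad


def find_duplicate_features(lines):
    """Return (line_number, feature) for every repeat of an already-seen feature."""
    seen = set()
    duplicates = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text or text.startswith("#"):
            continue
        feature = text.split("\t")[0]
        if feature in seen:
            duplicates.append((number, feature))
        seen.add(feature)
    return duplicates


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_genefamilies.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            rows = handle.readlines()
        for number, text in find_malformed_rows(rows):
            print(f"{path}:{number}: malformed row: {text!r}")
        for number, feature in find_duplicate_features(rows):
            print(f"{path}:{number}: duplicate feature: {feature}")
```

Running it over every genefamilies file in the output directory should show whether the problem is limited to this one sample. It only reports suspicious rows and does not modify the files.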