Some questions about HUMAnN 3 output

First of all, I should mention that I am a beginner in bioinformatics.

I have some questions.
First, why does the output report gene families rather than genes?
Second, what is the difference between “unclassified” entries and the others?
Lastly, after running the rename step, I got values like “UniRef90_A0A011NN47: transcriptional_regulation_protein_RstA”. Is there no way to map these to a short gene name like “RstA”?

Thank you in advance.

Good questions!

  1. “Gene” typically refers to a specific nucleotide sequence in a particular organism. A gene family encompasses a group of very similar sequences across organisms that likely perform the same (or highly similar) functions.

  2. “unclassified” means that the reads were mapped at the translated search stage, so while we are making a guess about the gene family they derive from, we are not assigning a putative source species / taxonomy.

  3. For gene family names, we just carry forward whatever is assigned in UniProt to the family’s representative sequence (e.g. “XYZ” in “UniRef90_XYZ”). As in this case, these may be more verbose than necessary for casual presentation.
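If it helps to see how such a label breaks apart, here is a minimal Python sketch. It assumes renamed labels look like “UniRef90_XYZ: name” and that stratified rows append the source after a “|” (for example “|unclassified”). The helpers `parse_feature` and `guess_short_symbol` are hypothetical names made up for this illustration, not part of HUMAnN, and the short-symbol guess is only a pattern-matching heuristic on the UniProt name, so it will return nothing for many families.

```python
import re
from typing import NamedTuple, Optional


class Feature(NamedTuple):
    family_id: str           # e.g. "UniRef90_A0A011NN47"
    name: Optional[str]      # name carried over from UniProt, if the table was renamed
    taxon: Optional[str]     # source after "|", e.g. "unclassified"; None for unstratified rows


def parse_feature(label: str) -> Feature:
    """Split one feature label into family ID, name, and taxon (assumed layout)."""
    # Stratified rows are assumed to carry a "|taxon" suffix.
    body, _, taxon = label.partition("|")
    # Renamed tables are assumed to use "ID: name".
    family_id, sep, name = body.partition(": ")
    return Feature(
        family_id.strip(),
        name.strip() if sep else None,
        taxon.strip() or None,
    )


# Heuristic only: three letters followed by a capital letter or digit, like "RstA".
_SYMBOL = re.compile(r"^[A-Za-z]{3}[A-Z0-9]$")


def guess_short_symbol(name: Optional[str]) -> Optional[str]:
    """Return the last word of the name if it looks like a gene symbol, else None."""
    if not name:
        return None
    last = re.split(r"[_\s]+", name.strip())[-1]
    return last if _SYMBOL.match(last) else None


feature = parse_feature(
    "UniRef90_A0A011NN47: transcriptional_regulation_protein_RstA|unclassified"
)
print(feature.family_id, feature.taxon, guess_short_symbol(feature.name))
```

A guess like this is fine for labeling a figure, but the UniRef90 ID is still the thing to keep as the real identifier, since the name is just whatever UniProt assigned to the representative sequence.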

