Hi there,
I am running HUMAnN on a number of FASTQ files from a public database. Each sample has three files: R1, R2 and R3. For most samples, I concatenate R1, R2 and R3 into a single file (R4) and run HUMAnN on R4. For some samples, however, R1 and R2 are so large that HUMAnN runs longer than the maximum job duration my HPC permissions allow. Would the output still make sense if I split R1 and R2 into two equal halves each (R1A and R1B, R2A and R2B), ran HUMAnN on each piece separately, and then combined the output files? Because the output is in RPKs, I imagine combining the genefamilies tables should not be much of an issue, but I am less sure about the other two outputs (pathway abundance and pathway coverage).
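To make the genefamilies part concrete, here is a minimal sketch of what I have in mind. It assumes each per-piece table is a two-column, tab-separated file with a header line starting with `#`, and that simply adding the RPK values per gene family is a valid way to combine pieces, which is exactly the point I would like confirmed. The function name `sum_rpk_tables` and the file names in the usage note are hypothetical, not part of HUMAnN.

```python
from collections import defaultdict


def sum_rpk_tables(paths):
    """Add up per-feature RPK values across several two-column TSV tables.

    Assumption: every table has one feature name and one numeric value per
    row, plus a header line starting with '#'. Only the first header seen
    is kept.
    """
    totals = defaultdict(float)
    header = None
    for path in paths:
        with open(path) as handle:
            for line in handle:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                if line.startswith("#"):
                    if header is None:
                        header = line
                    continue
                feature, value = line.split("\t")[:2]
                totals[feature] += float(value)
    return header, dict(totals)


def write_rpk_table(path, header, totals):
    """Write the summed table back out in the same two-column layout."""
    with open(path, "w") as handle:
        if header is not None:
            handle.write(header + "\n")
        for feature in sorted(totals):
            handle.write(f"{feature}\t{totals[feature]}\n")
```

I would call it as `header, totals = sum_rpk_tables(["R1A_genefamilies.tsv", "R1B_genefamilies.tsv"])` and then write the result with `write_rpk_table`. I would not apply the same summing to the other two outputs without knowing whether they are additive.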
Thanks for your help in advance!