MGX and MTX data products

Hi,
I am interested in working with the HUMAnN2 output from the metagenomics (MGX) and metatranscriptomics (MTX) IBDMDB data, specifically the gene family quantification.
To download a large number of these outputs, I’m using the FTP server (ftp.broadinstitute.org).
However, I’m confused about where to look for samples. Within the MGX subdirectory, one finds

2017-08-12
2017-12-14
2018-05-04

Each of these contains humann2 output files. There is a great deal of overlap between the sample names, although the contents of the files are not the same. What is the difference between these directories?
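
A minimal sketch of how the overlap between the date directories could be tabulated, assuming the file listing of each directory has already been fetched (for example with `ftplib.FTP.nlst`) and that the archives follow the `<sample>_humann2.tar.bz2` naming seen on the server. The function names and the second sample ID in the example are hypothetical, for illustration only.

```python
from collections import defaultdict

SUFFIX = "_humann2.tar.bz2"  # archive naming as it appears on the FTP server


def sample_id(filename):
    """Return the sample ID for an archive name, or None if it does not match."""
    name = filename.rsplit("/", 1)[-1]  # drop any leading path
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)]
    return None


def samples_by_directory(listings):
    """Map each sample ID to the sorted list of directories that contain it.

    `listings` is a dict of {directory name: list of file names}.
    """
    seen = defaultdict(set)
    for directory, filenames in listings.items():
        for filename in filenames:
            sid = sample_id(filename)
            if sid is not None:
                seen[sid].add(directory)
    return {sid: sorted(dirs) for sid, dirs in seen.items()}


def latest_directory(dirs):
    """Pick the most recent directory; names start with ISO dates, so string order works."""
    return max(dirs)


if __name__ == "__main__":
    # Illustrative listings only; "EXAMPLE01" is a made-up sample ID.
    listings = {
        "2017-08-23/func_profiles": ["CSM5FZ4M_humann2.tar.bz2"],
        "2017-12-14": ["CSM5FZ4M_humann2.tar.bz2", "EXAMPLE01_humann2.tar.bz2"],
    }
    for sid, dirs in sorted(samples_by_directory(listings).items()):
        print(sid, dirs, "latest:", latest_directory(dirs))
```

Samples that map to more than one directory are the ones where a choice between processing runs has to be made.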
Thanks
Greg

Hi @GregP,

The directories below correspond to the different dates on which the samples were processed through HUMAnN2.

2017-08-12
2017-12-14
2018-05-04

Regards,
Sagun

Hi Sagun,

So, in cases where the same sample appears in multiple folders, which do you recommend I use? For example, in the MTX data I find a sample CSM5FZ4M in both 2017-08-23/func_profiles and 2017-12-14.

Thanks
Greg

Hi @GregP,

I would recommend using the samples from the most recent date directory (2017-12-14 rather than 2017-08-23/func_profiles in this case), as they were processed with the most up-to-date software version.

Regards,
Sagun

Can this really be just a question of software versions? The numbers in the output seem quite different. For example, the gene families data in

MTX/2017-08-23/func_profiles/CSM5FZ4M_humann2.tar.bz2

has numbers in the tens of thousands, while for the same genes, the file

MTX/2017-12-14/CSM5FZ4M_humann2.tar.bz2

has numbers that are far less than 1. So it seems like the normalization is different as well.
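
One way to test the normalization idea is to sum the community-level rows of each gene families table. This is a rough sketch, assuming the table has been extracted from the archive as a tab-separated file with a `#` header line and `|`-delimited per-species rows, as in standard HUMAnN2 output. HUMAnN2 reports gene families in RPK by default, and `humann2_renorm_table` can rescale them to relative abundance or copies per million; whether that was done for either directory is an assumption to be checked, not something confirmed here. The function names and the toy values are hypothetical.

```python
import io


def unstratified_total(lines):
    """Sum abundances over community-level rows of a gene families table."""
    total = 0.0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip the header and blank lines
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip malformed rows
        feature, value = parts[0], parts[1]
        if "|" in feature:
            continue  # skip per-species (stratified) rows to avoid double counting
        total += float(value)
    return total


def guess_units(total, tol=0.01):
    """Rough guess at the normalization from the column total."""
    if abs(total - 1.0) <= tol:
        return "relative abundance (sums to ~1)"
    if abs(total - 1e6) <= tol * 1e6:
        return "copies per million (sums to ~1e6)"
    return "unnormalized, e.g. RPK"


if __name__ == "__main__":
    # Toy table with made-up values, not taken from the IBDMDB files.
    toy = (
        "# Gene Family\tSAMPLE_Abundance\n"
        "UNMAPPED\t0.25\n"
        "UniRef90_A\t0.75\n"
        "UniRef90_A|g__X.s__Y\t0.75\n"
    )
    total = unstratified_total(io.StringIO(toy))
    print(total, guess_units(total))
```

If the older file totals far more than 1 while the newer one totals roughly 1 (or roughly 1e6), the difference would be consistent with a renormalization step rather than a software version change alone.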

Best regards

Greg