Unfortunately, after looking at the HMP2 metadata file, I still could not figure out how the raw file names map to the entries in the metadata table. For example, there are these raw files:
"161014_pool0802C-7.raw 161014_SM-6EFPH_534.raw 161014_SM-7I9I9_611.raw 161014_SM-7MCW3_571.raw 161014_SM-9JGD6_552.raw 161014_SM-ARGGH_608.raw 161014_SM-AZAHS_584.raw 161014_SM-CHS7O_541.raw
161014_pool0802C-8.raw 161014_SM-77FYM_616.raw 161014_SM-7L41Y_614.raw 161014_SM-7PAR7_568.raw 161014_SM-9WOHR_572.raw 161014_SM-AVAFD_585.raw 161014_SM-BYMCQ_589.raw
161014_SM-6CAJG_529.raw 161014_SM-7CS3Y_593.raw 161014_SM-7M8TF_602.raw 161014_SM-7T2LO_557.raw 161014_SM-A77X7_538.raw 161014_SM-AXQRR_588.raw 161014_SM-C1MZD_545.raw "
In the metadata file, the only column I can find that is somewhat related is the “PDO Number” column, where some rows have the value 161014. However, I still cannot figure out how to relate the individual raw files to entries in the metadata table. For reference, I am attaching the metadata subtable I extracted, which covers only the proteomics datasets.
Furthermore, the paper mentions that 447 stool samples were analyzed by metaproteomics, while the metadata table has 451 entries and I downloaded 641 raw files from the FTP site. I am really confused by this. I’m guessing some of the samples were fractionated, so their files need to be combined later for analysis. Could you also tell me which files (by file name) are fractions of the same sample?
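
To make the question concrete, here is a minimal sketch of how the file names could be split into parts and grouped. It assumes (this is only a guess from the names above, not something confirmed by the metadata) that each name has the form `<6-digit prefix>_<token>_<number>.raw` or `<6-digit prefix>_<token>-<number>.raw`, and that files sharing the same middle token might belong together. The names `parse_raw_name` and `group_by_token` are hypothetical helpers written for illustration.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred only from the example file names:
#   <6-digit prefix>_<token>_<number>.raw   e.g. 161014_SM-6EFPH_534.raw
#   <6-digit prefix>_<token>-<number>.raw   e.g. 161014_pool0802C-7.raw
RAW_NAME = re.compile(
    r"^(?P<prefix>\d{6})_(?P<token>.+?)[_-](?P<number>\d+)\.raw$"
)


def parse_raw_name(name):
    """Split a raw file name into (prefix, token, number), or return None."""
    match = RAW_NAME.match(name)
    if match is None:
        return None
    return match.group("prefix"), match.group("token"), int(match.group("number"))


def group_by_token(names):
    """Group file names by their middle token (a guessed sample identifier)."""
    groups = defaultdict(list)
    for name in names:
        parsed = parse_raw_name(name)
        if parsed is not None:
            groups[parsed[1]].append(name)
    return dict(groups)


if __name__ == "__main__":
    # A few of the file names listed above.
    example_names = [
        "161014_pool0802C-7.raw",
        "161014_pool0802C-8.raw",
        "161014_SM-6EFPH_534.raw",
        "161014_SM-77FYM_616.raw",
    ]
    for token, files in group_by_token(example_names).items():
        print(token, len(files), files)
```

Tokens that end up with more than one file would be the candidates for fractions or repeated runs of one sample, but whether the token (for example `SM-6EFPH`) matches any column in the metadata table is exactly what I cannot determine.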