PCA loadings in publication

Hi, sorry for asking a very basic question about one of the figures in your paper (Abubucker S, Segata N, Goll J, Schubert AM, Izard J, et al. (2012) Metabolic Reconstruction for Metagenomic Data and Its Application to the Human Microbiome. PLoS Comput Biol 8(6): e1002358. doi:10.1371/journal.pcbi.1002358). The PCA of niche-specific modules (Figure 4) is slightly confusing. It looks like the principal loadings were generated for niches (individuals). Loadings, as I understand them, are for features (modules). Am I missing something? Can you share the input data and the code that generated this graph?


Apologies it took us so long to get back to you on this one! There aren’t very many people left in the lab from the original study, and it took a while for the question to make its way through to me.

In general, PCA (or any other ordination) loadings can be generated for either features (rows) or samples (columns) in a matrix by transposing it before analysis. So while many ordinations focus on groups of similar samples, this figure looks at the transpose, i.e. groups of similar features (metabolic modules). They’re shown averaged by niche (body site), since that’s the main driver of variation across the body-wide microbiome.
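To make the transpose point concrete, here is a minimal sketch in Python/NumPy. It is not the code from the paper (that is in R), and the `pca` helper, the random `abundance` table, and its 6 x 4 size are all made up for illustration. It only shows that whichever axis you put on the rows gets the scores (the plotted points), and the other axis gets the loadings.

```python
import numpy as np

def pca(data, n_components=2):
    """Illustrative PCA via SVD (hypothetical helper, not from the paper).

    Rows of `data` are the items being ordinated (they get scores);
    columns are the variables (they get loadings).
    """
    centered = data - data.mean(axis=0)  # center each column
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # one point per row
    loadings = vt[:n_components].T                   # one vector per column
    return scores, loadings

rng = np.random.default_rng(0)
# Made-up abundance table: 6 modules (rows) x 4 samples (columns).
abundance = rng.random((6, 4))

# Usual orientation: samples are the points, modules get the loadings.
sample_scores, module_loadings = pca(abundance.T)

# Transposed orientation: modules are the points, samples get the loadings.
module_scores, sample_loadings = pca(abundance)

print(sample_scores.shape, module_loadings.shape)  # (4, 2) (6, 2)
print(module_scores.shape, sample_loadings.shape)  # (6, 2) (4, 2)
```

Note that the two runs center along different axes, so they are related but not exact mirror images of each other.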

If you want to dig in a bit more, the code is quite old and not fantastically organized, but still available in the pca*.R files of:

Many thanks -

Thank you so much! That answers my question, and thank you for sharing the R scripts.