Inputting hierarchy-split counts

Hi Biobakery folks, happy new year!

We usually use either species- or genus-level counts as input for MaAsLin.

taxonomy          sample1  sample2
...
familyA;genusB    10       15
familyA;genusC    5        1

Do you have any recommendations on the validity and feasibility of inputting hierarchy-split counts, similar to how the data is formatted for LEfSe?

taxonomy          sample1  sample2
...
familyA           15       16
familyA;genusB    10       15
familyA;genusC    5        1
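
For anyone wanting to reproduce this layout, here is a minimal sketch of how a hierarchy-split table could be built from leaf-level counts. The function name `hierarchy_split` and the dict-based input are hypothetical choices for illustration, not part of MaAsLin or LEfSe, and the sketch assumes `;`-delimited lineages in which no taxon appears both as a leaf and as an internal node.

```python
from collections import defaultdict


def hierarchy_split(counts):
    """Add a summed row for every ancestor of each leaf taxon.

    counts: dict mapping a ';'-delimited lineage (e.g. 'familyA;genusB')
            to a list of per-sample counts.
    Returns a dict with the original leaves plus all internal nodes.
    """
    n_samples = len(next(iter(counts.values())))
    out = defaultdict(lambda: [0] * n_samples)
    for lineage, values in counts.items():
        parts = lineage.split(";")
        # Each prefix of the lineage (family, family;genus, ...) receives
        # the leaf's counts, so internal nodes end up as sums of their leaves.
        for depth in range(1, len(parts) + 1):
            node = ";".join(parts[:depth])
            for i, value in enumerate(values):
                out[node][i] += value
    return dict(out)


# The genus-level example from this post.
genus_counts = {
    "familyA;genusB": [10, 15],
    "familyA;genusC": [5, 1],
}

split = hierarchy_split(genus_counts)
for taxon in sorted(split):
    print(taxon, *split[taxon], sep="\t")
```

With the two genera above, the `familyA` row is the per-sample sum of its genera, which is exactly why the rows are correlated by construction.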

People in our lab want the modeling flexibility of MaAsLin but crave the pretty enrichment cladograms from LEfSe. How would the multiple-testing correction need to change? Is it even possible, given the correlated nature of the hierarchy?

Thanks in advance!

I don’t think there’s any straightforward way to do this with MaAsLin, and I doubt that any single transformation of the significance values could account for the multiplicity of genera within families, phylogenetic structure, etc.

The first thing I’d try for something like this is setting up a PGLMM with internal nodes (anpan only uses terminal nodes), but implementing that might be difficult. I’ve also heard of a method called treeSeg that might work for this, but I’ve never tried it. If you settle on anything, let us know!

Hi @nickp60,

Just to add to what Andrew said: for a more qualitative look at the data (e.g. to make a pretty cladogram), I have run MaAsLin separately at each taxonomic level and used those results to annotate the higher-order levels. I apply a BH correction across the entire set post hoc, but otherwise leave the levels independent. One caveat: if a genus (or other higher-level taxon) has only one species in the dataset, I filter it out, since it only replicates what we observed at the species level.

Again, this was mainly done for visualization. As Andrew notes, testing hierarchical data is a bit more complicated than what I present in these models.
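
As a rough illustration of the pooling step described above, the sketch below drops higher-level taxa that have only one child in the result set and then applies a Benjamini-Hochberg adjustment across everything that remains. It is not MaAsLin code: the function names, the `;`-delimited taxon names, the one-p-value-per-taxon layout, and the p-values themselves are all made up for the example. It also does nothing about the correlation between parent and child taxa, so it is only suitable for visualization.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def drop_single_child_parents(results):
    """Remove internal nodes that have exactly one direct child in `results`."""
    n_children = {}
    for taxon in results:
        if ";" in taxon:
            parent = taxon.rsplit(";", 1)[0]
            n_children[parent] = n_children.get(parent, 0) + 1
    # Leaves have zero children and are kept; parents need two or more.
    return {t: p for t, p in results.items() if n_children.get(t, 0) != 1}


# Hypothetical p-values pooled from separate per-level runs.
pooled = {
    "familyA": 0.01,
    "familyA;genusB": 0.02,
    "familyA;genusC": 0.30,
    "familyD": 0.04,         # only one genus below it, so it is dropped
    "familyD;genusE": 0.04,
}

kept = drop_single_child_parents(pooled)
taxa = list(kept)
qvalues = dict(zip(taxa, bh_adjust([kept[t] for t in taxa])))
for taxon in taxa:
    print(f"{taxon}\tp={kept[taxon]:.3g}\tq={qvalues[taxon]:.3g}")
```

In a real analysis there would be one p-value per feature per covariate, so the pooling would need to be done per covariate (or across all of them, depending on the correction you want).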

I hope this helps,
Kelsey

Another option might be to look at the structSSI paper from Susan Holmes. It looks like the associated R package has been removed from CRAN but tutorials using older versions of the package seem to be available.

Just my two cents,
Himel

Thanks @andrewGhazi @Kelsey_Thompson @himel.mallick for your input! This is quite helpful. I will give this a go and see what we can come up with.