Can MaAsLin2 be applied on gene coverage data?

Hello everyone,

I have 18 soil metagenomes sampled from three sites (6 samples each) and soil from two layers (top and bottom) in each site, for which I have assembled. I’d like to do a comparative study of the gene abundance in these samples. What I did was that I first estimated the presence of KOs using KEGG’s kofam database on the contigs, and I also got the coverage of each KO in my samples. Next, I calculated the total coverage of each KO (i.e., by summing the coverage of the same KOs on different contigs) in every individual sample. I am wondering if I can use coverage values as input for MaAsLin2, as opposed to the number of reads mapped on each gene? Can I use site as a fixed and layer as a random variable if I were to find site-wise differences?

An example of my input data would be

sample A1-top A2-top A3-top A1-bot A2-bot A3-bot B1-top B2-top B3-top B1-bot B2-bot B3-bot C1-top C2-top C3-top C1-bot C2-bot C3-bot
K00001 36.3978 0 22.4602 74.0771 37.8024 27.4386 23.3456 41.8246 25.8649 51.336 27.8538 22.8763 26.036 49.5105 24.3947 34.8723 52.0351 33.0193
K00003 747.666 828.64 509.787 950.148 993.473 777.476 717.928 732.745 633.716 894.339 821.97 874.446 850.249 1226.15 1222.42 1011.75 1087.76 1050.81
K00004 10.3978 7.96612 0 9.25023 19.4712 7.21569 5.16952 11.9648 11.1345 29.6271 34.3835 23.7866 26.3798 34.4165 15.7022 29.2394 22.2378 16.9822
K00005 48.0351 66.329 14.0175 70.9348 12.1143 99.1202 46.5701 91.1882 61.639 65.1754 93.5285 135.657 290.069 315.314 279.189 211.366 318.911 200.309
K00009 8.69086 0 9.73114 0 19.1342 8.89908 0 0 0 0 0 0 15.148 27.9347 18.6585 22.022 26.684 31.8226
K00010 126.77 119.827 172.367 203.431 297.6 155.051 40.8878 98.7166 24.2119 63.5848 141.801 83.471 74.2136 144.451 76.1991 36.0218 72.0287 61.0815
K00012 1530.33 1601.31 1312.18 2208.2 2488.36 1669.64 992.256 1011.06 924.257 1240.32 1182.85 915.699 719.04 1180.61 1060.9 695.96 1201.49 1044.33
K00013 256.634 363.695 150.823 575.562 442.416 416.417 312.265 409.185 363.236 512.297 424.308 323.398 219.852 616.609 573.806 436.281 548.883 483.997

Thank you!

Rui

Hi @ruizzhan - based on your description, I think you can use MaAsLin 2 for your purpose. Please make sure you turn the normalization off if you have already normalized the data.

Best,
Himel