Analysis of longitudinal data

huy · February 26, 2026, 1:35pm

Hi all,

I have longitudinal HUMAnN 3 results, and my bioinformatics skills are not yet strong enough to handle them, so I would really appreciate some insight and guidance.

About the experiment: there are 2 groups, C and N. Each group had 7 mice (different mice in each group), and stool samples were collected from each mouse within 24 hrs at 7 time points, T1 to T7. That gives me 98 HUMAnN results (7 mice x 7 time points in each of groups C and N).

First, I want to check whether, within group C or N, any pathway is differentially abundant between time points. For example, I would set time point T1 as the reference, and I assume this will compare pathway A at T1 against T2, T3, …, T7?
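As a rough illustration of what a reference level does in a generic linear model (this is base R, not MaAsLin 3 itself, and the toy metadata below is hypothetical), treatment coding with T1 as the reference produces one coefficient per non-reference time point, each a contrast against T1:

```r
# Toy metadata: 2 hypothetical mice x 3 time points (names are illustrative only).
meta <- data.frame(
    MouseID   = rep(c("m1", "m2"), each = 3),
    Timepoint = rep(c("T1", "T2", "T3"), times = 2),
    stringsAsFactors = FALSE
)

# Make T1 the reference level of the factor.
meta$Timepoint <- relevel(factor(meta$Timepoint), ref = "T1")

# Treatment (dummy) coding: an intercept for T1 plus one column per other level.
design <- model.matrix(~ Timepoint, data = meta)
print(colnames(design))
```

Under this coding, T2 and T3 are each compared against T1, but T2 is not compared directly against T3.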

The input_data is the unstratified, CPM-normalized HUMAnN result from group N only. I set fixed_effects, reference, and random_effects as in the code, but I am not sure whether they are correct for my research question.

library(maaslin3)

fit_data <- maaslin3(
    input_data        = features_N,      # unstratified CPM pathway table, group N only
    input_metadata    = meta_short,
    output            = "group_N",
    fixed_effects     = c("Timepoint"),
    reference         = "Timepoint,T1",  # compare each time point against T1
    random_effects    = c("MouseID"),    # repeated measures per mouse
    normalization     = "NONE",          # Since I already did the humann_renorm_table?
    transform         = "LOG",
    max_significance  = 0.05,
    plot_associations = TRUE
)

Is there anything I should change in my current code? Are there any other tools/packages you would recommend besides MaAsLin3 in case it does not work with my data (though I believe it should)?

Many thanks in advance,
Huy

WillNickols · February 26, 2026, 2:02pm

Yes - this all looks correct for your question. This will give you each time point compared against the baseline. Since you’re using an unstratified table, your normalized values should sum to 1 (proportions of a whole) within each sample. If this is already the case, setting normalization to none should be fine.
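A quick way to check this before running the model might look like the sketch below. The toy table and its orientation (pathways in rows, samples in columns) are assumptions for illustration; a CPM table would be expected to total about 1e6 per sample rather than 1.

```r
# Hypothetical toy table: pathways in rows, samples in columns, values in CPM.
features <- matrix(
    c(600000, 300000, 100000,
      250000, 250000, 500000),
    nrow = 3,
    dimnames = list(c("PWY_A", "PWY_B", "PWY_C"), c("S1", "S2"))
)

# Per-sample totals: about 1e6 for CPM, about 1 for relative abundance.
totals <- colSums(features)
print(totals)

# Rescale each sample so that it sums to 1.
proportions <- sweep(features, 2, totals, "/")
print(colSums(proportions))
```

If the totals come out near 1e6, the table can be rescaled as above, or re-exported from humann_renorm_table with relative-abundance units.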

Will

huy · February 27, 2026, 3:36pm

Hi Will,

Thank you for your answer. I hope you don’t mind looking at some of my results to make sure I am interpreting them correctly; I also think something is missing from the results.

In the heatmap and dot plot, the prevalence value is missing for many pathways. What could explain this?

[Image: summary_plot]
In the dot plot, are the vertical lines (abundance and prevalence) the “baselines”? For example, in the dot plot for Timepoint 20, if a circle or a triangle falls to the left of the line for a pathway, does that mean the pathway has a lower abundance/prevalence at Timepoint 20 than at the “reference” set in the code? And does the color (dark violet/dark green) then mean that the decrease is significant?
I have a hard time understanding the heatmap: what does the beta coefficient value tell me in this case? For example, if the color for a certain pathway at time point 24 is blue, does that mean the pathway has a lower abundance/prevalence at time point 24 than at the reference?
Running the code above gives me a folder at association_plots/Timepoint/linear containing 5 PNG pictures, but each picture is an empty plot. What could explain this?

[Image: Timepoint_CALVIN.PWY..Calvin.Benson.Bassham.cycle_linear]

So far, those are all my questions about the results. I hope you can help me clear things up.

Best, Huy

WillNickols · February 27, 2026, 5:08pm

Checking the full results file would probably be more useful, but likely the issue is that these pathways are never absent, so there’s no sense in fitting a presence/absence model to them.
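One way to check this directly is to compute per-pathway prevalence from the input table. The sketch below uses a hypothetical toy table (pathways in rows, samples in columns); the names are illustrative only.

```r
# Hypothetical toy table: pathways in rows, samples in columns.
features <- matrix(
    c(10, 5, 0,
      12, 0, 0,
       8, 7, 3),
    nrow = 3,
    dimnames = list(c("PWY_A", "PWY_B", "PWY_C"), c("S1", "S2", "S3"))
)

# Fraction of samples in which each pathway is non-zero.
prevalence <- rowMeans(features > 0)
print(prevalence)

# Pathways present in every sample carry no presence/absence signal to model.
always_present <- names(prevalence)[prevalence == 1]
print(always_present)
```

If most of the pathways lacking a prevalence value show up in `always_present` for the real table, that would be consistent with this explanation.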
The vertical lines are the null hypothesis (top of the legend), which is the median of the coefficients. This is the value the coefficients are compared against to determine significance.
The coefficients are the same as in the output table and show, in your case, the increase or decrease in abundance/prevalence relative to whichever time point you set as the baseline.
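To make the coefficients more tangible, they can be back-transformed. The sketch below rests on two assumptions that should be checked against the MaAsLin 3 documentation: that abundance coefficients under transform = "LOG" are on a log2 scale, and that prevalence coefficients are log-odds. The values are made up for illustration.

```r
# Hypothetical coefficients for one pathway at one time point vs. the baseline.
abundance_coef  <- -1.0   # ASSUMED to be on a log2 scale
prevalence_coef <- 0.7    # ASSUMED to be on a natural-log odds scale

# Negative abundance coefficient: lower abundance than at the baseline.
fold_change <- 2^abundance_coef
print(fold_change)

# Positive prevalence coefficient: higher odds of presence than at the baseline.
odds_ratio <- exp(prevalence_coef)
print(odds_ratio)
```

Under these assumptions, a coefficient of -1 corresponds to half the baseline abundance, and the sign alone gives the direction of change.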
I can’t tell what the issue with the PNGs is just from looking at them. If you email me a chunk of the data and code that reproduces this at willnickols@g.harvard.edu, I can check what’s going on.

Will
