The bioBakery help forum

Fixed effects no association plot in maaslin2

Hi,

I am using MaAsLin2 in R to test associations between bacterial absolute counts and a number of other features. I can only run MaAsLin2 with up to 3 fixed effects; it does not calculate more than that. When fixed_effects = NULL, no associations are found. However, I know there are associations for some features, confirmed when I run one or up to 3 fixed effects, but when they are all combined no heatmap is produced. I just don't understand why the program does not omit the non-associated effects and still create the heatmap. I can understand that when all features are combined there may be no association, but why can I only run up to 3? The tutorial uses more than 3. Here is my code:

fit_data = Maaslin2(
    input_data = df_input_data,
    input_metadata = df_input_metadata,
    output = "output_noRM",
    fixed_effects = NULL, random_effects = NULL,
    normalization = "NONE", transform = "LOG", standardize = TRUE,
    plot_heatmap = TRUE, plot_scatter = TRUE, heatmap_first_n = 50)

I tried every other possible option for normalization, transform, and standardize.

Hi @eo24a,

I’m not sure I understand your question. There are two ways to tell MaAsLin which values you want as fixed effects:

  1. Leave the call for fixed effects blank (do not even put fixed_effects = XYZ) and it will run every column in your metadata as a fixed effect. OR
  2. Provide a list (e.g. fixed_effects = c("effect1", "effect2", "effect3", "effect4", "effect5")) of the column names that you want MaAsLin to consider as fixed effects.

MaAsLin can certainly handle more than three fixed effects. Have you tried running MaAsLin in either of these two ways? Additionally, adding correlated variables to the model can at times cause a loss of power, so that you no longer detect significant associations; this could also be what you are observing here. If that is the case, you would want to ask yourself what you are adding to the model with the additional regressors. Is that additional variable something that you need to adjust for? Are you adding something that is completely confounded? And so on.
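As a rough illustration of the "completely confounded" case, here is a minimal base R sketch. The numbers are made up for illustration and are not anyone's real data, and the column names are only placeholders. It assumes the model is an ordinary linear model fitted the way base R lm() does it; whether MaAsLin's LM method behaves identically is an assumption this sketch does not verify. The idea is to compare the rank of the design matrix with the number of samples, and to see which term lm() cannot estimate when one variable is an exact linear combination of the others.

```r
# Toy metadata: 5 samples, invented values (NOT real study data).
meta <- data.frame(
  Fat           = c(30, 35, 40, 45, 32),
  Carbohydrate  = c(50, 40, 45, 35, 48),
  Dietary.Fiber = c(5, 10, 5, 10, 8)
)
# Protein is constructed so the four columns always sum to 100,
# i.e. it is completely determined by the other three plus the intercept.
meta$Protein <- 100 - meta$Fat - meta$Carbohydrate - meta$Dietary.Fiber

# Invented log-scale abundances for a single feature.
log_abundance <- c(2.1, 3.4, 2.9, 4.0, 2.5)

# Design matrix: intercept + 4 fixed effects = 5 columns for 5 samples.
X <- model.matrix(~ Fat + Carbohydrate + Dietary.Fiber + Protein, data = meta)
design_rank <- qr(X)$rank
cat("samples:", nrow(X), " columns:", ncol(X), " rank:", design_rank, "\n")

# lm() reports NA for a term it cannot separate from the ones listed before it,
# so which term is lost depends on the order of the formula.
fit_a <- lm(log_abundance ~ Fat + Carbohydrate + Dietary.Fiber + Protein, data = meta)
fit_b <- lm(log_abundance ~ Fat + Protein + Carbohydrate + Dietary.Fiber, data = meta)
print(coef(fit_a))  # Protein is NA
print(coef(fit_b))  # Dietary.Fiber is NA

# Residual degrees of freedom left for estimating uncertainty.
cat("residual df:", df.residual(fit_a), "\n")
```

If the rank is smaller than the number of columns, at least one variable carries no information beyond the others, and with very few samples there may be almost no residual degrees of freedom left for p-values. Running the same check on your own metadata columns would show whether this applies to your data.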

If you are still stuck, can you provide a minimally reproducible example so that I can replicate what you are describing?

Best,
Kelsey

Hi Kelsey,

Thank you for your response! I did try your suggestion 1. The output I am getting is that “There are no associations to plot”

Here is the output for that
2021-11-15 12:05:37 INFO::Formula for fixed effects: expr ~ High.Oleic.Safflower.Oil + Coconut.Oil + High.Oleic.Sunflower.Oil + Palm.Oil + M…Alpina.Oil + Soy.Lecithin + Saturated + Monounsaturated + Polyunsaturated + Medium.Chain.Triglycerides + Linoleic.acid + Linolenic.acid + Dietary.Fiber + Lactose + Short.Chain.Fructooligosaccharide + Fucosyllactose + Corn.Syrup.Solids + Chicory.Root.Inulin + High.Amylose.Corn.Starch + Gum.Arabic + Microcrystaline.Cellulose + Soy.Fiber + Maltodextrin + Fat + Carbohydrate + Protein
2021-11-15 12:05:37 INFO::Running selected normalization method: NONE
2021-11-15 12:05:37 INFO::Filter data based on min abundance and min prevalence
2021-11-15 12:05:37 INFO::Total samples in data: 5
2021-11-15 12:05:37 INFO::Min samples required with min abundance for a feature not to be filtered: 0.500000
2021-11-15 12:05:37 INFO::Total filtered features: 0
2021-11-15 12:05:37 INFO::Filtered feature names:
2021-11-15 12:05:37 INFO::Applying z-score to standardize continuous metadata
2021-11-15 12:05:37 INFO::Running selected transform method: LOG
2021-11-15 12:05:37 INFO::Running selected analysis method: LM
| | 0 % ~calculating 2021-11-15 12:05:37 INFO::Fitting model to feature number 1, B. infantis
|++++++ | 11% ~01s 2021-11-15 12:05:37 INFO::Fitting model to feature number 2, B. longum
|++++++++++++ | 22% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 3, B. breve
|+++++++++++++++++ | 33% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 4, B. vulgatus
|+++++++++++++++++++++++ | 44% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 5, C. perfringens
|++++++++++++++++++++++++++++ | 56% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 6, E. coli
|++++++++++++++++++++++++++++++++++ | 67% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 7, E. faecalis
|+++++++++++++++++++++++++++++++++++++++ | 78% ~00s 2021-11-15 12:05:37 INFO::Fitting model to feature number 8, K. pneumoniae
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s 2021-11-15 12:05:38 INFO::Fitting model to feature number 9, B. fragilis
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
2021-11-15 12:05:38 INFO::Counting total values for each feature
2021-11-15 12:05:38 WARNING::Deleting existing residuals file: output_nondigested_ingredient_noRM/residuals.rds
2021-11-15 12:05:38 INFO::Writing residuals to file output_nondigested_ingredient_noRM/residuals.rds
2021-11-15 12:05:38 INFO::Writing all results to file (ordered by increasing q-values): output_nondigested_ingredient_noRM/all_results.tsv
2021-11-15 12:05:38 INFO::Writing the significant results (those which are less than or equal to the threshold of 0.250000 ) to file (ordered by increasing q-values): output_nondigested_ingredient_noRM/significant_results.tsv
2021-11-15 12:05:38 INFO::Writing heatmap of significant results to file: output_nondigested_ingredient_noRM/heatmap.pdf
[1] "There are no associations to plot!"
2021-11-15 12:05:38 INFO::Writing association plots (one for each significant association) to output folder: output_nondigested_ingredient_noRM
[1] "There are no associations to plot!"

For the 2nd suggestion: I want to compare 4 factors. Here is my command:

fit_data = Maaslin2(
    input_data = df_input_data,
    input_metadata = df_input_metadata,
    output = "output_nondigested_ingredient_noRM",
    fixed_effects = c("Fat", "Carbohydrate", "Dietary.Fiber", "Protein"),
    random_effects = NULL,
    normalization = "NONE", transform = "LOG", standardize = TRUE,
    plot_heatmap = TRUE, plot_scatter = TRUE, heatmap_first_n = 10)

In the output:
2021-11-15 12:07:04 INFO::Formula for fixed effects: expr ~ Fat + Carbohydrate + Dietary.Fiber + Protein
2021-11-15 12:07:04 INFO::Running selected normalization method: NONE
2021-11-15 12:07:04 INFO::Filter data based on min abundance and min prevalence
2021-11-15 12:07:04 INFO::Total samples in data: 5
2021-11-15 12:07:04 INFO::Min samples required with min abundance for a feature not to be filtered: 0.500000
2021-11-15 12:07:04 INFO::Total filtered features: 0
2021-11-15 12:07:04 INFO::Filtered feature names:
2021-11-15 12:07:04 INFO::Applying z-score to standardize continuous metadata
2021-11-15 12:07:04 INFO::Running selected transform method: LOG
2021-11-15 12:07:04 INFO::Running selected analysis method: LM
| | 0 % ~calculating 2021-11-15 12:07:04 INFO::Fitting model to feature number 1, B. infantis
|++++++ | 11% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 2, B. longum
|++++++++++++ | 22% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 3, B. breve
|+++++++++++++++++ | 33% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 4, B. vulgatus
|+++++++++++++++++++++++ | 44% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 5, C. perfringens
|++++++++++++++++++++++++++++ | 56% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 6, E. coli
|++++++++++++++++++++++++++++++++++ | 67% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 7, E. faecalis
|+++++++++++++++++++++++++++++++++++++++ | 78% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 8, K. pneumoniae
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s 2021-11-15 12:07:04 INFO::Fitting model to feature number 9, B. fragilis
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
2021-11-15 12:07:04 INFO::Counting total values for each feature
2021-11-15 12:07:04 WARNING::Deleting existing residuals file: output_nondigested_ingredient_noRM/residuals.rds
2021-11-15 12:07:04 INFO::Writing residuals to file output_nondigested_ingredient_noRM/residuals.rds
2021-11-15 12:07:04 INFO::Writing all results to file (ordered by increasing q-values): output_nondigested_ingredient_noRM/all_results.tsv
2021-11-15 12:07:04 INFO::Writing the significant results (those which are less than or equal to the threshold of 0.250000 ) to file (ordered by increasing q-values): output_nondigested_ingredient_noRM/significant_results.tsv
2021-11-15 12:07:04 INFO::Writing heatmap of significant results to file: output_nondigested_ingredient_noRM/heatmap.pdf
2021-11-15 12:07:05 INFO::Writing association plots (one for each significant association) to output folder: output_nondigested_ingredient_noRM
2021-11-15 12:07:05 INFO::Plotting associations from most to least significant, grouped by metadata
2021-11-15 12:07:05 INFO::Plotting data for metadata number 1, Dietary.Fiber
2021-11-15 12:07:05 INFO::Creating scatter plot for continuous data, Dietary.Fiber vs C. perfringens
2021-11-15 12:07:06 INFO::Creating scatter plot for continuous data, Dietary.Fiber vs B. vulgatus
2021-11-15 12:07:06 INFO::Creating scatter plot for continuous data, Dietary.Fiber vs K. pneumoniae
2021-11-15 12:07:06 INFO::Plotting data for metadata number 2, Fat
2021-11-15 12:07:06 INFO::Creating scatter plot for continuous data, Fat vs C. perfringens
2021-11-15 12:07:06 INFO::Creating scatter plot for continuous data, Fat vs B. longum
2021-11-15 12:07:06 INFO::Creating scatter plot for continuous data, Fat vs B. breve
2021-11-15 12:07:07 INFO::Creating scatter plot for continuous data, Fat vs K. pneumoniae
2021-11-15 12:07:07 INFO::Creating scatter plot for continuous data, Fat vs B. vulgatus
2021-11-15 12:07:07 INFO::Plotting data for metadata number 3, Carbohydrate
2021-11-15 12:07:07 INFO::Creating scatter plot for continuous data, Carbohydrate vs C. perfringens
2021-11-15 12:07:07 INFO::Creating scatter plot for continuous data, Carbohydrate vs B. breve
2021-11-15 12:07:07 INFO::Creating scatter plot for continuous data, Carbohydrate vs E. faecalis
2021-11-15 12:07:08 INFO::Creating scatter plot for continuous data, Carbohydrate vs B. fragilis
2021-11-15 12:07:08 INFO::Creating scatter plot for continuous data, Carbohydrate vs B. longum
2021-11-15 12:07:08 INFO::Creating scatter plot for continuous data, Carbohydrate vs K. pneumoniae

It stops here and does not continue. It plots only Fat, Dietary.Fiber, and Carbohydrate, but not Protein. I then changed the order of the fixed effects in my command, putting Protein before Dietary.Fiber:

fit_data = Maaslin2(
    input_data = df_input_data,
    input_metadata = df_input_metadata,
    output = "output_nondigested_ingredient_noRM",
    fixed_effects = c("Fat", "Protein", "Carbohydrate", "Dietary.Fiber"),
    random_effects = NULL,
    normalization = "NONE", transform = "LOG", standardize = TRUE,
    plot_heatmap = TRUE, plot_scatter = TRUE, heatmap_first_n = 10)

Then the plots cover only Fat, Protein, and Carbohydrate:
2021-11-15 12:15:40 INFO::Writing association plots (one for each significant association) to output folder: output_nondigested_ingredient_noRM
2021-11-15 12:15:40 INFO::Plotting associations from most to least significant, grouped by metadata
2021-11-15 12:15:40 INFO::Plotting data for metadata number 1, Fat
2021-11-15 12:15:40 INFO::Creating scatter plot for continuous data, Fat vs C. perfringens
2021-11-15 12:15:40 INFO::Creating scatter plot for continuous data, Fat vs B. vulgatus
2021-11-15 12:15:40 INFO::Creating scatter plot for continuous data, Fat vs K. pneumoniae
2021-11-15 12:15:40 INFO::Plotting data for metadata number 2, Protein
2021-11-15 12:15:40 INFO::Creating scatter plot for continuous data, Protein vs C. perfringens
2021-11-15 12:15:40 INFO::Creating scatter plot for continuous data, Protein vs B. vulgatus
2021-11-15 12:15:41 INFO::Creating scatter plot for continuous data, Protein vs K. pneumoniae
2021-11-15 12:15:41 INFO::Plotting data for metadata number 3, Carbohydrate
2021-11-15 12:15:41 INFO::Creating scatter plot for continuous data, Carbohydrate vs C. perfringens
2021-11-15 12:15:41 INFO::Creating scatter plot for continuous data, Carbohydrate vs B. vulgatus
2021-11-15 12:15:41 INFO::Creating scatter plot for continuous data, Carbohydrate vs K. pneumoniae

What I am saying is that it does not run the 4th fixed effect, no matter which one it is.

I am including my data so you can try to replicate this. Am I missing something in my R command?
abundance_lefse.txt (2.9 KB)
metadata_noRM.txt (878 Bytes)