Maaslin input error

Hi, I keep getting this error while trying to use Maaslin in R:

Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument

I don’t quite understand what I’m doing wrong. I feel like it should work; the sample names are exactly the same for both folders, and I even ran colname == colname to check. I also tried transposing the dataframes and got the same error output.

Here is the code I ran too:

gfr <- read.table("all-samples/function/all-gfr.tsv", sep = '\t', header=TRUE)
colnames(gfr) <- c("Gene Families", "AT-2", "PS-1", "PS-2", "PS-3", "TN-1", "TN-2", "TN-3")
metadata <- read.csv("WPS-2_metagenome_mapping.csv")
colnames(metadata) <- c("Variables", "AT-2", "PS-1", "PS-2", "PS-3", "TN-1", "TN-2", "TN-3")
colnames(metadata) == colnames(gfr)

Maaslin2(input_data = gfr, input_metadata = metadata, output = "all-samples/maaslin")

Attached are images of the dataframe I’ve imported to R.


I also tried removing the first column header (Gene Families and Variables) and I got this error:

Error in `[.data.frame`(metadata, , effects_names, drop = FALSE) : 
  undefined columns selected

Could someone please help me identify my error and what I should do instead to analyse my data? Also, is it possible to use the other HUMAnN functional outputs, such as pathway abundances?

What I’m specifically looking for is a comparison of gene family abundances between the samples: which gene families/pathways are found across all samples, and whether there are significant differences in the abundances of specific gene families/pathways between samples.

Thank you and much appreciated.

Hi @sarahi,

Thanks for the questions. For the errors: are you setting the gene families as row.names, or are they a separate column? (E.g. row.names(gfr) = gfr$'Gene Families', then remove the gene families column.) My best guess without playing with the data is that MaAsLin is tripping over the gene families column because it assumes that everything in the df should be numeric for the input data. If that doesn’t work, can you send a reproducible example, either here or to my email, so that I can replicate what you are seeing?
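
A minimal sketch of that suggestion in base R, using a small made-up table in place of the real import (the feature IDs and abundance values are placeholders, not data from this thread). It first lists which columns are not numeric, then moves the feature IDs into the row names:

```r
# Toy stand-in for the imported table; IDs and values are made up for illustration.
gfr <- data.frame(
  "Gene Families" = c("UniRef90_A", "UniRef90_B", "UniRef90_C"),
  "AT-2" = c(10.5, 0, 3.2),
  "PS-1" = c(8.1, 1.4, 0),
  check.names = FALSE,       # keep names such as "AT-2" unchanged
  stringsAsFactors = FALSE
)

# Which columns are not numeric? Here only the feature ID column.
non_numeric <- names(gfr)[!sapply(gfr, is.numeric)]
print(non_numeric)

# Move the feature IDs into the row names, then drop the column.
row.names(gfr) <- gfr$`Gene Families`
gfr$`Gene Families` <- NULL

# Every remaining column should now be numeric.
all_numeric <- all(sapply(gfr, is.numeric))
print(all_numeric)
```

If `non_numeric` lists anything besides the ID column on the real data, those columns would need the same attention before the table is passed on.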

Yes - you can use all the HUMAnN output in MaAsLin - the interpretation of the pathway coverage is the only one that isn’t straightforward. Please see our tutorial for more examples of using MaAsLin on functional data. MaAsLin won’t tell you what is core to your samples but should be able to tell you what is differential.
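
Since MaAsLin does not report which features are shared, one rough way to list them is to count, per feature, the samples with non-zero abundance. This is only a sketch: treating "found in a sample" as abundance > 0 is an assumption, and the table below is made-up placeholder data with features already in the row names.

```r
# Toy abundance table: features in rows, samples in columns (made-up values).
abund <- data.frame(
  "AT-2" = c(10.5, 0, 3.2),
  "PS-1" = c(8.1, 1.4, 0.7),
  "TN-1" = c(4.0, 2.2, 0),
  row.names = c("PWY_A", "PWY_B", "PWY_C"),
  check.names = FALSE
)

# Number of samples in which each feature has non-zero abundance.
present_in <- rowSums(abund > 0)

# Features detected in every sample (assumed definition of "found across all samples").
core_features <- names(present_in)[present_in == ncol(abund)]
print(core_features)
```

A stricter abundance cutoff, or a requirement of most rather than all samples, could be swapped in depending on how noisy the data are.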

Best,
Kelsey

Hi Kelsey, just wanted to update you (and other readers) that this really helped, so thank you very much! Here is the code if anyone is interested in the future:

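# Read the unstratified table and name the columns
kegg.f <- read.table("krakened/all-cpm-kegg_unstratified.tsv", sep = '\t')
colnames(kegg.f) <- c("RXN", "AT-2", "PS-1", "PS-2", "PS-3", "TN-1", "TN-2", "TN-3")
# Move the feature IDs into the row names so only the sample columns remain
row.names(kegg.f) <- kegg.f$RXN
kegg.f$RXN <- NULL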
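
The same row-name idea may also apply to the metadata, which in the original post had the variables in a first column and one column per sample. The sketch below is a guess at a tidy layout, not something confirmed in this thread: the variable names and values are hypothetical, and it reshapes the table so that samples are rows and each variable is a properly typed column.

```r
# Toy metadata laid out like the original post: variables in rows, samples in columns.
# Variable names and values are hypothetical.
metadata <- data.frame(
  Variables = c("Site", "Depth_cm"),
  "AT-2" = c("AT", "5"),
  "PS-1" = c("PS", "10"),
  "TN-1" = c("TN", "15"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Move the variable names into the row names, then drop the column.
row.names(metadata) <- metadata$Variables
metadata$Variables <- NULL

# Transpose so that samples are rows; t() returns a character matrix.
metadata_t <- as.data.frame(t(metadata), stringsAsFactors = FALSE)

# Re-infer column types so numeric variables are numeric again.
metadata_t <- type.convert(metadata_t, as.is = TRUE)
str(metadata_t)
```

The row names of `metadata_t` are then the sample IDs, which can be compared against the column names of the feature table.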