Issue with generating the chemical taxonomy file in MACARRoN

I am struggling to generate the chemical taxonomy file with the “decorateID” function in MACARRoN, both in RStudio and on the command line. When I run decorateID, it runs for about 15 minutes and then fails with the error:

“Error in open.connection(x, "rb") :
Timeout was reached: [hmdb.ca] Connection timeout after 10000 ms”

However, the MACARRoN tutorial runs without issues on its smaller example dataset, and decorateID also runs without issues on a subset of my annotations file. My guess is that my computer is not powerful enough to run decorateID on my full dataset of about 11,000 untargeted metabolites. I would like to try running the command on another lab’s server for more computing power, but before I try and potentially crash their server, I would like to know roughly how much memory decorateID needs for a typical metabolomics dataset. Has anyone else run into this problem, or can anyone recommend a different way to generate the chemical taxonomy file? Thank you in advance!

Hi,
Thanks for your question.
That is indeed a problem some of us have been facing as well, and we are going to include a potential solution in the next version of MACARRoN.
The easiest fix right now is to break the annotation file into smaller parts, run decorateID on each part, and then combine the results with rbind or an equivalent. decorateID finds taxonomy information only for the annotated features, using either the HMDB or PubChem ID. May I ask how many of the 11,000 features are annotated?
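
In case it helps, here is a rough sketch of the split, run, and rbind approach in base R. It is not part of MACARRoN: run_in_chunks and its arguments (annotate_fun, chunk_size, max_tries, wait_seconds) are hypothetical names made up for illustration, and the default values are arbitrary. Because the reported error is a connection timeout, the sketch also retries a failed chunk a few times before giving up.

```r
# Hypothetical helper (not a MACARRoN function): apply an annotation function
# to a data frame in row chunks and combine the per-chunk results with rbind.
run_in_chunks <- function(annotations, annotate_fun, chunk_size = 500,
                          max_tries = 3, wait_seconds = 30) {
  stopifnot(is.data.frame(annotations), nrow(annotations) > 0,
            chunk_size >= 1, max_tries >= 1)

  # Assign consecutive rows to chunks: 1, 1, ..., 2, 2, ...
  chunk_id <- ceiling(seq_len(nrow(annotations)) / chunk_size)
  chunks <- split(annotations, chunk_id)

  results <- lapply(chunks, function(chunk) {
    for (attempt in seq_len(max_tries)) {
      # Capture an error (e.g. a connection timeout) instead of aborting.
      out <- tryCatch(annotate_fun(chunk), error = function(e) e)
      if (!inherits(out, "error")) return(out)
      message("Attempt ", attempt, " failed: ", conditionMessage(out))
      if (attempt < max_tries) Sys.sleep(wait_seconds)
    }
    stop("Chunk failed after ", max_tries, " attempts")
  })

  # Put the per-chunk tables back together.
  do.call(rbind, unname(results))
}
```

Assuming decorateID accepts a data frame of annotations and returns a data frame (please check this against the documentation of your MACARRoN version), the call would look something like run_in_chunks(annotations_df, decorateID), where annotations_df is your own annotation table. Saving each chunk's result to disk as you go would be a sensible extension, so a late failure does not cost the earlier chunks.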