I think it would be better to replace usearch (32-bit) with another tool if it is only used for sorting contigs by length. When I run the preprocessing step of MetaWIBELE, it throws an error because my FASTA file is too big (a limitation of 32-bit usearch). I had to manually change the usearch-related code and replace it with another tool.
Thanks for your feedback! That sounds like a good suggestion. I have added it to our list for future releases.
Best,
Yancong
Hello, Yancong.
I have the same problem: the usearch command fails because of the file size limit (my file is 4.8 GB, from 20 paired samples).
Do you have any idea how to solve this? Should I run the preprocessing step several times on subgroups of samples? (If so, how can I merge the preprocessed outputs into one?)
Or, as Stephen mentioned, could you recommend an alternative tool and tell me which part of the code I should change to use it?
Thank you for your help in advance.
Gihyeon.
Hi Gihyeon,
The 32-bit usearch is limited to 4 GB of memory or less, which is why it fails on large files. The licensed 64-bit usearch would work for large files.
To address these limitations, we have replaced usearch with seqkit (an open-source tool: SeqKit - Ultrafast FASTA/Q kit) in our latest version v0.4.8 (Release 0.4.8 · biobakery/metawibele · GitHub). MetaWIBELE v0.4.8 is still under testing and hasn’t been released to pip/conda/docker. Please try the new version if you are interested.
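If upgrading is not an option right away, the length-sorting step on its own can be reproduced without usearch. The following is only a minimal Python sketch for illustration, not the code MetaWIBELE uses: the function names `index_fasta` and `sort_fasta_by_length` are hypothetical, and it assumes a plain (uncompressed) FASTA file. It keeps only one small tuple per contig in memory and copies records by byte offset, so the full file is never loaded at once.

```python
def index_fasta(path):
    """Scan a FASTA file once and return (seq_length, start, end) per record.

    start/end are byte offsets of the whole record (header plus sequence lines).
    """
    records = []
    with open(path, "rb") as fh:
        start = None   # byte offset where the current record begins
        length = 0     # number of sequence characters in the current record
        offset = 0     # byte offset of the line being read
        for line in fh:
            if line.startswith(b">"):
                if start is not None:
                    records.append((length, start, offset))
                start = offset
                length = 0
            elif start is not None:
                length += len(line.strip())
            offset += len(line)
        if start is not None:
            records.append((length, start, offset))
    return records


def sort_fasta_by_length(in_path, out_path):
    """Write the records of in_path to out_path, longest sequence first."""
    records = index_fasta(in_path)
    # The sort is stable, so contigs of equal length keep their input order.
    records.sort(key=lambda rec: -rec[0])
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for _, start, end in records:
            src.seek(start)
            chunk = src.read(end - start)
            # The last record in a file may lack a trailing newline.
            if not chunk.endswith(b"\n"):
                chunk += b"\n"
            dst.write(chunk)
    return len(records)
```

Calling `sort_fasta_by_length("contigs.fasta", "contigs.sorted.fasta")` (both file names are placeholders) writes the sorted copy and returns the number of records. Where in the MetaWIBELE code the usearch call would need to be swapped out is not covered here.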
Thanks!
Yancong
Hi Yancong,
Thank you for your prompt response.
It’s good to hear that the new version is being updated!
I will try it soon.
Many thanks,
Gihyeon