Usearch in metawibele

I think it would be better to replace usearch (32-bit) with another tool if it is only used for sorting contigs by length. When I run the preprocessing step of MetaWIBELE, it throws an error because my FASTA file is too big (a limitation of 32-bit usearch). I have to manually change the usearch-related code and replace it with another tool.

Thanks for your feedback! That sounds like a good suggestion. I have added it to our list for future releases.

Best,
Yancong

Hello, Yancong.

I have the same problem: the usearch command fails because of the file size limit (my file is 4.8 GB, from 20 paired samples).
Do you have any idea how to solve this? Should I run the preprocessing step repeatedly on subgroups of samples? (If so, how can I merge the preprocessed outputs into one?)
Or, as Stephen mentioned, could you recommend an alternative tool and tell me which part of the code I should change to use it?

Thank you for your help in advance.
Gihyeon.

Hi Gihyeon,

The 32-bit usearch is limited to 4 GB of memory or less. The licensed 64-bit usearch would work for large files.

To address these limitations, we have replaced usearch with seqkit (an open-source tool: SeqKit - Ultrafast FASTA/Q kit) in our latest version v0.4.8 (Release 0.4.8 · biobakery/metawibele · GitHub). MetaWIBELE v0.4.8 is still under testing and hasn’t been released to pip/conda/docker. Please try the new version if you are interested.
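
For anyone who cannot upgrade yet, here is a minimal Python sketch of the one task discussed in this thread: sorting contigs by length without usearch. It is not the code MetaWIBELE uses, and the function name and file names are hypothetical. It assumes a plain (uncompressed) FASTA file, indexes each record's length and byte offset in one pass, and then copies the records out longest first, so the sequences themselves are never all held in memory.

```python
def sort_fasta_by_length(in_path, out_path):
    """Write the records of in_path to out_path, longest sequence first."""
    index = []  # one (sequence_length, byte_offset, byte_size) tuple per record
    with open(in_path, "rb") as fh:
        offset = 0      # byte position of the current line
        start = None    # byte position of the current record's header
        seq_len = 0     # residues counted so far in the current record
        for line in fh:
            if line.startswith(b">"):
                if start is not None:
                    index.append((seq_len, start, offset - start))
                start, seq_len = offset, 0
            elif start is not None:
                seq_len += len(line.strip())
            offset += len(line)
        if start is not None:
            index.append((seq_len, start, offset - start))

    # Longest first; records of equal length keep their input order
    # because Python's sort is stable.
    index.sort(key=lambda rec: -rec[0])

    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for _, start, size in index:
            src.seek(start)
            chunk = src.read(size)
            # Make sure a final record without a trailing newline still ends cleanly.
            dst.write(chunk if chunk.endswith(b"\n") else chunk + b"\n")
```

Usage would look like `sort_fasta_by_length("contigs.fasta", "contigs.sorted.fasta")`, where both file names are placeholders. Only the index is kept in memory, at the cost of one seek per record when writing the output.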

Thanks!
Yancong

Hi Yancong,

Thank you for your prompt response.
It’s good to hear that the new version is being updated!
I will try it soon.

Many thanks,
Gihyeon