Hi Yancong,
I think there is a bug in metawibele_prepare_uniprot_annotation (metawibele/common/prepare_uniprot_annotation.py).
This Python script first downloads uniprot_sprot.dat.gz and uniprot_trembl.dat.gz, then combines them into a file called “uniprot.dat.gz”. The combining step does not work correctly, because the commands it runs compress uniprot.dat.gz twice.
Kind regards,
Stephen
Hi Stephen,
Thanks for your interest! Could I ask which version you used? Testing with our latest version (v0.4.8, GitHub - biobakery/metawibele: MetaWIBELE: Workflow to Identify novel Bioactive Elements in microbiome), I couldn’t reproduce this issue: the downloaded uniprot_sprot.dat.gz and uniprot_trembl.dat.gz were combined into uniprot.dat and then compressed into uniprot.dat.gz.
Best,
Yancong
Hi Yancong,
I used v0.4.7.
I found that the problem is that “less” is used to decompress and combine uniprot_sprot.dat.gz and uniprot_trembl.dat.gz, but on some Linux distributions less doesn’t automatically decompress gzipped files (see: gzip - "less" doesn't automatically decompress gzipped files - Ask Ubuntu). On my system (Ubuntu Server, zsh) the compressed bytes are therefore passed through unchanged and then compressed again.
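One way to avoid depending on how less is configured would be to do the decompress-and-combine step with Python's standard gzip module. This is only a rough sketch under that assumption: combine_gz and the demo file names are hypothetical, and this is not the code MetaWIBELE actually uses.

```python
import gzip
import os
import shutil
import tempfile


def combine_gz(sources, dest_gz):
    """Decompress each gzipped source and write a single gzipped output."""
    with gzip.open(dest_gz, "wb") as out:
        for src in sources:
            # gzip.open always decompresses, regardless of shell or pager setup
            with gzip.open(src, "rb") as handle:
                shutil.copyfileobj(handle, out)


# Tiny stand-in files (hypothetical names, not the real UniProt downloads)
demo_dir = tempfile.mkdtemp()
part_a = os.path.join(demo_dir, "a_demo.dat.gz")
part_b = os.path.join(demo_dir, "b_demo.dat.gz")
merged = os.path.join(demo_dir, "merged_demo.dat.gz")

with gzip.open(part_a, "wb") as fh:
    fh.write(b"ID   A_DEMO\n//\n")
with gzip.open(part_b, "wb") as fh:
    fh.write(b"ID   B_DEMO\n//\n")

combine_gz([part_a, part_b], merged)

# Reading the result back shows it was compressed exactly once
with gzip.open(merged, "rb") as fh:
    merged_bytes = fh.read()
```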
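It may be more efficient to simply run “cat uniprot_sprot.dat.gz uniprot_trembl.dat.gz > uniprot.dat.gz” to concatenate the two gzipped files, instead of decompressing, combining, and then compressing them once more. Concatenated gzip files are still a valid gzip file and decompress to the concatenation of the original contents (see “Advanced usage” in the GNU Gzip manual).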
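To illustrate the concatenation idea, here is a minimal sketch in Python. It uses small stand-in files created on the fly rather than the real UniProt downloads, and concat_gzip is a hypothetical helper name, not something from MetaWIBELE. It copies the compressed bytes as they are and then checks that Python's gzip module reads the multi-member result as one continuous stream.

```python
import gzip
import os
import shutil
import tempfile


def concat_gzip(sources, dest):
    """Byte-concatenate gzip files into dest without decompressing them."""
    with open(dest, "wb") as out:
        for src in sources:
            with open(src, "rb") as handle:
                shutil.copyfileobj(handle, out)


# Stand-in inputs (hypothetical names and contents)
workdir = tempfile.mkdtemp()
sprot = os.path.join(workdir, "sprot_demo.dat.gz")
trembl = os.path.join(workdir, "trembl_demo.dat.gz")
combined = os.path.join(workdir, "uniprot_demo.dat.gz")

with gzip.open(sprot, "wb") as fh:
    fh.write(b"ID   A_DEMO\n//\n")
with gzip.open(trembl, "wb") as fh:
    fh.write(b"ID   B_DEMO\n//\n")

concat_gzip([sprot, trembl], combined)

# The gzip module reads all members of the combined file in order
with gzip.open(combined, "rb") as fh:
    combined_bytes = fh.read()
print(combined_bytes.decode())
```

The trade-off is that the combined file is slightly larger than one compressed in a single pass, but no decompression or recompression is needed.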
Kind regards,
Stephen
Hi Stephen,
Thanks for this information! I fixed this issue so that it works on more Linux distributions (including Ubuntu). Please check out our latest version on GitHub and let me know if you have any other questions.
Best,
Yancong