The bioBakery help forum

Uncertainty about the execution time of Halla (very time-consuming on my data)

Dear Halla team,

I have encountered a problem when running the latest Halla (0.8.17): so far, it has spent around 14 hours on the similarity calculation between the features of the two datasets.

I would like to know the expected run time, or an acceptable range, for a task like mine, so that I can tell normal behavior from abnormal behavior.

I have assigned 8 cores (with the --nproc parameter), but it does not seem to take effect under the WSL environment on my Windows 10 laptop.

I have listed my data, command, and system info below:

20 features and 15 samples are used from first datesets

19837 features and 15 samples are used from second datesets

Output files will be written to: C:\Users\path\dir

0:03:14.560000 h:m:s similarity caluclation between two datasets features time —
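For a sense of scale, here is a rough sketch of how many feature pairs an all-against-all comparison would cover with the dimensions in the log above. Whether Halla actually evaluates every pair is an assumption on my part; the function name `pairwise_count` is only illustrative.

```python
# Feature counts taken from the log lines above.
n_features_x = 20      # features in the first dataset
n_features_y = 19837   # features in the second dataset


def pairwise_count(n_x, n_y):
    """Number of (X feature, Y feature) pairs in an all-against-all comparison."""
    return n_x * n_y


total_pairs = pairwise_count(n_features_x, n_features_y)
print(total_pairs)  # 396740
```

Each of those pairs would be one Spearman correlation over 15 samples, assuming every pair is tested.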

  • Intel i7-4720HQ with 16 GB of memory; CPU usage is 18% and memory usage is 40% during the run.

  • The program runs in a Python 2.7 environment created with Anaconda.

  • The example data ran swiftly and without errors.

  • I followed the instructions; the command I ran was:

halla -X huichang_2020-02-18_genus_halla.txt \
  -Y huichang_2020-02-18_genes_halla.txt \
  --nproc 8 \
  -o pouchitis_output \
  -m spearman --header -q 0.05
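To check whether WSL exposes fewer cores than the 8 requested with --nproc, a minimal sketch like the following can be run in the same Python environment. It only reports how many logical CPUs Python sees and says nothing about how Halla uses --nproc internally; the variable names are my own.

```python
import multiprocessing

requested = 8  # value passed to --nproc in the command above

# Logical CPUs visible to this Python process (works on Python 2.7 and 3).
visible = multiprocessing.cpu_count()
print("logical CPUs visible: %d" % visible)

if visible < requested:
    print("fewer CPUs are visible than were requested with --nproc")
```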

Thank you very much. Any suggestions are welcome. I look forward to your reply.

Best regards,

Derek