HAllA no FDR adjustment

Hi everyone,

Is it possible to run the newest HAllA version without any FDR adjustment?

Thank you in advance,
Simon

Sorry, the fdr_method argument only accepts the same set of values allowed by statsmodels.stats.multitest.multipletests:

https://www.statsmodels.org/dev/generated/statsmodels.stats.multitest.multipletests.html


Thank you for your reply! Is it somehow possible to implement the argument “no_adjustment” from earlier HAllA legacy versions, or to install an earlier release of HAllA?

I think it would be relatively simple to add a control flow block to this function, but I don’t have bandwidth right now to implement and test it. Something like:

if method == "none": return(pvalues)

If you’re comfortable tinkering with the code and installing from source, you could give that a try.

Thank you very much! I added a control flow block to the function like this:

def pvalues2qvalues(pvalues, method="none", alpha=0.05):
    '''Perform p-value correction for multiple tests (Benjamini/Hochberg)
    Args:
    - pvalues: a 1D-array of pvalues
    - alpha  : family-wise error rate
    Return a tuple (adjusted p-value array, boolean array [True = reject]
    '''
    if method == "none":
        return(pvalues)
    return(multipletests(pvalues, alpha=alpha, method=method)[:2])

Then I created a new conda environment and set it up for HAllA by running:

python3 setup.py develop

If I run HAllA now, the added control flow block seems to be recognized as reported by the config parameters below:

halla -x crohn_bacterial_species.txt -y crohn_fungal_species.txt -o Crohn_Species_none --fdr_method="none"
Setting config parameters (irrelevant parameters will be ignored)…
  preprocess:
    max freq thresh : 1
    transform funcs : None
    discretize bypass if possible : True
    discretize func : None
    discretize num bins : None
  association:
    pdist metric : spearman
  hierarchy:
    sim2dist set abs : True
    sim2dist func : None
    linkage method : average
  permute:
    iters : 1000
    func : gpd
    speedup : True
  stats:
    fdr alpha : 0.05
    fdr method : none
    fnr thresh : 0.2
    rank cluster : best
  output:
    dir : Crohn_Species_none
    verbose : True
== Loading and preprocessing data ==
Preprocessing step completed:

  • X shape (# features, # size) (32, 17)
  • Y shape (# features, # size) (16, 17)
== Completed; total duration: 0:00:00.024488 ==

== Performing HAllA ==
-- Step 1: Computing pairwise similarities, p-values, and q-values --
Generating the similarity table…
Generating the p-value table…
100%|██████████████████████████████████████████| 32/32 [00:00<00:00, 207.13it/s]
Generating the q-value table…
Traceback (most recent call last):
  File "/usr/local/bin/halla", line 11, in <module>
    load_entry_point('HAllA', 'console_scripts', 'halla')()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/scripts/halla.py", line 258, in main
    instance.run()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/halla/main.py", line 395, in run
    self._compute_pairwise_similarities()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/halla/main.py", line 112, in _compute_pairwise_similarities
    self.fdr_reject_table, self.qvalue_table = pvalues2qvalues(self.pvalue_table.flatten(), config.stats['fdr_method'], config.stats['fdr_alpha'])
ValueError: too many values to unpack (expected 2)
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 72, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 13, in <module>
    import fnmatch, glob, traceback, errno, sys, atexit, locale, imp, stat
  File "/usr/lib/python3.8/imp.py", line 31, in <module>
    warnings.warn("the imp module is deprecated in favour of importlib; "
DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/halla", line 11, in <module>
    load_entry_point('HAllA', 'console_scripts', 'halla')()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/scripts/halla.py", line 258, in main
    instance.run()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/halla/main.py", line 395, in run
    self._compute_pairwise_similarities()
  File "/home/simonw/Schreibtisch/R Scripts/DNA Sequencing/5_ High-sensitivity pattern discovery (HAllA)/halla-master/halla/main.py", line 112, in _compute_pairwise_similarities
    self.fdr_reject_table, self.qvalue_table = pvalues2qvalues(self.pvalue_table.flatten(), config.stats['fdr_method'], config.stats['fdr_alpha'])
ValueError: too many values to unpack (expected 2)

However, this error occurs after that, implying that there are too many values to unpack. Do you have any idea how to solve this?

Best,
Simon

From this part

self.fdr_reject_table, self.qvalue_table = pvalues2qvalues(self.pvalue_table.flatten(), config.stats['fdr_method'], config.stats['fdr_alpha'])
ValueError: too many values to unpack (expected 2)

It looks like returning the input unchanged doesn’t give the caller what it expects: that line unpacks the result into exactly two values. Try running multipletests(...)[:2] with some test data to figure out what a typical return value is supposed to look like.
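To make the failure mode concrete, here is a minimal sketch in plain Python that does not use HAllA at all. The `passthrough` function and the p-values are made up for illustration; the only thing taken from the traceback is that the caller puts two names on the left-hand side of the assignment.

```python
# Stand-in for the flattened p-value table; the values are made up.
pvalues = [0.5, 0.3, 0.5]


def passthrough(pvalues):
    # Hypothetical helper mirroring the naive "none" branch:
    # it returns the input sequence unchanged.
    return pvalues


try:
    # Same shape as the call in main.py: two targets on the left.
    reject, qvalues = passthrough(pvalues)
except ValueError as err:
    message = str(err)
    print(message)
```

Python tries to spread every element of the returned sequence across the two names, so any input with more than two p-values raises this ValueError. Wrapping the same sequence in tuple() does not change how many elements it has.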

Seems like multipletests returns a tuple:

#Return a tuple (adjusted p-value array, boolean array [True = reject])

I tried this, which didn’t work:

if method == "none": 
    return(tuple(pvalues))

Does this error maybe occur because the new block doesn’t return a tuple that also contains the boolean array [True = reject]?

If I run multipletests with test data…

import statsmodels.stats.multitest as s  # assumed alias; the post only shows "s."
pvals = [0.5, 0.3, 0.5]
s.multipletests(pvals)

I get this output:
(array([False, False, False]), array([0.75 , 0.657, 0.75 ]), 0.016952427508441503, 0.016666666666666666)
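Going by that output, the first two elements are the boolean reject array and then the adjusted p-values, which matches the order the caller unpacks into (fdr_reject_table, qvalue_table). A possible sketch of a "none" branch that returns the same two-element shape is below. It is untested against HAllA itself, and the rule used for the reject array (raw p-value <= alpha) is my assumption, not something taken from the HAllA source.

```python
import numpy as np


def pvalues2qvalues(pvalues, method="none", alpha=0.05):
    """Return (reject array, q-value array), in the order multipletests uses."""
    pvalues = np.asarray(pvalues, dtype=float)
    if method == "none":
        # Assumption: with no adjustment, "reject" means raw p-value <= alpha,
        # and the unadjusted p-values are passed through as the q-values.
        return pvalues <= alpha, pvalues
    # Imported lazily so the pass-through branch does not need statsmodels.
    from statsmodels.stats.multitest import multipletests
    return multipletests(pvalues, alpha=alpha, method=method)[:2]


# Same test data as above; both arrays have one entry per p-value.
reject, qvalues = pvalues2qvalues([0.5, 0.3, 0.5], method="none")
print(reject)
print(qvalues)
```

Since both returned arrays have the same length as the flattened p-value table, the two-name unpacking in main.py should no longer fail. Whether the later steps handle unadjusted values sensibly would still need checking.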