Hi Yancong,
I used the demo files (demo_genecatalogs.centroid.faa, demo_genecatalogs_counts.all.tsv, and demo_mgx_metadata.tsv) to run the characterize workflow, then used the generated results to run the prioritization workflow. However, the prioritization output contains only the unsupervised clustering results. I read a previous question and your answer to it, so I know this is because phenotype=none is set in the metawibele.cfg file.
I have read the documentation on the config file settings, but I still don’t know how to set the phenotype. I tried three settings, "phenotype=CD", "phenotype=CD_nonIBD", and "phenotype=CD, nonIBD", but each one reported an error. If I want to run the demo files with the sample phenotypes CD and nonIBD, how should I set phenotype in metawibele.cfg?
Wenjin
Hi Wenjin,
Yes, you need to set phenotype in metawibele.cfg. Give it the name of a metadata variable rather than one of that variable’s values, e.g. phenotype=diagnosis.
Feel free to check the example config for a demo run on our tutorial page: https://raw.githubusercontent.com/biobakery/metawibele/master/examples/metawibele.cfg
Thanks!
Yancong
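The difference between a metadata variable name and one of its values can be checked against the metadata table itself. Below is a minimal sketch, assuming the metadata is a tab-separated file whose header row holds the variable names. The function name classify_phenotype and the small sample table are hypothetical and are not part of MetaWIBELE.

```python
import csv
import io

# Hypothetical three-column sample; the real demo metadata has more columns.
SAMPLE_METADATA = "sample\tdiagnosis\tconsent_age\nS1\tCD\t34\nS2\tnonIBD\t51\n"


def classify_phenotype(metadata_tsv_text, candidate):
    """Report whether `candidate` is a column name, a cell value, or neither."""
    rows = list(csv.reader(io.StringIO(metadata_tsv_text), delimiter="\t"))
    if not rows:
        return "unknown"
    header, body = rows[0], rows[1:]
    if candidate in header:
        return "variable"  # suitable as a variable name, e.g. diagnosis
    if any(candidate in row for row in body):
        return "value"     # a level of some variable, e.g. CD
    return "unknown"


if __name__ == "__main__":
    for candidate in ("diagnosis", "CD", "CD_nonIBD"):
        print(candidate, "->", classify_phenotype(SAMPLE_METADATA, candidate))
```

With this sample table, diagnosis is classified as a variable name, while CD is classified as a value, which is the kind of setting that failed in the question above.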
Hi Yancong,
Sorry to bother you again. I used the example config for a demo run, but an error still occurred in the Maaslin2 step:
Task 73 failed
Name: maaslin2
Original error:
Failed to produce target `/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/all_results.fdr_correction.correct_per_level.tsv'. Original exception: Traceback (most recent call last):
File "/parastor/home/zhangwj02/miniconda3/envs/mw/lib/python3.7/site-packages/anadama2/runners.py", line 219, in _get_task_result
targ_compares.append(list(target.compare()))
File "/parastor/home/zhangwj02/miniconda3/envs/mw/lib/python3.7/site-packages/anadama2/tracked.py", line 379, in compare
stat = os.stat(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/all_results.fdr_correction.correct_per_level.tsv'
I checked ./abundance_annotation/DA/maaslin2_metadata.results.log and found an error related to the object ‘current_args’:
11/06/2024 06:45:49 PM - metawibele.config - INFO: Run command: metawibele_transpose < /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.pcl > /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.tsv
11/06/2024 06:45:49 PM - metawibele.config - INFO: Run command: Maaslin2.R /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.tsv /parastor/home/zhangwj02/MetaWIBELE/test/demo_mgx_metadata.tsv /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10 --min_abundance 0.0 --min_prevalence 0.1 --min_variance 0.0 --max_significance 0.25 --normalization NONE --transform LOG --analysis_method LM --cores 4 --fixed_effects diagnosis,consent_age,antibiotic,immunosuppressant,mesalamine,steroids --random_effects none --correction BH --standardize TRUE --plot_heatmap FALSE --heatmap_first_n FALSE --plot_scatter FALSE --reference diagnosis,nonIBD
There were 13 warnings (use warnings() to see them)
[1] "Creating output folder"
[1] "Creating output feature tables folder"
[1] "Creating output fits folder"
2024-11-06 18:45:53.064589 INFO::Writing function arguments to log file
2024-11-06 18:45:53.158286 INFO::Verifying options selected are valid
2024-11-06 18:45:53.162158 INFO::Determining format of input files
2024-11-06 18:45:53.165559 INFO::Input format is data samples as rows and metadata samples as rows
2024-11-06 18:45:53.176158 WARNING::Feature name not found in metadata so not applied to formula as random effect: none
2024-11-06 18:45:53.18006 INFO::Formula for fixed effects: expr ~ diagnosis + consent_age + antibiotic + immunosuppressant + mesalamine + steroids
2024-11-06 18:45:53.186787 INFO::Filter data based on min abundance and min prevalence
2024-11-06 18:45:53.191537 INFO::Total samples in data: 102
2024-11-06 18:45:53.195157 INFO::Min samples required with min abundance for a feature not to be filtered: 10.200000
2024-11-06 18:45:53.200236 INFO::Total filtered features: 0
2024-11-06 18:45:53.203883 INFO::Filtered feature names from abundance and prevalence filtering:
2024-11-06 18:45:53.208725 INFO::Total filtered features with variance filtering: 0
2024-11-06 18:45:53.212281 INFO::Filtered feature names from variance filtering:
2024-11-06 18:45:53.215625 INFO::Running selected normalization method: NONE
2024-11-06 18:45:53.219257 INFO::Applying z-score to standardize continuous metadata
2024-11-06 18:45:53.276205 INFO::Running selected transform method: LOG
2024-11-06 18:45:53.286913 INFO::Running selected analysis method: LM
2024-11-06 18:45:53.535234 INFO::Creating cluster of 4 R processes
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: object 'current_args' not found
Calls: Maaslin2 ... clusterApply -> staticClusterApply -> checkForRemoteErrors
In addition: Warning message:
In value[[3L]](cond) : double expected, got “FALSE”
Execution halted
/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10/all_results.tsv: No such file or directory
I don’t know what caused this. I would be grateful if you could help me.
Wenjin
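In a failure like this one, the workflow traceback only reports the missing target file, while the underlying R error sits further up in the step log. Below is a minimal sketch for pulling those error lines out of a long log, assuming the log is plain text and that R errors start with "Error in" or "Error:". The function name find_r_errors is hypothetical, and the sample is a shortened copy of the log above.

```python
# Shortened excerpt of the log quoted in this thread.
SAMPLE_LOG = """\
2024-11-06 18:45:53.535234 INFO::Creating cluster of 4 R processes
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: object 'current_args' not found
Execution halted
"""


def find_r_errors(log_text, context=1):
    """Return each R error line together with `context` following lines."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if line.startswith("Error in") or line.startswith("Error:"):
            hits.append("\n".join(lines[i:i + 1 + context]))
    return hits


if __name__ == "__main__":
    for hit in find_r_errors(SAMPLE_LOG):
        print(hit)
```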
Hi Wenjin,
I wonder whether the Maaslin2 dependency was installed correctly. To debug:
- Have you tested whether Maaslin2 works from the command line? Please make sure it runs without errors (see the biobakery/Maaslin2 repository on GitHub).
- Check whether the Maaslin2 version you installed is compatible with MetaWIBELE. Maaslin2 v1.5.1 and v1.7.3 have both been tested and work well so far.
Thanks!
Yancong
Hi Yancong,
I ran Maaslin2 separately from the command line with this command:
Maaslin2.R --fixed_effects="diagnosis,dysbiosisnonIBD,dysbiosisUC,dysbiosisCD,antibiotics,age" --random_effects="site,subject" --standardize=FALSE --reference="diagnosis,nonIBD" inst/extdata/HMP2_taxonomy.tsv inst/extdata/HMP2_metadata.tsv demo_output
It generated results, and the Maaslin2 log did not report an error. I downloaded the latest Maaslin2.master.zip from the Maaslin2 GitHub repository, unzipped it, and installed the R packages that Maaslin2 depends on. The installed version is 1.15.1. How can I debug this?
Wenjin
Hi Wenjin,
To check whether the installed Maaslin2 works with multiple CPU cores, could you try the Maaslin2 command used in your MetaWIBELE demo run? That is:
Maaslin2.R /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.tsv /parastor/home/zhangwj02/MetaWIBELE/test/demo_mgx_metadata.tsv /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10 --min_abundance 0.0 --min_prevalence 0.1 --min_variance 0.0 --max_significance 0.25 --normalization NONE --transform LOG --analysis_method LM --cores 4 --fixed_effects diagnosis,consent_age,antibiotic,immunosuppressant,mesalamine,steroids --random_effects none --correction BH --standardize TRUE --plot_heatmap FALSE --heatmap_first_n FALSE --plot_scatter FALSE --reference diagnosis,nonIBD
It appears that you’ve set 4 CPU cores in the configuration for Maaslin2. Please make sure you have requested matching computational resources on your server to run the demo.
Thanks!
Yancong
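One way to compare a configured core count with what the current job can actually use is sketched below. This is only an illustration: the function names are hypothetical, os.sched_getaffinity is available on Linux but not on every platform, and a scheduler may enforce limits that this check does not see. The thread does not state which core count resolved the problem.

```python
import os


def available_cores():
    """Number of CPUs this process may run on, or the machine total as a fallback."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: respects CPU affinity masks set for this process.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def usable_cores(requested, available=None):
    """Clamp a requested core count to the range [1, available]."""
    if available is None:
        available = available_cores()
    return max(1, min(requested, available))


if __name__ == "__main__":
    requested = 4  # the value passed as --cores in the logged command
    print("available:", available_cores())
    print("usable:", usable_cores(requested))
```

If the usable count is lower than the configured one, either request more cores from the server or lower the core setting before rerunning.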
Hi Yancong,
Problem solved. The error was caused by the CPU core setting I had used. Thanks!
Wenjin