Hi Yancong,
I used the demo files (demo_genecatalogs.centroid.faa, demo_genecatalogs_counts.all.tsv, and demo_mgx_metadata.tsv) to run the characterize workflow, then used the generated results to run the prioritization workflow. However, the prioritization output only contains the results of unsupervised clustering. From an earlier question and your answer, I understand that this is because of the setting phenotype=none in the metawibele.cfg file.
I have read the introduction to the config file settings, but I still don’t know how to set the phenotype. I tried three times, using phentype=CD; phentype=CD_nonIBD; and phentype=CD, nonIBD, respectively, but each attempt reported an error. If I want to run the demo files with the sample phenotypes CD and nonIBD, how should I set phenotype in metawibele.cfg?
Wenjin
Hi Wenjin,
Yes, you need to set phenotype in metawibele.cfg. Set it to a metadata variable name rather than a value, e.g. phenotype=diagnosis.
Feel free to check the example config for a demo run on our tutorial page: https://raw.githubusercontent.com/biobakery/metawibele/master/examples/metawibele.cfg
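To make the name-versus-value distinction concrete, here is a minimal Python sketch that lists the variable names in a metadata table and checks a candidate phenotype setting against them. It assumes the metadata file is tab-delimited with a header row whose entries are the variable names. The helper names (`metadata_columns`, `is_valid_phenotype`) and the inline sample table are hypothetical and only for illustration; they are not part of MetaWIBELE.

```python
import csv
import io


def metadata_columns(handle):
    """Return the header entries of a tab-delimited metadata table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    return [name.strip() for name in header]


def is_valid_phenotype(value, columns):
    """True if the candidate is a metadata variable name (a header entry)."""
    return value in columns


# Hypothetical stand-in for a metadata file; for a real file, use
# open("demo_mgx_metadata.tsv", newline="") instead of io.StringIO.
sample = "ID\tdiagnosis\tconsent_age\nS1\tCD\t30\nS2\tnonIBD\t41\n"
columns = metadata_columns(io.StringIO(sample))

for candidate in ("diagnosis", "CD", "CD_nonIBD"):
    status = "variable name" if is_valid_phenotype(candidate, columns) else "not a variable name"
    print(f"{candidate}: {status}")
```

In this sketch, diagnosis passes because it appears in the header, while CD and CD_nonIBD fail because they are values (or combinations of values) found inside the diagnosis column.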
Thanks!
Yancong
Hi Yancong,
Sorry to bother you again. I used the example config for a demo run, but an error still occurred in the MaAsLin2 step:
Task 73 failed
Name: maaslin2
Original error:
Failed to produce target `/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/all_results.fdr_correction.correct_per_level.tsv'. Original exception: Traceback (most recent call last):
File "/parastor/home/zhangwj02/miniconda3/envs/mw/lib/python3.7/site-packages/anadama2/runners.py", line 219, in _get_task_result
targ_compares.append(list(target.compare()))
File "/parastor/home/zhangwj02/miniconda3/envs/mw/lib/python3.7/site-packages/anadama2/tracked.py", line 379, in compare
stat = os.stat(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/all_results.fdr_correction.correct_per_level.tsv'
I checked ./abundance_annotation/DA/maaslin2_metadata.results.log, and it shows an error related to the object ‘current_args’:
11/06/2024 06:45:49 PM - metawibele.config - INFO: Run command: metawibele_transpose < /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.pcl > /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.tsv
11/06/2024 06:45:49 PM - metawibele.config - INFO: Run command: Maaslin2.R /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10.tsv /parastor/home/zhangwj02/MetaWIBELE/test/demo_mgx_metadata.tsv /parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10 --min_abundance 0.0 --min_prevalence 0.1 --min_variance 0.0 --max_significance 0.25 --normalization NONE --transform LOG --analysis_method LM --cores 4 --fixed_effects diagnosis,consent_age,antibiotic,immunosuppressant,mesalamine,steroids --random_effects none --correction BH --standardize TRUE --plot_heatmap FALSE --heatmap_first_n FALSE --plot_scatter FALSE --reference diagnosis,nonIBD
There were 13 warnings (use warnings() to see them)
[1] "Creating output folder"
[1] "Creating output feature tables folder"
[1] "Creating output fits folder"
2024-11-06 18:45:53.064589 INFO::Writing function arguments to log file
2024-11-06 18:45:53.158286 INFO::Verifying options selected are valid
2024-11-06 18:45:53.162158 INFO::Determining format of input files
2024-11-06 18:45:53.165559 INFO::Input format is data samples as rows and metadata samples as rows
2024-11-06 18:45:53.176158 WARNING::Feature name not found in metadata so not applied to formula as random effect: none
2024-11-06 18:45:53.18006 INFO::Formula for fixed effects: expr ~ diagnosis + consent_age + antibiotic + immunosuppressant + mesalamine + steroids
2024-11-06 18:45:53.186787 INFO::Filter data based on min abundance and min prevalence
2024-11-06 18:45:53.191537 INFO::Total samples in data: 102
2024-11-06 18:45:53.195157 INFO::Min samples required with min abundance for a feature not to be filtered: 10.200000
2024-11-06 18:45:53.200236 INFO::Total filtered features: 0
2024-11-06 18:45:53.203883 INFO::Filtered feature names from abundance and prevalence filtering:
2024-11-06 18:45:53.208725 INFO::Total filtered features with variance filtering: 0
2024-11-06 18:45:53.212281 INFO::Filtered feature names from variance filtering:
2024-11-06 18:45:53.215625 INFO::Running selected normalization method: NONE
2024-11-06 18:45:53.219257 INFO::Applying z-score to standardize continuous metadata
2024-11-06 18:45:53.276205 INFO::Running selected transform method: LOG
2024-11-06 18:45:53.286913 INFO::Running selected analysis method: LM
2024-11-06 18:45:53.535234 INFO::Creating cluster of 4 R processes
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: object 'current_args' not found
Calls: Maaslin2 ... clusterApply -> staticClusterApply -> checkForRemoteErrors
In addition: Warning message:
In value[[3L]](cond) : double expected, got “FALSE”
Execution halted
/parastor/home/zhangwj02/MetaWIBELE/Res_demo_cha/abundance_annotation/DA/maaslin2_output/demo_proteinfamilies_nrm.feature.split10/all_results.tsv: No such file or directory
I don’t know what caused this. I would be grateful if you could help me.
Wenjin