Hello,
I would like to express my sincere gratitude for your continuous efforts in developing and maintaining bioBakery tools.
I am currently analyzing metagenome data with StrainPhlAn (v4.0.6 with the vOct22_CHOCOPhlAnSGB_202212 database), and I have encountered an issue where the output tree contains only a few samples.
In an attempt to increase the number of samples, I have been experimenting with parameter adjustments.
Following the guidance in previous posts, I first tested adjusting the filtering criteria using --marker_in_n_samples and --sample_with_n_markers within the range of 10-50. Although this increased the number of samples in the output tree, the increase was not substantial.
I then compared MetaPhlAn4 abundance profiles between the samples that were filtered out and those that were retained, and found that many samples are being filtered out despite having abundance levels similar to the retained ones.
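A comparison of this kind could be sketched as follows. This is only an illustration: the sample names, abundance values, and the set of samples present in the tree are placeholders, not my actual data, and `split_by_retention` is a hypothetical helper.

```python
from statistics import median

# Hypothetical relative abundances (%) of the target SGB per sample, as they
# might be read from MetaPhlAn4 profiles. Names and values are placeholders.
abundance = {
    "sample_A": 2.1,
    "sample_B": 1.9,
    "sample_C": 2.0,
    "sample_D": 0.4,
}

# Hypothetical set of samples that appear in the StrainPhlAn output tree.
in_tree = {"sample_A", "sample_D"}


def split_by_retention(abundance, in_tree):
    """Split abundances into (retained, filtered_out) lists."""
    retained = [v for s, v in abundance.items() if s in in_tree]
    filtered_out = [v for s, v in abundance.items() if s not in in_tree]
    return retained, filtered_out


retained, filtered_out = split_by_retention(abundance, in_tree)
print("median abundance, retained:    ", median(retained))
print("median abundance, filtered out:", median(filtered_out))
```

If the two groups show similar abundance, abundance alone does not explain why samples are dropped.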
Therefore, I am considering modifying parameters at the sample marker extraction stage (using the sample2markers.py script).
Specifically, I am looking into adjusting the mapping quality (--min_mapping_quality flag) and breadth of coverage (--breadth_threshold flag). However, I am uncertain about the acceptable levels of these adjustments and seek your guidance.
- The --breadth_threshold seems to be particularly important for identifying polymorphisms in the marker genes. Is it feasible or permissible to lower this value to around 50? (Based on your experience, what would be an acceptable lower limit?)
- For --min_mapping_quality, MetaPhlAn4 uses a value of 5, while StrainPhlAn uses 10. Would it be acceptable to lower the mapping quality in StrainPhlAn as well, for example to 5? (I understand that MetaPhlAn4 is intended for profiling purposes, whereas StrainPhlAn may require more stringent criteria to detect small differences within marker genes.)
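The parameter combinations in question could be enumerated as in the sketch below. It only builds and prints the command lines and does not run anything. The --breadth_threshold and --min_mapping_quality flags are the ones discussed above; the `-i` and `-o` flags, the file names, and the intermediate breadth value of 65 are assumptions for illustration, and `build_commands` is a hypothetical helper.

```python
from itertools import product

# Candidate values to compare. Only 50 (breadth) and 5/10 (mapping quality)
# come from the question above; the other values are placeholders.
BREADTH_THRESHOLDS = [80, 65, 50]
MIN_MAPPING_QUALITIES = [10, 5]


def build_commands(sam_file, out_root):
    """Return one sample2markers.py argument list per parameter combination."""
    commands = []
    for breadth, mapq in product(BREADTH_THRESHOLDS, MIN_MAPPING_QUALITIES):
        # Separate output directory per combination so results do not overwrite.
        out_dir = f"{out_root}/breadth{breadth}_mapq{mapq}"
        commands.append([
            "sample2markers.py",
            "-i", sam_file,   # assumed input flag
            "-o", out_dir,    # assumed output-directory flag
            "--breadth_threshold", str(breadth),
            "--min_mapping_quality", str(mapq),
        ])
    return commands


# Hypothetical input file and output root, for illustration only.
for cmd in build_commands("sample1.sam.bz2", "consensus_markers"):
    print(" ".join(cmd))
```

Keeping each combination in its own directory would make it possible to count how many samples reach the tree under each setting.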
I have searched for related discussions but have not found any, so I am reaching out for advice. I apologize if I have missed, misunderstood, or overlooked anything.
Thank you again for your invaluable contributions to our field.
Sincerely,