Hello,
I would like to express my sincere gratitude for your continuous efforts in developing and maintaining bioBakery tools.
I am currently analyzing metagenomic data with StrainPhlAn (v4.0.6 with the vOct22_CHOCOPhlAnSGB_202212 database), and I have encountered an issue where the output tree contains only a few samples.
In an attempt to increase the number of samples, I have been experimenting with parameter adjustments.
Following the guidance provided in previous posts, I first tested adjusting the filtering criteria using --marker_in_n_samples and --sample_with_n_markers within the range of 10-50. Although this increased the number of samples in the output tree, the increase was not substantial.
I then compared abundances between the filtered-out and retained samples using MetaPhlAn4 profiling and found that many samples are being filtered out despite having abundance levels similar to those of the retained samples.
Therefore, I am considering modifying parameters at the sample marker extraction stage (using the sample2markers.py script).
Specifically, I am looking into adjusting the mapping quality (--min_mapping_quality flag) and breadth of coverage (--breadth_threshold flag). However, I am uncertain about the acceptable levels of these adjustments and seek your guidance.
- The --breadth_threshold parameter seems to be particularly important for identifying polymorphisms in the marker genes. Is it feasible or permissible to lower this value to around 50? (Based on your experience, what would be an acceptable lower limit?)
- For --min_mapping_quality, MetaPhlAn4 uses a value of 5, while StrainPhlAn uses 10. Would it be acceptable to lower the mapping quality in StrainPhlAn as well, for example to 5?
  (I understand that MetaPhlAn4 is intended for profiling purposes, whereas StrainPhlAn may require more stringent criteria to detect small differences within marker genes.)
I searched for related discussions but did not find any; I apologize if I missed something or have misunderstood any aspect.
I would therefore greatly appreciate your advice.
Thank you again for your invaluable contributions to our field.
Sincerely,