Choosing trim minimum length: would you share advice?

jorondo1 · February 28, 2022, 5:06pm

I am starting this topic to ask whether anyone has a rationale for choosing a particular MINLEN value in --trimmomatic-options="MINLEN:".

My review of the literature revealed that researchers tend to either leave kneaddata on its default parameters, or choose a MINLEN without justifying why that exact threshold was chosen.

I am having a hard time deciding the minimum length I want my reads to be. I understand (and expect) that this question would be answered differently depending on sequencing depth, research objectives, assembly or profiling approaches, etc.

Does anyone have a rationale to share, given their context? For example, for taxonomic profiling from reads, I would expect that we should keep reads close to 150 bp (we used a NovaSeq 6000) and drop very short ones (20-30 bp). I feel like the few million 20 bp reads left after trimming won’t be of much use and may even favour false positives or bias abundance calculations.

Then again, what threshold to choose? 40? 60? And what justifies these values? I feel like researchers often just refer to what others have done, but if nobody ever questions and measures the impact of such decisions, I think we’re not helping science. It might even be that choosing a minlen of 40 or 80 doesn’t really change anything, but I haven’t found anything supporting that either.
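One way to start measuring the impact rather than guessing is to tabulate how many reads and bases each candidate threshold would keep. Below is a minimal Python sketch, assuming plain 4-line FASTQ records and assuming that MINLEN keeps reads whose length is at least the threshold. The function names `read_lengths_from_fastq` and `retention_by_minlen` are made up for illustration and are not part of kneaddata or Trimmomatic.

```python
from typing import Dict, Iterable, List


def read_lengths_from_fastq(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of every record in 4-line FASTQ text."""
    lengths = []
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            lengths.append(len(line.strip()))
    return lengths


def retention_by_minlen(
    lengths: List[int], thresholds: Iterable[int]
) -> Dict[int, Dict[str, float]]:
    """For each candidate threshold, report the fraction of reads and of
    bases that would survive a 'keep if length >= threshold' filter
    (assumed here to match what MINLEN does)."""
    total_reads = len(lengths)
    total_bases = sum(lengths)
    summary = {}
    for t in thresholds:
        kept = [n for n in lengths if n >= t]
        summary[t] = {
            "reads_kept": len(kept) / total_reads if total_reads else 0.0,
            "bases_kept": sum(kept) / total_bases if total_bases else 0.0,
        }
    return summary
```

To use it, one could open a FASTQ file that was trimmed without a length filter, pass the file handle to `read_lengths_from_fastq`, and compare `retention_by_minlen(lengths, [40, 60, 80])`. This only shows how much data each threshold discards; whether the discarded reads change profiling results would still need to be tested by running the downstream tool at each setting.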

All that being said, I am looking forward to hearing your rationales, or any sources you could point me to that I have missed (I have not been doing literature reviews for long, so it is definitely possible that I have missed important pieces).

Bernhard · October 22, 2022, 3:53pm

I also have this question. I would appreciate it if anyone could share their experience with choosing a minimum read length for trimming.
Thank you!

rohit_satyam · October 22, 2022, 8:07pm

I have been using 75 bp as the minimum read length for trimming and rarely go down to 30 bp. Maybe this paper can help you: Pre- and post-sequencing recommendations for functional annotation of human fecal metagenomes (BMC Bioinformatics).
