Hello bioBakery forum,
I tried to use KneadData to remove rRNA from my metatranscriptome data and realized that the rRNA reads were not matched to the SILVA database I downloaded by following the KneadData manual.
After discussing this with Curtis in the bioBakery course and doing some digging in the manual and tutorial, I found the following two potential issues.
I used
kneaddata_database --download ribosomal_RNA bowtie2 $DIR
to download the SILVA database. It was 11G and came as pre-built bowtie2 indexes (files ending in .bt2l). However, the Us were not converted to Ts in the direct download. I then wanted to convert the Us myself and realized that the conversion has to happen in the first step, on the SILVA fasta files, before the index is built.
I then found a link to the SILVA fasta file on the tutorial page:
http://huttenhower.sph.harvard.edu/kneadData_databases/SILVA_128_LSUParc_SSUParc_ribosomal_RNA_v0.1.tar.gz
I tried to wget the link, but the page was not found.
Then I went to the official SILVA website to look for a database. There were many explanations but no direct download of the database (I may have missed it). I have previously used this one for amplicon data,
wget -O "silva-138-99-515-806-nb-classifier.qza" https://data.qiime2.org/2020.6/common/silva-138-99-515-806-nb-classifier.qza
but it is not a fasta file.
Could bioBakery share an updated link for downloading the SILVA rRNA reference database, or direct me to a place where I can download it? I could then convert the Us to Ts in the downloaded fasta, build the SILVA rRNA index with bowtie2, and use it with KneadData to remove rRNA from my metatranscriptome data. Alternatively, a direct download of an already built, U-converted SILVA rRNA bowtie2 index would be great too.
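For reference, the conversion step I have in mind would look roughly like the sketch below. It is only a minimal sketch: the function name rna_to_dna and the file names in the comments are placeholders I made up, and it assumes a plain-text fasta in which only the header lines start with ">".

```shell
#!/usr/bin/env bash
# Convert RNA bases (U/u) to DNA bases (T/t) on sequence lines only.
# FASTA header lines (starting with ">") are passed through unchanged,
# so a "U" inside a sequence name is not touched.
# rna_to_dna is a hypothetical helper name, not part of KneadData or bowtie2.
rna_to_dna() {
    awk '/^>/ { print; next } { gsub(/U/, "T"); gsub(/u/, "t"); print }' "$@"
}

# Intended usage (file names are placeholders, not real downloads):
#   rna_to_dna silva_rrna.fasta > silva_rrna_dna.fasta
#   bowtie2-build silva_rrna_dna.fasta silva_rrna_dna
```

The resulting index prefix would then be passed to KneadData as a reference database in place of the downloaded one.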
I am also wondering why the results only showed the adapters removed and the human transcriptome removed. The human genome removal did not show up. Is there some problem with the bowtie2 index files, or will metatranscriptome reads simply not match the human genome (hg37dec_v0.1.1.bt2 etc.)?