We are interested in your study, whose data are deposited in GEO under accession GSE111889: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE111889
However, there appears to be a discrepancy in the deposited raw counts: the two files listed below report different count values.
- GSE111889_RAW.tar
- GSE111889_host_tx_counts.tsv.gz
The attached screenshot illustrates this discrepancy.
Do you know what may have happened here, and which file contains the correct counts?
My colleague also found some discrepancies in GSE111889_series_matrix.txt, where 4 samples have ‘rectum tissue’ under ‘Sample_source_name_ch1’; however, the biopsy location given under ‘Sample_characteristics_ch1’ is a different site:
- ‘GSM3043425’: ‘Ileum’
- ‘GSM3043517’: ‘Transverse colon’
- ‘GSM3043535’: ‘Cecum’
- ‘GSM3043564’: ‘Cecum’
Would it be possible to confirm which is correct for these samples, the tissue type or the biopsy location?
Also, the sample ‘GSM3043543’ has no biopsy location recorded: it is described only as ‘non-inflamed’, and its ‘!Sample_source_name_ch1’ value is an unspecified ‘tissue’. Would it be possible to clarify where this biopsy was taken?