Deseq2 analysis of Humann3 outputs

osvatic · March 27, 2023, 8:48am

Hello,

I have recently been working with the humann3 outputs (through biobakery) and would like to use DESeq2 to analyze differential expression of the pathways.

While I imagine there are several ways to do this here is what I have done:

1. Renormalized the humann3 pathabundance.csv to CPM.
2. Converted the CPMs to a rough counts matrix by multiplying all CPMs by the “total reads” of their library, found in humann_read_and_species_count_table.tsv.
3. Analyzed this count table in DESeq2.
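
For anyone following along, the conversion step above can be sketched roughly as below. This is only an illustration, not code from the original workflow: the function name, the dictionary layout, and the sample and pathway names are hypothetical. It also assumes that "multiplying by total reads" means scaling the CPM fraction (CPM divided by one million) by the library size, and that values are rounded because DESeq2 expects non-negative integers.

```python
def cpm_to_pseudo_counts(cpm_by_sample, total_reads):
    """Scale per-sample CPM values to rough integer counts.

    cpm_by_sample: {sample: {pathway: cpm_value}}  (hypothetical layout)
    total_reads:   {sample: total reads in that library}
    """
    counts = {}
    for sample, pathways in cpm_by_sample.items():
        depth = total_reads[sample]
        counts[sample] = {
            # Assumption: pseudo-count = (CPM / 1e6) * library size, rounded
            pathway: int(round(cpm * depth / 1_000_000))
            for pathway, cpm in pathways.items()
        }
    return counts


# Made-up example values, not real HUMAnN output
cpm = {"sampleA": {"PWY-1": 250000.0, "PWY-2": 0.4}}
reads = {"sampleA": 2_000_000}
pseudo = cpm_to_pseudo_counts(cpm, reads)
print(pseudo)
```

Note that these are pseudo-counts reconstructed from normalized values, not the raw read counts that DESeq2 was designed around.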

Is this a valid approach? Does humann3 do anything to drastically change the data?

Thanks!

franzosa · March 30, 2023, 6:44pm

This procedure should work, but it might raise eyebrows in a paper: the value of count-based methods is mainly wrapped up in working with raw counts (i.e. not performing any upstream normalization on the raw data, and treating 0 counts vs. 1 count vs. 2 counts as meaningful). One of the steps HUMAnN performs internally is normalizing mapped reads by gene length to report RPK units. My understanding is that methods like DESeq2 opt to directly model the fact that longer genes will have higher counts than shorter genes, so I’m not sure how they will perform with that information already normalized out.
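
To make the RPK point concrete, here is a toy sketch of a reads-per-kilobase calculation. It is a simplification for illustration only and is not HUMAnN's actual implementation; the function name and the numbers are made up.

```python
def reads_per_kilobase(mapped_reads, gene_length_bp):
    """Toy RPK: mapped reads divided by gene length in kilobases."""
    return mapped_reads / (gene_length_bp / 1000)


# A short and a long gene with very different raw counts...
short_gene = reads_per_kilobase(10, 1000)
long_gene = reads_per_kilobase(30, 3000)
# ...end up with the same RPK, so the length signal is already removed.
print(short_gene, long_gene)

# A single read on a 1.5 kb gene is no longer an integer count.
print(reads_per_kilobase(1, 1500))
```

Once values are in this form, the distinction between 0, 1, and 2 raw reads is no longer directly visible, which is the concern with feeding them to a count-based model.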

Wanning888 · January 3, 2024, 2:19am

Thank you for sharing this approach! I’m interested in using the counts matrix for additional analysis as well. However, I encountered an issue when attempting to replicate your pipeline: the file humann_read_and_species_count_table.tsv was not found. Could you please guide me on where to locate this file?

osvatic · January 3, 2024, 7:54am

If you are using biobakery, it should be here: ${output_option}/humann/counts/humann_read_and_species_count_table.tsv

I have more recently switched to just using humann3 directly, not through biobakery, so I use the overall size of the library as the “total reads” value instead.
