Difference between metaphlan merged taxonomic abundance and humann3 joint taxonomic profile

Hi Biobakery team:

When we compute taxonomic abundances for our data, we can run metaphlan on each sample and then use **merge_metaphlan_tables** to combine the per-sample profiles into a final taxa-by-sample table.

There is another way, built into humann3: if we choose not to pass `--remove-temp-output`, we can run **merge_metaphlan_tables** on the bugs_list.tsv files afterwards and also get a taxa-by-sample table.

But in the tutorial, a joint taxonomic profile is built from the metaphlan results, and humann3 is then re-run with it:

joint taxonomic profile

A joint taxonomic profile can be created from all of the samples in your set. To create this file and use it for your HUMAnN 3.0 runs, please use the steps that follow.

1. Create taxonomic profiles for each of the samples in your set with [MetaPhlAn](https://bitbucket.org/biobakery/metaphlan)
2. Join all of the taxonomic profiles, located in directory $DIR, into a table of taxonomic profiles for all samples (joined_taxonomic_profile.tsv)
  * `$ humann_join_tables --input $DIR --output joined_taxonomic_profile.tsv`
3. Reduce this file into a taxonomic profile that represents the maximum abundances from all of the samples in your set
  * `$ humann_reduce_table --input joined_taxonomic_profile.tsv --output max_taxonomic_profile.tsv --function max --sort-by level`
4. Run HUMAnN 3.0 on all of the samples in your set, providing the max taxonomic profile
  * for $SAMPLE.fastq in samples
    * `$ humann --input $SAMPLE.fastq --output $OUTPUT_DIR --taxonomic-profile max_taxonomic_profile.tsv`

So why do we need the max taxonomic profile to get the taxonomic table in humann3, instead of just merging the individual results?

Thanks

YZ

The procedure described here allows you to define a “superset” of microbial species found across all of your samples (each characterized by its local maximum abundance). This is useful in case you want to map all of your samples against the same (super)set of pangenomes instead of mapping each sample X to the set of species pangenomes detected specifically in X.

This is not a common approach, but it came up often enough that we documented it in the manual and provide the support script to help enable it.
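To make the "superset with maximum abundances" idea concrete, here is a rough conceptual sketch of a per-taxon maximum over a joined table. It is not how `humann_reduce_table` is implemented; the function name `max_profile` is hypothetical, and it assumes a simple tab-separated table with one header row, the taxon name in column 1, and one abundance column per sample.

```shell
# Conceptual sketch only (hypothetical helper, not part of HUMAnN).
# Input: tab-separated table, header row, taxon in column 1,
# one abundance column per sample.
# Output: one row per taxon with its maximum abundance across samples.
max_profile() {
  awk -F'\t' -v OFS='\t' '
    NR == 1 { print $1, "max"; next }    # collapse sample columns into one "max" column
    {
      m = $2 + 0                         # start from the first sample
      for (i = 3; i <= NF; i++)          # compare against every other sample
        if ($i + 0 > m) m = $i + 0
      print $1, m
    }' "$1"
}
```

Any species detected in at least one sample ends up with a non-zero value in the result, which is why the reduced profile behaves as a superset of the per-sample species lists.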

Many thanks
Franzosa