Announcing MetaPhlAn 3.1

MetaPhlAn 3.1 / HUMAnN 3.1 Release Notes

MetaPhlAn 3.1 and HUMAnN 3.1 represent a moderate update to the bioBakery 3 software and databases. This update is based on improvements in our ability to align NCBI-sourced microbial genomes and their constituent genes to UniProt resources, along with the removal of additional low-quality species (see PMID:33944776 for the definition of “low-quality species”).

What has changed in v3.1:

  • MetaPhlAn 3.1: 2,680 species were added to the marker database and 433 low-quality species were removed (new total = 15,766: a 17% increase).

  • MetaPhlAn 3.1: Marker genes for a minority subset of existing bioBakery 3 species were also revised.

  • HUMAnN 3.1: 2,132 species were added to the pangenome database and 645 low-quality species were removed (new total = 12,773: a 13% increase).

  • HUMAnN 3.1: Pangenomes of most existing bioBakery 3 species were updated with revised or expanded gene content.

  • Minor software updates.
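
As a quick sanity check, the percentage increases quoted above can be reproduced from the added and removed counts. This is only an arithmetic sketch: the helper name `pct_increase` is hypothetical and not part of any bioBakery tool, and the "previous total" it prints is implied by the numbers above rather than stated in these notes.

```shell
# Hypothetical helper: given species added, species removed, and the new total,
# print the implied previous total and the rounded percent increase.
pct_increase() {
  awk -v added="$1" -v removed="$2" -v total="$3" 'BEGIN {
    net  = added - removed    # net species gained in this update
    prev = total - net        # implied total before the update
    printf "%d %.0f\n", prev, 100 * net / prev
  }'
}

pct_increase 2680 433 15766   # MetaPhlAn 3.1 marker database
pct_increase 2132 645 12773   # HUMAnN 3.1 pangenome database
```

The two calls print `13519 17` and `11286 13`, which agrees with the 17% and 13% increases listed above.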

What has NOT changed in v3.1:

  • UniProt-derived resources included with bioBakery 3 (e.g. the DIAMOND-formatted UniRef90/50 databases) have not changed and do not need to be updated.

  • MetaPhlAn 3.1 and HUMAnN 3.1 remain focused on microbial isolate genomes sourced from NCBI. See the section “What’s next for bioBakery?” at the end of this document for updates on profiling species defined or expanded from metagenome-assembled genomes (MAGs).

How to perform a fresh install of MetaPhlAn 3.1 and HUMAnN 3.1:

How to upgrade from earlier versions of MetaPhlAn 3 and HUMAnN 3:

  • Update HUMAnN with pip:

    • $ pip install humann --upgrade
  • Download the new MetaPhlAn 3.1 marker database:

    • $ metaphlan --force_download
  • Download the new HUMAnN 3.1 pangenome database (full_chocophlan.v201901_v31.tar.gz) to a NEW folder:

    • $ humann_databases --download chocophlan full /path/to/new_folder

    • NOTE: Mixing v3.1 pangenomes with earlier pangenomes in the same folder will raise an error.

  • You DO NOT need to re-download the HUMAnN 3.1 DIAMOND-formatted UniRef90/50 databases or the accessory mapping files.

    • If needed, you can point your HUMAnN 3.1 installation to the locations of these files using the humann_config script.
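
The upgrade steps above could be strung together roughly as follows. This is a sketch under assumptions, not an official script: all paths are placeholders, the `run` and `upgrade_biobakery` helpers and the `DRY_RUN` switch are hypothetical, and the `humann_config --update database_folders ...` calls assume the standard HUMAnN configuration keys (`protein`, `utility_mapping`). By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Placeholder locations (assumptions); override via environment variables.
NEW_CHOCO="${NEW_CHOCO:-/path/to/new_folder}"                    # NEW folder for v3.1 pangenomes
UNIREF_DIR="${UNIREF_DIR:-/path/to/existing/uniref}"             # existing DIAMOND UniRef90/50 files
MAPPING_DIR="${MAPPING_DIR:-/path/to/existing/utility_mapping}"  # existing accessory mapping files

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

upgrade_biobakery() {
  # Refuse to reuse a non-empty folder, so v3.1 pangenomes are never
  # mixed with earlier pangenomes.
  if [ -d "$NEW_CHOCO" ] && [ -n "$(ls -A "$NEW_CHOCO")" ]; then
    echo "error: $NEW_CHOCO is not empty; choose a new folder" >&2
    return 1
  fi
  run pip install humann --upgrade
  run metaphlan --force_download
  run humann_databases --download chocophlan full "$NEW_CHOCO"
  # The UniRef and mapping files are unchanged; just point HUMAnN at them.
  run humann_config --update database_folders protein "$UNIREF_DIR"
  run humann_config --update database_folders utility_mapping "$MAPPING_DIR"
  run humann_config --print   # review the resulting configuration
}

upgrade_biobakery
```

Reviewing the dry-run output before setting `DRY_RUN=0` is a cheap way to confirm that the pangenome download targets a fresh folder and that the existing UniRef and mapping locations are the ones you intend to keep.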

What’s next for bioBakery?

  • The next generation of MetaPhlAn (v4) will support profiling the abundances of species genome bins (SGBs) as derived from a combination of isolate genomes and metagenome-assembled genomes in PMID:30661755.

  • MetaPhlAn v4 will be released soon alongside an update to HUMAnN (v3.5) that will enable users to combine SGB-based taxonomic profiles with the current (v3.1) pangenomes.

  • HUMAnN v4 (enabling functional profiling from SGBs) is also in development, but will be released later.

If there is still an opportunity to make a feature request for MetaPhlAn 4, I wonder if it could output a gene expression table covering the genes of all identified species, for use in downstream differential expression testing. That would be useful when the input is metatranscriptomic data and the question of interest is which bacterial genes are differentially expressed.