Could not add metadata to StrainPhlAn output

Hi,

I am trying to follow the tutorial (StrainPhlAn3 · biobakery/biobakery Wiki · GitHub) to run StrainPhlAn on CHPC, but I could not add metadata to the StrainPhlAn output. I'm using MetaPhlAn 3.0 and StrainPhlAn 3.0 in a container. Any help would be appreciated! Thanks! My script is:
#######
hmn3=/uufs/chpc.utah.edu/common/home/hcibcore/u0762203/microbiomeTestPipe/containers/qinglhci2019_humann3_metagenome.sif

mkdir -p strainPh

for f in fq/*fq
do
bn=$(basename ${f})
singularity exec $hmn3 metaphlan ${f} --input_type fastq --bowtie2db /uufs/chpc.utah.edu/common/home/hcibcore/u0762203/microbiomeTestPipe/db/bowtie2db/mpa_v30_CHOCOPhlAn_201901 -s strainPh/${bn}.sam.bz2 --bowtie2out strainPh/${bn}.bowtie2.bz2 -o strainPh/${bn}_profile.tsv
done

mkdir -p consensus_markers
singularity exec $hmn3 sample2markers.py -i strainPh/*.sam.bz2 -o consensus_markers -n $NCPU

mkdir -p clade_markers
mkdir -p strainPhOutput
singularity exec $hmn3 extract_markers.py -c s__Eubacterium_rectale -d /uufs/chpc.utah.edu/common/home/hcibcore/u0762203/microbiomeTestPipe/db/bowtie2db/mpa_v30_CHOCOPhlAn_201901/mpa_v30_CHOCOPhlAn_201901.pkl -o clade_markers

mkdir -p strainPhOutput
singularity exec $hmn3 strainphlan -s consensus_markers/*.pkl -m …/clade_markers/s__Eubacterium_rectale.fna -r /uufs/chpc.utah.edu/common/home/hcibcore/u0762203/microbiomeTestPipe/refStrain/Bifidobacterium_breve/GCF_001281425.1_ASM128142v1_genomic.fna -o strainPhOutput -c s__Eubacterium_rectale --phylophlan_mode fast -n $NCPU -d /uufs/chpc.utah.edu/common/home/hcibcore/u0762203/microbiomeTestPipe/db/bowtie2db/mpa_v30_CHOCOPhlAn_201901/mpa_v30_CHOCOPhlAn_201901.pkl

cd strainPhOutput/
singularity exec $hmn3 add_metadata_tree.py -t RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre -f …/…/metadata.txt
singularity exec $hmn3 plot_tree_graphlan.py -t RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata -m indv --leaf_marker_size 60 --legend_marker_size 60

############

I got error messages like the following (it looks like some environment variables need to be reset?):

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

[e] "/usr/local/lib/python3.6/dist-packages/phylophlan/phylophlan_configs/" folder does not exists.

But the whole process completed:

Wed Mar 30 15:20:54 2022: Start samples to markers execution
Wed Mar 30 15:20:54 2022: Decompressing samples…
Wed Mar 30 15:21:27 2022: Done.
Wed Mar 30 15:21:27 2022: Converting samples to BAM format…
Wed Mar 30 15:21:58 2022: Done.
Wed Mar 30 15:21:58 2022: Sorting BAM samples…
Wed Mar 30 15:22:36 2022: Done.
Wed Mar 30 15:22:36 2022: Getting consensus markers from samples…
Wed Mar 30 15:22:36 2022: Processing sample: consensus_markers/tmp/15411X4.fq.bam
Wed Mar 30 15:34:17 2022: Done.
Wed Mar 30 15:34:17 2022: Processing sample: consensus_markers/tmp/15411X55.fq.bam
Wed Mar 30 15:45:29 2022: Done.
Wed Mar 30 15:45:29 2022: Processing sample: consensus_markers/tmp/15411X56.fq.bam
Wed Mar 30 15:56:20 2022: Done.
Wed Mar 30 15:56:20 2022: Processing sample: consensus_markers/tmp/15411X3.fq.bam
Wed Mar 30 16:09:03 2022: Done.
Wed Mar 30 16:09:03 2022: Processing sample: consensus_markers/tmp/15411X5.fq.bam
Wed Mar 30 16:23:32 2022: Done.
Wed Mar 30 16:23:32 2022: Processing sample: consensus_markers/tmp/15411X52.fq.bam
Wed Mar 30 16:37:15 2022: Done.
Wed Mar 30 16:37:15 2022: Processing sample: consensus_markers/tmp/15411X54.fq.bam
Wed Mar 30 16:50:44 2022: Done.
Wed Mar 30 16:50:44 2022: Processing sample: consensus_markers/tmp/15411X2.fq.bam
Wed Mar 30 16:55:27 2022: Done.
Wed Mar 30 16:55:27 2022: Processing sample: consensus_markers/tmp/15411X53.fq.bam
Wed Mar 30 17:04:44 2022: Done.
Wed Mar 30 17:04:44 2022: Processing sample: consensus_markers/tmp/15411X50.fq.bam
Wed Mar 30 17:19:14 2022: Done.
Wed Mar 30 17:19:14 2022: Done.
Wed Mar 30 17:19:16 2022: Finish samples to markers execution (7101.99 seconds): Results are stored at "consensus_markers/"
Wed Mar 30 17:19:17 2022: Start StrainPhlAn 3.0 execution
Wed Mar 30 17:19:17 2022: Creating temporary directory…
Wed Mar 30 17:19:17 2022: Done.
Wed Mar 30 17:19:17 2022: Getting markers from main sample files…
Wed Mar 30 17:19:18 2022: Done.
Wed Mar 30 17:19:18 2022: Getting markers from main reference files…
Wed Mar 30 17:19:19 2022: Done.
Wed Mar 30 17:19:19 2022: Removing bad markers / samples…
Wed Mar 30 17:19:19 2022: Done.
Wed Mar 30 17:19:19 2022: Writing samples as markers’ FASTA files…
Wed Mar 30 17:19:19 2022: Done.
Wed Mar 30 17:19:19 2022: Writing filtered clade markers as FASTA file…
Wed Mar 30 17:19:19 2022: Done.
Wed Mar 30 17:19:19 2022: Calculating polymorphic rates…
Wed Mar 30 17:19:19 2022: Done.
Wed Mar 30 17:19:19 2022: Executing PhyloPhlAn 3.0…
Wed Mar 30 17:19:19 2022: Creating PhyloPhlAn 3.0 database…
Wed Mar 30 17:19:20 2022: Done.
Wed Mar 30 17:19:20 2022: Generating PhyloPhlAn 3.0 configuration file…
Wed Mar 30 17:19:20 2022: Done.
Wed Mar 30 17:19:20 2022: Processing samples…
Wed Mar 30 17:19:27 2022: Done.
Wed Mar 30 17:19:27 2022: Done.
Wed Mar 30 17:19:27 2022: Writing information file…
Wed Mar 30 17:19:27 2022: Done.
Wed Mar 30 17:19:27 2022: Removing temporary files…
Wed Mar 30 17:19:29 2022: Done.
Wed Mar 30 17:19:29 2022: Finish StrainPhlAn 3.0 execution (12.08 seconds): Results are stored at "strainPhOutput/"
Input: RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre
number of samples in metadata: 13
Number of samples in tree: 8
Output: RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata
graphlan_annotate.py --annot RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.annot RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.graphlantree RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.xml
graphlan.py RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.xml RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.png --dpi 300 --size 8.000000
Output file: RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre.metadata.png

Hi @qingl0331, thanks for getting in touch.
Have you checked whether the sample names in the RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre file exactly match those in your metadata.txt file? Also, is the first line of metadata.txt "SampleID indv"?
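
One way to check the correspondence is to list the tree leaves that have no row in the metadata file. The sketch below is only a rough helper, not part of StrainPhlAn: the function names `tree_leaves`, `metadata_ids`, and `missing_from_metadata` are made up for illustration. It assumes an unquoted Newick tree, a tab-separated metadata file with a header line and the sample IDs in the first column, and bash (for process substitution).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of StrainPhlAn) for comparing sample names.

# Print the leaf names of a Newick tree, one per line, sorted.
# Assumption: leaf names are unquoted. A leaf name directly follows "(" or ",",
# whereas internal node labels and support values follow ")".
tree_leaves() {
    grep -oE '[(,][^(),:;]+' "$1" | sed 's/^[(,]//' | sort -u
}

# Print the sample IDs of a metadata file, one per line, sorted.
# Assumption: tab-separated, header on line 1, sample IDs in column 1.
metadata_ids() {
    tail -n +2 "$1" | cut -f1 | sort -u
}

# Print the tree leaves that have no matching row in the metadata file.
missing_from_metadata() {
    comm -23 <(tree_leaves "$1") <(metadata_ids "$2")
}
```

For example, from inside strainPhOutput/ you could run `missing_from_metadata RAxML_bestTree.s__Eubacterium_rectale.StrainPhlAn3.tre path/to/metadata.txt`, substituting your own metadata path. Every name it prints is a leaf without a metadata row. Judging from log paths such as consensus_markers/tmp/15411X4.fq.bam, the leaves may still carry the .fq suffix, but that is only a guess; `tree_leaves` on the tree file would confirm it.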

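As for the perl locale warnings, they are probably unrelated to the metadata step: the host's LANG=en_US.UTF-8 is passed into the container, where that locale is likely not installed. A possible workaround, assuming the tools in the container do not need a UTF-8 locale, is to switch to the C locale near the top of the script, before the `singularity exec` calls:

```shell
# Assumption: the "C" locale is sufficient for these tools.
# Singularity passes exported host variables into the container
# unless it is run with --cleanenv.
export LC_ALL=C
export LANG=C
```
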
Best,
Aitor