Hi,
When I run panphlan_profiling.py, not all of my samples show up in the output file. It prints warnings for 6 of my 16 samples saying they are below the minimum coverage threshold, but only 3 samples plus a number of REF_GCA reference genomes appear in the output. What has happened to the other 7 samples?
(biobakery3) scmb-0aegg7d:For_bbreve s4531769$ panphlan_profiling.py -i /Volumes/SCI/SCMB/Research/Dekker_M/RHDStudents/Sophie_Leech/Honours_project/CMR_New/QC_reads/For_bbreve/map_results --o_matrix /Volumes/SCI/SCMB/Research/Dekker_M/RHDStudents/Sophie_Leech/Honours_project/CMR_New/QC_reads/For_bbreve/result_profile_erectale.tsv -p /Volumes/SCI/SCMB/Research/Dekker_M/RHDStudents/Sophie_Leech/Honours_project/CMR_New/QC_reads/For_bbreve/Bifidobacterium_breve/Bifidobacterium_breve/Bifidobacterium_breve_pangenome.tsv --add_ref
STEP 1. Processing genes informations from pangenome file…
Number of reference genomes: 72
Average number of gene-families per genome: 1775
Total number of pangenome gene-families 7436
STEP 1b. Get genes present in reference genomes…
STEP 2. Create coverage matrix
STEP 3: Strain presence/absence filter based on coverage plateau curve…
041a_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
081a_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
143a_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
172a_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
181b_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
BBC3351_eractale.tsv: no strain detected, sample below MIN COVERAGE threshold
STEP 4: Define strain-specific gene-families presence/absence (1,-1,-2,-3 matrix, option --o_idx)
STEP 5: Get presence/absence of gene-families (1,-1 matrix, option --o_matrix)
STEP 5b: Add reference genomes in matrix of presence/absence
STEP 6: Writing presence/absence matrix…
[TERMINATING…] /Users/s4531769/miniconda3/envs/biobakery3/bin/panphlan_profiling.py, 0.17 minutes.
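In case it helps to pin down exactly which samples are missing, here is a rough Python sketch of how the mapping files could be compared against the columns of the output matrix. It is only a sketch under assumptions I have not confirmed against the PanPhlAn documentation: that the matrix is tab-separated with one column per sample in the header row, that reference genome columns start with REF_, and that sample columns are named after the mapping files (with or without the extension). The function names and the sampleX/sampleY names in the demo are made up for illustration.

```python
import csv
import os


def strip_ext(name):
    """Drop a trailing .tsv / .csv (optionally .bz2) extension from a name."""
    base = os.path.basename(name)
    for ext in (".tsv.bz2", ".csv.bz2", ".tsv", ".csv"):
        if base.endswith(ext):
            return base[: -len(ext)]
    return base


def read_matrix_columns(matrix_path):
    """Return the header columns of a tab-separated matrix, minus the first
    column (assumed to hold the gene-family IDs)."""
    with open(matrix_path, newline="") as handle:
        header = next(csv.reader(handle, delimiter="\t"))
    return header[1:]


def missing_samples(mapped_files, matrix_columns):
    """List mapped samples that have no matching column in the matrix.

    Columns starting with REF_ are treated as reference genomes (assumption)
    and ignored. Names are compared after stripping file extensions.
    """
    present = {strip_ext(c) for c in matrix_columns if not c.startswith("REF_")}
    samples = (strip_ext(f) for f in mapped_files)
    return sorted(s for s in samples if s not in present)


if __name__ == "__main__":
    # Toy data only: sampleX, sampleY and the REF_ column are hypothetical.
    mapped = ["041a_eractale.tsv", "sampleX_eractale.tsv", "sampleY_eractale.tsv"]
    columns = ["sampleX_eractale", "REF_GCA_hypothetical"]
    print(missing_samples(mapped, columns))
```

On the real data, `mapped` would come from `os.listdir()` on the map_results folder and `columns` from `read_matrix_columns()` on the file passed to --o_matrix. Comparing that list with the six samples named in the STEP 3 warnings would show which samples were dropped without any warning.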
Thank you!