strainphlan -s *json.bz2 --print_clades_only -o ./strainphlan_result --nproc 10 --marker_in_n_samples_perc 20 > clades.txt
Fri May 10 16:59:17 2024: Start StrainPhlAn 4.1.0 execution
Fri May 10 16:59:17 2024: Loading MetaPhlAn mpa_vJun23_CHOCOPhlAnSGB_202307 database…
Fri May 10 16:59:45 2024: Done.
Fri May 10 16:59:57 2024: Processing samples…
Fri May 10 17:00:01 2024: Constructing the big marker matrix
Fri May 10 17:00:02 2024: Checking 742 species
Fri May 10 17:00:03 2024: Done.
Fri May 10 17:00:04 2024: Detected clades:
Fri May 10 17:00:04 2024: t__SGB17237: in 173 samples.
Fri May 10 17:00:04 2024: t__SGB17248: in 95 samples.
Fri May 10 17:00:04 2024: t__SGB8007_group: in 8 samples.
Fri May 10 17:00:04 2024: t__SGB14483: in 5 samples.
Fri May 10 17:00:04 2024: Done.
Fri May 10 17:00:04 2024: Finish StrainPhlAn 4.1.0 execution (47.41 seconds): Results are stored at “./02_strainphlan_result”
I only provided 30 MetaPhlAn results as input, yet the log reports clades in up to 173 samples, so I wonder what “sample” refers to in the output above.
I would also expect more than 4 SGBs to be shared across my samples.
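To compare the per-clade counts more easily, here is a minimal Python sketch that pulls the clade names and sample counts out of the "Detected clades" lines. It assumes the log lines keep exactly the format pasted above. The function names and the `min_samples` threshold are hypothetical helpers of mine, not part of StrainPhlAn.

```python
import re

# Matches log entries such as "...: t__SGB17237: in 173 samples."
# Assumption: the line format is exactly the one shown in the log above.
CLADE_LINE = re.compile(r"(t__[^\s:]+): in (\d+) samples\.")


def parse_detected_clades(log_text):
    """Return {clade_name: sample_count} for each detected-clade line."""
    counts = {}
    for line in log_text.splitlines():
        match = CLADE_LINE.search(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def clades_with_min_samples(counts, min_samples):
    """Return the sorted clade names reported in at least min_samples samples."""
    return sorted(name for name, n in counts.items() if n >= min_samples)


# Log lines copied from the run above.
example_log = """\
Fri May 10 17:00:04 2024: Detected clades:
Fri May 10 17:00:04 2024: t__SGB17237: in 173 samples.
Fri May 10 17:00:04 2024: t__SGB17248: in 95 samples.
Fri May 10 17:00:04 2024: t__SGB8007_group: in 8 samples.
Fri May 10 17:00:04 2024: t__SGB14483: in 5 samples.
Fri May 10 17:00:04 2024: Done.
"""

clade_counts = parse_detected_clades(example_log)
for clade, n_samples in clade_counts.items():
    print(clade, n_samples)
```

Calling `clades_with_min_samples(clade_counts, 8)` would then list only the clades reported in at least 8 samples, which makes it easier to see which ones exceed the number of input files.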
I then moved on to the next step and ran the following command:
strainphlan -s 02_strainphlan_result/*.json.bz2 -m CladeMarkers/t__SGB14483.fna -o Output/t__SGB14483 -c t__SGB14483 -d ./mpa_vJun23_CHOCOPhlAnSGB_202307.pkl --nproc 10 --mutation_rates --phylophlan_mode fast
It failed with the following error:
[Error] The main inputs samples + references are less than 4
Fri May 10 17:52:21 2024: Stop StrainPhlAn execution.
This result does not work for me. I am only trying to find strains shared between my matched mother and baby samples.