Hello @f.asnicar !!
I have three questions on three different topics:
Topic 1: Low and high diversity parameter:
The tutorial mentions that a high diversity level is well suited to tree-of-life construction. In my case I have 2736 bins that I want to visualize in a tree of life, so should I set the diversity level to high or low?
Topic 2: Difference between kSGBs and uSGBs:
I ran the phylophlan_metagenomic analysis as you suggested, and it gave me taxonomic labels for all the bins. Some of them were assigned to kSGBs and some to uSGBs (unknown). I would like to know the difference between kSGBs and uSGBs. Can I say that the uSGBs are putative novel species?
Topic 3: How are the markers selected for the phylogenetic tree construction?
I am trying to read the PhyloPhlAn paper, but I cannot clearly understand how the markers are selected for tree construction. Are these markers based only on 16S rRNA genes, or on all housekeeping genes?
I have also tried assigning taxonomy to my 2736 bins using the GTDB-Tk database, which uses 62k genomes as reference genomes and assigns taxonomy on the basis of ANI. Does PhyloPhlAn also use reference genomes to select the marker genes?
My understanding is that PhyloPhlAn uses species-specific marker genes to assign a taxonomic label to each bin, and core marker genes to connect the bins to other genomes. Please correct me if I am wrong.
How many known species can be classified using the phylophlan_metagenomic analysis?
Thank you very much for your earlier responses to my questions, and I apologize for asking such silly questions every time.
Thanks