Question on quantifying the LGT events in waafle

Hi, I have run all the steps in WAAFLE and am wondering how to quantify the LGT events within one sample. Thanks!

Xinming

I think counting the rows in *.lgt.tsv or *.lgt.tsv.qc_pass (after the junction and quality-control steps) should work. But how do I normalize by the number of assembled genes? (I used the default WAAFLE reference database.)

A related question is how to get the number of assembled genes from A and B, as described in the paper:
"Rates of undirected LGT between a pair of clades (A, B) were defined as the number of A+B events seen across samples normalized to the total number of A and B genes assembled across samples (excluding repeated samples from the same individual and body site)."

You can do this by counting instances of the A and B characters in the gene order column of the lgt and no_lgt files. For example, if you see an AAAAA contig assigned to Bacteroides, that counts as 5 Bacteroides genes.
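In case it helps, here is a minimal Python sketch of that counting. It assumes you have already pulled the gene order string and the clade each letter stands for out of every row of the lgt and no_lgt files. The function name, the record layout, and the second contig (with "Prevotella") are made up for illustration and are not part of WAAFLE. Any characters other than the mapped letters are simply skipped.

```python
from collections import Counter

def count_genes_by_clade(contigs):
    """Tally genes per clade from (gene_order, letter_to_clade) records.

    Hypothetical helper: each record pairs a gene order string such as
    "AAAAA" with a dict saying which clade each letter refers to.
    """
    totals = Counter()
    for gene_order, letter_to_clade in contigs:
        for letter in gene_order:
            clade = letter_to_clade.get(letter)
            if clade is not None:  # skip characters with no clade mapping
                totals[clade] += 1
    return totals

contigs = [
    # The example from the reply: 5 Bacteroides genes.
    ("AAAAA", {"A": "Bacteroides"}),
    # Made-up LGT contig: 3 Bacteroides genes and 1 Prevotella gene.
    ("AABA", {"A": "Bacteroides", "B": "Prevotella"}),
]
gene_counts = count_genes_by_clade(contigs)
print(gene_counts)
```

Keeping the letter-to-clade mapping per contig matters because "A" can refer to a different clade on each row.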

Could you kindly provide more details on how EVENTS_PER_1K_GENES is calculated? Specifically, does the calculation take into account all genes within the MAGs? I appreciate your time and insights!

We sum up all the genes in a sample contributed by a given taxon (or pair of taxa) from across all contigs that WAAFLE classified: both LGT and non-LGT contigs. So for example, if the contigs are AABB, AAAAAA, and BBB, then there are 13 total A|B genes, 8 A genes, and 5 B genes (for normalization purposes).
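The arithmetic in that example can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the rate formula (events divided by total A and B genes, times 1000) is my assumption about what an events-per-1k-genes value means, not something confirmed from the WAAFLE code. Treating AABB as the single LGT event is also an assumption.

```python
def count_genes(gene_orders, letter):
    """Count occurrences of one letter across all gene order strings."""
    return sum(order.count(letter) for order in gene_orders)

def events_per_1k_genes(n_events, n_genes):
    """Assumed normalization: events per 1000 genes (0.0 if no genes)."""
    if n_genes == 0:
        return 0.0
    return 1000.0 * n_events / n_genes

# Contigs from the example above, LGT and non-LGT together.
gene_orders = ["AABB", "AAAAAA", "BBB"]

a_genes = count_genes(gene_orders, "A")   # 2 + 6 + 0
b_genes = count_genes(gene_orders, "B")   # 2 + 0 + 3
total_genes = a_genes + b_genes

# Assumption: only the mixed AABB contig counts as an A+B event.
n_events = 1
rate = events_per_1k_genes(n_events, total_genes)
print(a_genes, b_genes, total_genes, round(rate, 2))
```

The denominator uses genes from every classified contig, not only the LGT ones, which is the point of the reply above.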

Got it! Thank you very much!