RESULTS: We show that SYMRK is essential for nodulation and endomycorrhization in Parasponia andersonii. We further show that the 5'-intron donor splice site of SYMRK intron 12 is variable and that, in most dicotyledon species, it carries the much less common 'GC' dinucleotide instead of the canonical 'GT'. Strikingly, in T. orientalis, this motif is converted into a rare non-canonical 5'-intron donor splice site, 'GA'. This SYMRK allele is nevertheless fully functional and spreads in the T. orientalis population of Malaysian Borneo. A further survey of the occurrence of non-canonical GA-AG splice sites confirmed that they are extremely rare.
CONCLUSION: SYMRK function is highly conserved in legumes, actinorhizal plants, and Parasponia. The gene carries an uncommon 5'-intron GC donor splice site in intron 12, which is converted into GA in T. orientalis accessions from Malaysian Borneo. The discovery of this functional GA-AG splice site in SYMRK highlights a gap in our understanding of splice donor sites.
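The distinction between GT, GC, and GA donor sites can be illustrated with a minimal sketch that labels an intron by its first and last two nucleotides. The function name `splice_site_class` and the toy sequences are hypothetical and chosen only for illustration; they are not SYMRK data and this is not the method used in the study.

```python
from collections import Counter


def splice_site_class(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'.

    The intron is assumed to be given 5' to 3', from its first to its
    last nucleotide. RNA input (U) is normalised to DNA (T).
    """
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        raise ValueError("intron too short to contain both splice sites")
    return f"{seq[:2]}-{seq[-2:]}"


# Hypothetical toy introns, not real sequence data.
introns = {
    "canonical": "GTAAGTTTTCTAACAG",
    "gc_donor": "GCAAGTTTTCTAACAG",
    "ga_donor": "GAAAGTTTTCTAACAG",
}

# Tally how often each donor-acceptor class occurs in the toy set.
counts = Counter(splice_site_class(s) for s in introns.values())
```

Applied to every annotated intron of a genome, a tally of this kind is one simple way to see how rare a class such as GA-AG is relative to GT-AG and GC-AG.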
RESULTS: Here, we present draft genome information for five agriculturally, biologically, medicinally, and economically important underutilized plants native to Africa: Vigna subterranea, Lablab purpureus, Faidherbia albida, Sclerocarya birrea, and Moringa oleifera. Assembled genomes range in size from 217 to 654 Mb. In V. subterranea, L. purpureus, F. albida, S. birrea, and M. oleifera, we have predicted 31,707, 20,946, 28,979, 18,937, and 18,451 protein-coding genes, respectively. By further analyzing the expansion and contraction of selected gene families, we have characterized root nodule symbiosis genes, transcription factors, and starch biosynthesis-related genes in these genomes.
CONCLUSIONS: These genome data will be useful for identifying and characterizing agronomically important genes and understanding their modes of action, enabling genomics-based evolutionary studies and breeding strategies for faster, more focused, and more predictable crop improvement programs.