De novo variants (DNVs) cause many genetic diseases. When DNVs are examined in the whole coding regions of genes in next-generation sequencing analyses, pathogenic DNVs often cluster in a specific region. One such region is the last exon and the last 50 bp of the penultimate exon, where truncating DNVs cause escape from nonsense-mediated mRNA decay [NMD(-) region]. Such variants can have dominant-negative or gain-of-function effects. Here, we first developed a resource of rates of truncating DNVs in NMD(-) regions under the null model of DNVs. Utilizing this resource, we performed enrichment analysis of truncating DNVs in NMD(-) regions in 346 developmental and epileptic encephalopathy (DEE) trios. We observed statistically significant enrichment of truncating DNVs in semaphorin 6B (SEMA6B) (p value: 2.8 × 10⁻⁸; exome-wide threshold: 2.5 × 10⁻⁶). The initial analysis of the 346 individuals and additional screening of 1,406 and 4,293 independent individuals affected by DEE and developmental disorders collectively identified four truncating DNVs in the SEMA6B NMD(-) region in five individuals who came from unrelated families (p value: 1.9 × 10⁻¹³) and consistently showed progressive myoclonic epilepsy. RNA analysis of lymphoblastoid cells established from an affected individual showed that the mutant allele escaped NMD, indicating stable production of the truncated protein. Importantly, heterozygous truncating variants in the NMD(+) region of SEMA6B are observed in general populations, and SEMA6B is most likely loss-of-function tolerant. Zebrafish expressing truncating variants in the NMD(-) region of SEMA6B orthologs displayed defective development of brain neurons and enhanced pentylenetetrazole-induced seizure behavior. In summary, we show that truncating DNVs in the final exon of SEMA6B cause progressive myoclonic epilepsy.
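The two computational ideas in this abstract, the NMD(-) region definition and the enrichment test against a null DNV rate, can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names, the half-open transcript-space exon coordinates, the one-sided Poisson test, the expected count of 2 × trios × per-chromosome rate, and the rate and observed count used below are all assumptions made for illustration; only the 50 bp rule, the 346 trios, and the 2.5 × 10⁻⁶ threshold come from the text.

```python
import math


def nmd_escape_region(exons, tail_bp=50):
    """Return the NMD(-) intervals of a transcript.

    `exons` is a list of half-open (start, end) intervals given in
    transcript order (5' to 3'), in coordinates that increase along the
    transcript. This coordinate convention is an assumption of the sketch.
    The region is the last exon plus the last `tail_bp` bases of the
    penultimate exon, as defined in the abstract.
    """
    if not exons:
        return []
    last = exons[-1]
    if len(exons) == 1:
        return [last]
    pen_start, pen_end = exons[-2]
    # Clip to the exon start when the penultimate exon is shorter than tail_bp.
    tail_start = max(pen_start, pen_end - tail_bp)
    return [(tail_start, pen_end), last]


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam).

    The tail is summed directly rather than computed as 1 - CDF, so that
    very small p-values keep their precision.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= 0:
        return 1.0
    # First tail term, computed in log space to avoid overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > 0.0 and (i == k or term > total * 1e-17):
        total += term
        i += 1
        term *= lam / i
    return min(1.0, total)


def dnv_enrichment_p(observed, n_trios, rate_per_chromosome):
    """One-sided Poisson p-value for `observed` DNVs in `n_trios` trios.

    Assumes the expected count is 2 * n_trios * rate_per_chromosome,
    that is, one mutational opportunity per transmitted chromosome.
    """
    expected = 2 * n_trios * rate_per_chromosome
    return poisson_upper_tail(observed, expected)


# Hypothetical inputs: the rate and the observed count are NOT from the paper.
HYPOTHETICAL_RATE = 1e-6
EXOME_WIDE_THRESHOLD = 2.5e-6  # threshold quoted in the abstract

p_value = dnv_enrichment_p(observed=2, n_trios=346,
                           rate_per_chromosome=HYPOTHETICAL_RATE)
is_significant = p_value < EXOME_WIDE_THRESHOLD
```

Summing the upper tail directly matters here because p-values on the order of 10⁻¹³ sit close to floating-point precision when obtained by subtracting a cumulative probability from 1.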
Autism spectrum disorder (ASD) is caused by combined genetic and environmental factors. Genetic heritability in ASD is estimated at 60-90%, and genetic investigations have revealed many monogenic factors. We analyzed 405 patients with ASD using family-based exome sequencing (ES) to detect disease-causing single-nucleotide variants (SNVs), small insertions and deletions (indels), and copy number variations (CNVs) for molecular diagnoses. All candidate variants were validated by Sanger sequencing or quantitative polymerase chain reaction and were evaluated using the American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines for molecular diagnosis. We identified 55 disease-causing SNVs/indels in 53 affected individuals and 13 disease-causing CNVs in 13 affected individuals, achieving a molecular diagnosis in 66 of 405 affected individuals (16.3%). Among the 55 disease-causing SNVs/indels, 51 occurred de novo, 2 were compound heterozygous (in one patient), and 2 were X-linked hemizygous variants inherited from unaffected mothers. The molecular diagnosis rate in females was significantly higher than that in males. We analyzed affected sibling cases of 24 quads and 2 quintets, but only one pair of siblings shared an identical pathogenic variant. Notably, there was a higher molecular diagnostic rate in simplex cases than in multiplex families. Our simple simulation indicated that the diagnostic yield is increasing by 0.63% (range 0-2.5%) per year. Thus, periodic reevaluation of ES data should be strongly encouraged in undiagnosed ASD patients.