Alternative splicing can expand the diversity of proteomes. Homologous mutually exclusive exons (MXEs) originate from the same ancestral exon and result in polypeptides with similar structural properties but altered sequence. Why would some genes switch between homologous exons, and what is the biological impact of doing so? Here, we analyse the extent of sequence, structural and functional variability in MXEs and report the first large-scale, structure-based analysis of the biological impact of MXE events from different genomes. MXE-specific residues tend to map to single domains, are highly enriched in surface-exposed residues and cluster at or near protein functional sites. Thus, MXE events are likely to maintain the protein fold, but alter specificity and selectivity of protein function. This comprehensive resource of MXE events and their annotations is available at: http://gene3d.biochem.ucl.ac.uk/mxemod/. These findings highlight how small, but significant changes at critical positions on a protein surface are exploited in evolution to alter function.
Coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 is an ongoing pandemic that causes a significant health and socioeconomic burden. Variants of concern (VOCs) have emerged affecting transmissibility, disease severity and re-infection risk. Studies suggest that the N-terminal domain (NTD) of the spike protein may have a role in facilitating virus entry via sialic-acid receptor binding. Furthermore, most VOCs include novel NTD variants. Despite global sequence and structure similarity, most sialic-acid binding pockets in NTD vary across coronaviruses. Our work suggests ongoing evolutionary tuning of the sugar-binding pockets and recent analyses have shown that NTD insertions in VOCs tend to lie close to loops. We extended the structural characterisation of these sugar-binding pockets and explored whether variants could enhance sialic-acid binding. We found that recent NTD insertions in VOCs (i.e., Gamma, Delta and Omicron variants) and emerging variants of interest (VOIs) (i.e., Iota, Lambda and Theta variants) frequently lie close to sugar-binding pockets. For some variants, including the recent Omicron VOC, we find increases in predicted sialic-acid binding energy, compared to the original SARS-CoV-2, which may contribute to increased transmission. These binding observations are supported by molecular dynamics (MD) simulations. We examined the similarity of NTD across Betacoronaviruses to determine whether the sugar-binding pockets are sufficiently similar to be exploited in drug design. Whilst most pockets are too structurally variable, we detected a previously unknown highly structurally conserved pocket which can be investigated in pursuit of a generic pan-Betacoronavirus drug. Our structure-based analyses help rationalise the effects of VOCs and provide hypotheses for experiments. Our findings suggest a strong need for experimental monitoring of changes in NTD of VOCs.
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of globular domain annotations for millions of available protein sequences. Gene3D has previously featured in the Database issue of NAR and here we report a significant update to the Gene3D database. The current release, Gene3D v16, has significantly expanded its domain coverage over the previous version and now contains over 95 million domain assignments. We also report a new method for dealing with complex domain architectures that exist in Gene3D, arising from discontinuous domains. Amongst other updates, we have added visualization tools for exploring domain annotations in the context of other sequence features and in gene families. We also provide web-pages to visualize other domain families that co-occur with a given query domain family.
To implement the fifth generation (5G) communication system for a large number of users, the governments of many countries nominated the low 5G frequency band between 3.3 and 4.3 GHz. This paper proposes a wideband radio-frequency power amplifier (RFPA) by designing the input matching network (MN) and output MN of the device using the simplified real frequency technique (SRFT) and the harmonic tuning network. Load-pull and source-pull analyses are applied at 100 MHz intervals over the bandwidth to obtain the optimum impedances at the output and input of the 10 W Gallium Nitride (GaN) Cree CGH40010F device. To verify the design, the RFPA is simulated, and the performance is measured between 3.3 and 4.3 GHz. According to experimental findings, the measured drain efficiency (DE) throughout the whole bandwidth ranged from 57.5 to 67.5% at the output power of 40 dBm. Moreover, at the 1 dB compression point, where the output power lies between 39.2 and 42.2 dBm, the DE reaches a high value of 81.2% with an output power of 42.2 dBm at a frequency of 3.3 GHz. The RFPA can obtain a maximum gain of 12.4 dB at 3.5 GHz. The linearity of the RFPA is measured with a two-tone signal, and the resulting value is less than -22 dBc across the whole band.
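The power and efficiency figures quoted above can be related with a few lines of arithmetic. The following is a minimal sketch, assuming the standard definitions P(W) = 10^((P(dBm) - 30)/10) and DE = P_out / P_DC; the function names are hypothetical, and the DC power values it prints are implied by those definitions rather than reported in the abstract.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


def implied_dc_power(p_out_dbm: float, drain_efficiency: float) -> float:
    """DC supply power implied by DE = P_out / P_DC (DE given as a fraction)."""
    if not 0.0 < drain_efficiency <= 1.0:
        raise ValueError("drain_efficiency must be a fraction in (0, 1]")
    return dbm_to_watts(p_out_dbm) / drain_efficiency


if __name__ == "__main__":
    # (output power in dBm, drain efficiency) pairs taken from the abstract.
    cases = [(40.0, 0.575), (40.0, 0.675), (42.2, 0.812)]
    for p_dbm, de in cases:
        p_out = dbm_to_watts(p_dbm)
        p_dc = implied_dc_power(p_dbm, de)
        print(f"{p_dbm:5.1f} dBm = {p_out:6.2f} W, DE = {de:.1%}, implied P_DC = {p_dc:6.2f} W")
```

An output power of 40 dBm corresponds to 10 W, which matches the 10 W rating of the device named in the abstract.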
CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65,351 fully classified domain structures (+15%), providing 500,238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web-pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH including SCOP, InterPro, Aquaria and 2DProt.
SARS-CoV-2 has a zoonotic origin and was transmitted to humans via an undetermined intermediate host, leading to infections in humans and other mammals. To enter host cells, the viral spike protein (S-protein) binds to its receptor, ACE2, and is then processed by TMPRSS2. Whilst receptor binding contributes to the viral host range, S-protein:ACE2 complexes from other animals have not been investigated widely. To predict infection risks, we modelled S-protein:ACE2 complexes from 215 vertebrate species, calculated changes in the energy of the complex caused by mutations in each species, relative to human ACE2, and correlated these changes with COVID-19 infection data. We also analysed structural interactions to better understand the key residues contributing to affinity. We predict that mutations are more detrimental in ACE2 than in TMPRSS2. Finally, we demonstrate phylogenetically that human SARS-CoV-2 strains have been isolated in animals. Our results suggest that SARS-CoV-2 can infect a broad range of mammals, but few fish, birds or reptiles. Susceptible animals could serve as reservoirs of the virus, necessitating careful ongoing animal management and surveillance.
We present first evidence that the cosine of the CP-violating weak phase 2β is positive, and hence exclude trigonometric multifold solutions of the Cabibbo-Kobayashi-Maskawa (CKM) Unitarity Triangle using a time-dependent Dalitz plot analysis of B^{0}→D^{(*)}h^{0} with D→K_{S}^{0}π^{+}π^{-} decays, where h^{0}∈{π^{0},η,ω} denotes a light unflavored and neutral hadron. The measurement is performed combining the final data sets of the BABAR and Belle experiments collected at the ϒ(4S) resonance at the asymmetric-energy B factories PEP-II at SLAC and KEKB at KEK, respectively. The data samples contain (471±3)×10^{6} B\bar{B} pairs recorded by the BABAR detector and (772±11)×10^{6} B\bar{B} pairs recorded by the Belle detector. The results of the measurement are sin2β=0.80±0.14(stat)±0.06(syst)±0.03(model) and cos2β=0.91±0.22(stat)±0.09(syst)±0.07(model). The result for the direct measurement of the angle β of the CKM Unitarity Triangle is β=[22.5±4.4(stat)±1.2(syst)±0.6(model)]°. The measurement assumes no direct CP violation in B^{0}→D^{(*)}h^{0} decays. The quoted model uncertainties are due to the composition of the D^{0}→K_{S}^{0}π^{+}π^{-} decay amplitude model, which is newly established by performing a Dalitz plot amplitude analysis using a high-statistics e^{+}e^{-}→c\bar{c} data sample. CP violation is observed in B^{0}→D^{(*)}h^{0} decays at the level of 5.1 standard deviations. The significance for cos2β>0 is 3.7 standard deviations. The trigonometric multifold solution π/2-β=(68.1±0.7)° is excluded at the level of 7.3 standard deviations. The measurement resolves an ambiguity in the determination of the apex of the CKM Unitarity Triangle.
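The ambiguity mentioned above can be illustrated numerically: a value of sin2β alone is compatible with both β and π/2-β, and the sign of cos2β selects one of them. The following is a minimal sketch of that trigonometric argument only, with hypothetical function names; it uses the central value sin2β=0.80 as input, ignores uncertainties, and is not the fit used in the analysis, so its output is not expected to reproduce the directly measured β.

```python
import math


def beta_candidates_deg(sin_2beta: float) -> tuple[float, float]:
    """Return the two values of beta in [0, 90] degrees compatible with sin(2*beta)."""
    if not 0.0 <= sin_2beta <= 1.0:
        raise ValueError("this sketch only handles sin(2*beta) in [0, 1]")
    two_beta = math.asin(sin_2beta)           # principal solution for 2*beta
    first = math.degrees(two_beta / 2.0)      # beta
    second = math.degrees((math.pi - two_beta) / 2.0)  # pi/2 - beta
    return first, second


def resolve_with_cos_sign(sin_2beta: float, cos_2beta_positive: bool) -> float:
    """Pick the candidate whose cos(2*beta) has the requested sign."""
    for beta in beta_candidates_deg(sin_2beta):
        if (math.cos(2.0 * math.radians(beta)) > 0.0) == cos_2beta_positive:
            return beta
    raise ValueError("no candidate matches the requested sign of cos(2*beta)")


if __name__ == "__main__":
    low, high = beta_candidates_deg(0.80)
    print(f"candidates: {low:.2f} deg and {high:.2f} deg")
    print(f"selected for cos(2*beta) > 0: {resolve_with_cos_sign(0.80, True):.2f} deg")
```

The two candidates always sum to 90°, which is why a positive cos2β excludes the π/2-β solution.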