MyMedR

Global monitoring of antimicrobial resistance based on metagenomics analyses of urban sewage

Hendriksen RS, Munk P, Njage P, van Bunnik B, McNally L, Lukjancenko O, et al.

Nat Commun, 2019 03 08;10(1):1124.
PMID: 30850636 DOI: 10.1038/s41467-019-08853-3

Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use metagenomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AMR genes between Europe/North-America/Oceania and Africa/Asia/South-America. Antimicrobial use data and bacterial taxonomy explain only a minor part of the AMR variation that we observe. We find no evidence for cross-selection between antimicrobial classes, or for effect of air travel between sites. However, AMR gene abundance strongly correlates with socio-economic, health and environmental factors, which we use to predict AMR gene abundances in all countries in the world. Our findings suggest that global AMR gene diversity and abundance vary by region, and that improving sanitation and health could potentially limit the global burden of AMR. We propose metagenomic analysis of sewage as an ethically acceptable and economically feasible approach for continuous global surveillance and prediction of AMR.
Antibodies to Intercellular Adhesion Molecule 1-Binding Plasmodium falciparum Erythrocyte Membrane Protein 1-DBLβ Are Biomarkers of Protective Immunity to Malaria in a Cohort of Young Children from Papua New Guinea

Tessema SK, Utama D, Chesnokov O, Hodder AN, Lin CS, Harrison GLA, et al.

Infect Immun, 2018 08;86(8).
PMID: 29784862 DOI: 10.1128/IAI.00485-17

Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1) mediates parasite sequestration to the cerebral microvasculature via binding of DBLβ domains to intercellular adhesion molecule 1 (ICAM1) and is associated with severe cerebral malaria. In a cohort of 187 young children from Papua New Guinea (PNG), we examined baseline levels of antibody to the ICAM1-binding PfEMP1 domain, DBLβ3PF11_0521, in comparison to four control antigens, including NTS-DBLα and CIDR1 domains from another group A variant and a group B/C variant. Antibody levels for the group A antigens were strongly associated with age and exposure. Antibody responses to DBLβ3PF11_0521 were associated with a 37% reduced risk of high-density clinical malaria in the follow-up period (adjusted incidence risk ratio [aIRR] = 0.63 [95% confidence interval {CI}, 0.45 to 0.88; P = 0.007]) and a 25% reduction in risk of low-density clinical malaria (aIRR = 0.75 [95% CI, 0.55 to 1.01; P = 0.06]), while there was no such association for other variants. Children who experienced severe malaria also had significantly lower levels of antibody to DBLβ3PF11_0521 and the other group A domains than those that experienced nonsevere malaria. Furthermore, a subset of PNG DBLβ sequences had ICAM1-binding motifs, formed a distinct phylogenetic cluster, and were similar to sequences from other areas of endemicity. PfEMP1 variants associated with these DBLβ domains were enriched for DC4 and DC13 head structures implicated in endothelial protein C receptor (EPCR) binding and severe malaria, suggesting conservation of dual binding specificities. These results provide further support for the development of specific classes of PfEMP1 as vaccine candidates and as biomarkers for protective immunity against clinical P. falciparum malaria.
Kouprey (Bos sauveli) genomes unveil polytomic origin of wild Asian Bos

Sinding MS, Ciucani MM, Ramos-Madrigal J, Carmagnini A, Rasmussen JA, Feng S, et al.

iScience, 2021 Nov 19;24(11):103226.
PMID: 34712923 DOI: 10.1016/j.isci.2021.103226

The evolution of the genera Bos and Bison, and the nature of gene flow between wild and domestic species, is poorly understood, with genomic data of wild species being limited. We generated two genomes from the likely extinct kouprey (Bos sauveli) and analyzed them alongside other Bos and Bison genomes. We found that B. sauveli possessed genomic signatures characteristic of an independent species closely related to Bos javanicus and Bos gaurus. We found evidence for extensive incomplete lineage sorting across the three species, consistent with a polytomic diversification of the major ancestry in the group, potentially followed by secondary gene flow. Finally, we detected significant gene flow from an unsampled Asian Bos-like source into East Asian zebu cattle, demonstrating both that the full genomic diversity and evolutionary history of the Bos complex has yet to be elucidated and that museum specimens and ancient DNA are valuable resources to do so.
"Out of the Can": A Draft Genome Assembly, Liver Transcriptome, and Nutrigenomics of the European Sardine, Sardina pilchardus

Machado AM, Tørresen OK, Kabeya N, Couto A, Petersen B, Felício M, et al.

Genes (Basel), 2018 Oct 09;9(10).
PMID: 30304855 DOI: 10.3390/genes9100485

Clupeiformes, such as sardines and herrings, represent an important share of worldwide fisheries. Among those, the European sardine (Sardina pilchardus, Walbaum 1792) exhibits significant commercial relevance. While the last decade showed a steady and sharp decline in capture levels, recent advances in culture husbandry represent promising research avenues. Yet, the complete absence of genomic resources from sardine imposes a severe bottleneck to understand its physiological and ecological requirements. We generated 69 Gbp of paired-end reads using Illumina HiSeq X Ten and assembled a draft genome assembly with an N50 scaffold length of 25,579 bp and BUSCO completeness of 82.1% (Actinopterygii). The estimated size of the genome ranges between 655 and 850 Mb. Additionally, we generated a relatively high-level liver transcriptome. To deliver a proof of principle of the value of this dataset, we established the presence and function of enzymes (Elovl2, Elovl5, and Fads2) that have pivotal roles in the biosynthesis of long chain polyunsaturated fatty acids, essential nutrients particularly abundant in oily fish such as sardines. Our study provides the first omics dataset from a valuable economic marine teleost species, the European sardine, representing an essential resource for their effective conservation, management, and sustainable exploitation.
Evidence for the Higgs Boson Decay to a Z Boson and a Photon at the LHC

Aad G, Abbott B, Abeling K, Abicht NJ, Abidi SH, Aboulhorma A, et al.

Phys Rev Lett, 2024 Jan 12;132(2):021803.
PMID: 38277607 DOI: 10.1103/PhysRevLett.132.021803

The first evidence for the Higgs boson decay to a Z boson and a photon is presented, with a statistical significance of 3.4 standard deviations. The result is derived from a combined analysis of the searches performed by the ATLAS and CMS Collaborations with proton-proton collision datasets collected at the CERN Large Hadron Collider (LHC) from 2015 to 2018. These correspond to integrated luminosities of around 140 fb^{-1} for each experiment, at a center-of-mass energy of 13 TeV. The measured signal yield is 2.2±0.7 times the standard model prediction, and agrees with the theoretical expectation within 1.9 standard deviations.
Arctic-adapted dogs emerged at the Pleistocene-Holocene transition

Sinding MS, Gopalakrishnan S, Ramos-Madrigal J, de Manuel M, Pitulko VV, Kuderna L, et al.

Science, 2020 06 26;368(6498):1495-1499.
PMID: 32587022 DOI: 10.1126/science.aaz8599

Although sled dogs are one of the most specialized groups of dogs, their origin and evolution have received much less attention than many other dog groups. We applied a genomic approach to investigate their spatiotemporal emergence by sequencing the genomes of 10 modern Greenland sled dogs, an ~9500-year-old Siberian dog associated with archaeological evidence for sled technology, and an ~33,000-year-old Siberian wolf. We found noteworthy genetic similarity between the ancient dog and modern sled dogs. We detected gene flow from Pleistocene Siberian wolves, but not modern American wolves, to present-day sled dogs. The results indicate that the major ancestry of modern sled dogs traces back to Siberia, where sled dog-specific haplotypes of genes that potentially relate to Arctic adaptation were established by 9500 years ago.
Interspecific Gene Flow Shaped the Evolution of the Genus Canis

Gopalakrishnan S, Sinding MS, Ramos-Madrigal J, Niemann J, Samaniego Castruita JA, Vieira FG, et al.

Curr Biol, 2018 11 05;28(21):3441-3449.e5.
PMID: 30344120 DOI: 10.1016/j.cub.2018.08.041

The evolutionary history of the wolf-like canids of the genus Canis has been heavily debated, especially regarding the number of distinct species and their relationships at the population and species level [1-6]. We assembled a dataset of 48 resequenced genomes spanning all members of the genus Canis except the black-backed and side-striped jackals, encompassing the global diversity of seven extant canid lineages. This includes eight new genomes, including the first resequenced Ethiopian wolf (Canis simensis), one dhole (Cuon alpinus), two East African hunting dogs (Lycaon pictus), two Eurasian golden jackals (Canis aureus), and two Middle Eastern gray wolves (Canis lupus). The relationships between the Ethiopian wolf, African golden wolf, and golden jackal were resolved. We highlight the role of interspecific hybridization in the evolution of this charismatic group. Specifically, we find gene flow between the ancestors of the dhole and African hunting dog and admixture between the gray wolf, coyote (Canis latrans), golden jackal, and African golden wolf. Additionally, we report gene flow from gray and Ethiopian wolves to the African golden wolf, suggesting that the African golden wolf originated through hybridization between these species. Finally, we hypothesize that coyotes and gray wolves carry genetic material derived from a "ghost" basal canid lineage.
Population genomics of grey wolves and wolf-like canids in North America

Sinding MS, Gopalakrishan S, Vieira FG, Samaniego Castruita JA, Raundrup K, Heide Jørgensen MP, et al.

PLoS Genet, 2018 11;14(11):e1007745.
PMID: 30419012 DOI: 10.1371/journal.pgen.1007745

North America is currently home to a number of grey wolf (Canis lupus) and wolf-like canid populations, including the coyote (Canis latrans) and the taxonomically controversial red, Eastern timber and Great Lakes wolves. We explored their population structure and regional gene flow using a dataset of 40 full genome sequences that represent the extant diversity of North American wolves and wolf-like canid populations. This included 15 new genomes (13 North American grey wolves, 1 red wolf and 1 Eastern timber/Great Lakes wolf), ranging from 0.4 to 15x coverage. In addition to providing full genome support for the previously proposed coyote-wolf admixture origin for the taxonomically controversial red, Eastern timber and Great Lakes wolves, the discriminatory power offered by our dataset suggests all North American grey wolves, including the Mexican form, are monophyletic, and thus share a common ancestor to the exclusion of all other wolves. Furthermore, we identify three distinct populations in the high arctic, one being a previously unidentified "Polar wolf" population endemic to Ellesmere Island and Greenland. Genetic diversity analyses reveal particularly high inbreeding and low heterozygosity in these Polar wolves, consistent with long-term isolation from the other North American wolves.
The population genomic legacy of the second plague pandemic

Gopalakrishnan S, Ebenesersdóttir SS, Lundstrøm IKC, Turner-Walker G, Moore KHS, Luisi P, et al.

Curr Biol, 2022 Nov 07;32(21):4743-4751.e6.
PMID: 36182700 DOI: 10.1016/j.cub.2022.09.023

Human populations have been shaped by catastrophes that may have left long-lasting signatures in their genomes. One notable example is the second plague pandemic that entered Europe in ca. 1347 CE and repeatedly returned for over 300 years, with typical village and town mortality estimated at 10%-40% [1]. It is assumed that this high mortality affected the gene pools of these populations. First, local population crashes reduced genetic diversity. Second, a change in frequency is expected for sequence variants that may have affected survival or susceptibility to the etiologic agent (Yersinia pestis) [2]. Third, mass mortality might alter the local gene pools through its impact on subsequent migration patterns. We explored these factors using the Norwegian city of Trondheim as a model, by sequencing 54 genomes spanning three time periods: (1) prior to the plague striking Trondheim in 1349 CE, (2) the 17th-19th century, and (3) the present. We find that the pandemic period shaped the gene pool by reducing long distance immigration, in particular from the British Isles, and inducing a bottleneck that reduced genetic diversity. Although we also observe an excess of large FST values at multiple loci in the genome, these are shaped by reference biases introduced by mapping our relatively low genome coverage degraded DNA to the reference genome. This implies that attempts to detect selection using ancient DNA (aDNA) datasets that vary by read length and depth of sequencing coverage may be particularly challenging until methods have been developed to account for the impact of differential reference bias on test statistics.
A draft genome sequence of the elusive giant squid, Architeuthis dux

da Fonseca RR, Couto A, Machado AM, Brejova B, Albertin CB, Silva F, et al.

Gigascience, 2020 Jan 01;9(1).
PMID: 31942620 DOI: 10.1093/gigascience/giz152

BACKGROUND: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea-dwelling species will allow several pending evolutionary questions to be unlocked.
FINDINGS: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from 3 different tissue types from 3 other species of squid (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein-coding genes supported by evidence, and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome.
CONCLUSIONS: This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.
The Tetragnatha kauaiensis Genome Sheds Light on the Origins of Genomic Novelty in Spiders

Cerca J, Armstrong EE, Vizueta J, Fernández R, Dimitrov D, Petersen B, et al.

Genome Biol Evol, 2021 Dec 01;13(12).
PMID: 34849853 DOI: 10.1093/gbe/evab262

Spiders (Araneae) have a diverse spectrum of morphologies, behaviors, and physiologies. Attempts to understand the genomic basis of this diversity are often hindered by their large, heterozygous, and AT-rich genomes with high repeat content, resulting in highly fragmented, poor-quality assemblies. As a result, the key attributes of spider genomes, including gene family evolution, repeat content, and gene function, remain poorly understood. Here, we used Illumina and Dovetail Chicago technologies to sequence the genome of the long-jawed spider Tetragnatha kauaiensis, producing an assembly distributed along 3,925 scaffolds with an N50 of ∼2 Mb. Using comparative genomics tools, we explore genome evolution across available spider assemblies. Our findings suggest that the previously reported and vast genome size variation in spiders is linked to the different representation and number of transposable elements. Using statistical tools to uncover gene-family level evolution, we find expansions associated with the sensory perception of taste, immunity, and metabolism. In addition, we report strikingly different histories of chemosensory, venom, and silk gene families, with the first two evolving much earlier, affected by the ancestral whole genome duplication in Arachnopulmonata (∼450 Ma) and exhibiting higher numbers. Together, our findings reveal that spider genomes are highly variable and that genomic novelty may have been driven by the burst of an ancient whole genome duplication, followed by gene family and transposable element expansion.
Genomes of Pleistocene Siberian Wolves Uncover Multiple Extinct Wolf Lineages

Ramos-Madrigal J, Sinding MS, Carøe C, Mak SST, Niemann J, Samaniego Castruita JA, et al.

Curr Biol, 2021 01 11;31(1):198-206.e8.
PMID: 33125870 DOI: 10.1016/j.cub.2020.10.002

Extant Canis lupus genetic diversity can be grouped into three phylogenetically distinct clades: Eurasian and American wolves and domestic dogs [1]. Genetic studies have suggested these groups trace their origins to a wolf population that expanded during the last glacial maximum (LGM) [1-3] and replaced local wolf populations [4]. Moreover, ancient genomes from the Yana basin and the Taimyr peninsula provided evidence of at least one extinct wolf lineage that dwelled in Siberia during the Pleistocene [3,5]. Previous studies have suggested that Pleistocene Siberian canids can be classified into two groups based on cranial morphology. Wolves in the first group are most similar to present-day populations, although those in the second group possess intermediate features between dogs and wolves [6,7]. However, whether this morphological classification represents distinct genetic groups remains unknown. To investigate this question and the relationships between Pleistocene canids, present-day wolves, and dogs, we resequenced the genomes of four Pleistocene canids from Northeast Siberia dated between >50 and 14 ka old, including samples from the two morphological categories. We found these specimens cluster with the two previously sequenced Pleistocene wolves, which are genetically more similar to Eurasian wolves. Our results show that, though the four specimens represent extinct wolf lineages, they do not form a monophyletic group. Instead, each Pleistocene Siberian canid branched off the lineage that gave rise to present-day wolves and dogs. Finally, our results suggest the two previously described morphological groups could represent independent lineages similarly related to present-day wolves and dogs.
MobiSeq: De novo SNP discovery in model and non-model species through sequencing the flanking region of transposable elements

Rey-Iglesia A, Gopalakrishan S, Carøe C, Alquezar-Planas DE, Ahlmann Nielsen A, Röder T, et al.

Mol Ecol Resour, 2019 Mar;19(2):512-525.
PMID: 30575257 DOI: 10.1111/1755-0998.12984

In recent years, the availability of reduced representation library (RRL) methods has catalysed an expansion of genome-scale studies to characterize both model and non-model organisms. Most of these methods rely on the use of restriction enzymes to obtain DNA sequences at a genome-wide level. These approaches have been widely used to sequence thousands of markers across individuals for many organisms at a reasonable cost, revolutionizing the field of population genomics. However, there are still some limitations associated with these methods, in particular the high molecular weight DNA required as starting material, the reduced number of common loci among investigated samples, and the short length of the sequenced site-associated DNA. Here, we present MobiSeq, an RRL protocol exploiting simple laboratory techniques, that generates genomic data based on PCR targeted enrichment of transposable elements and the sequencing of the associated flanking region. We validate its performance across 103 DNA extracts derived from three mammalian species: grey wolf (Canis lupus), red deer complex (Cervus sp.) and brown rat (Rattus norvegicus). MobiSeq enables the sequencing of hundreds of thousands of loci across the genome and performs SNP discovery with relatively low rates of clonality. Given the ease and flexibility of the MobiSeq protocol, the method has the potential to be implemented for marker discovery and population genomics across a wide range of organisms, enabling the exploration of diverse evolutionary and conservation questions.
The Genome and mRNA Transcriptome of the Cosmopolitan Calanoid Copepod Acartia tonsa Dana Improve the Understanding of Copepod Genome Size Evolution

Jørgensen TS, Petersen B, Petersen HCB, Browne PD, Prost S, Stillman JH, et al.

Genome Biol Evol, 2019 May 01;11(5):1440-1450.
PMID: 30918947 DOI: 10.1093/gbe/evz067

Members of the crustacean subclass Copepoda are likely the most abundant metazoans worldwide. Pelagic marine species are critical in converting planktonic microalgae to animal biomass, supporting oceanic food webs. Despite their abundance and ecological importance, only six copepod genomes are publicly available, owing to a number of factors including large genome size, repetitiveness, GC-content, and small animal size. Here, we report the seventh representative copepod genome and the first genome and the first transcriptome from the calanoid copepod species Acartia tonsa Dana, which is among the most numerous mesozooplankton in boreal coastal and estuarine waters. The ecology, physiology, and behavior of A. tonsa have been studied extensively. The genetic resources contributed in this work will allow researchers to link experimental results to molecular mechanisms. From PCR-free whole genome sequence and mRNA Illumina data, we assemble the largest copepod genome to date. We estimate that A. tonsa has a total genome size of 2.5 Gb including repetitive elements we could not resolve. The nonrepetitive fraction of the genome assembly is estimated to be 566 Mb. Our DNA sequencing-based analyses suggest there is a 14-fold difference in genome size between the six members of Copepoda with available genomic information. This finding complements nucleus staining genome size estimations, where 100-fold difference has been reported within 70 species. We briefly analyze the repeat structure in the existing copepod whole genome sequence data sets. The information presented here confirms the evolution of genome size in Copepoda and expands the scope for evolutionary inferences in Copepoda by providing several levels of genetic information from a key planktonic crustacean species.
The Whole Genome Sequence and mRNA Transcriptome of the Tropical Cyclopoid Copepod Apocyclops royi

Jørgensen TS, Nielsen BLH, Petersen B, Browne PD, Hansen BW, Hansen LH

G3 (Bethesda), 2019 05 07;9(5):1295-1302.
PMID: 30923136 DOI: 10.1534/g3.119.400085

Copepoda is one of the most ecologically important animal groups on Earth, yet very few genetic resources are available for this subclass. Here, we present the first whole genome sequence (WGS, acc. UYDY01) and the first mRNA transcriptome assembly (TSA, acc. GHAJ01) for the tropical cyclopoid copepod species Apocyclops royi. Until now, only the 18S small subunit of ribosomal RNA gene and the COI gene have been available from A. royi, and WGS resources were only available from one other cyclopoid copepod species. Overall, the provided resources are the 8th copepod species to have WGS resources available and the 19th copepod species with TSA information available. We analyze the length and GC content of the provided WGS scaffolds as well as the coverage and gene content of both the WGS and the TSA assembly. Finally, we place the resources within the copepod order Cyclopoida as a member of the Apocyclops genus. We estimate the total genome size of A. royi at 450 Mb, with 181 Mb assembled nonrepetitive sequence, 76 Mb assembled repeats and 193 Mb unassembled sequence. The TSA assembly consists of 29,737 genes and an additional 45,756 isoforms. In the WGS and TSA assemblies, >80% and >95% of core genes can be found, though many in fragmented versions. The provided resources will allow researchers to conduct physiological experiments on A. royi, and also increase the possibilities for copepod gene set analysis, as it adds substantially to the copepod datasets available.
An improved direct metamobilome approach increases the detection of larger-sized circular elements across kingdoms

Alanin KWS, Jørgensen TS, Browne PD, Petersen B, Riber L, Kot W, et al.

Plasmid, 2021 05;115:102576.
PMID: 33872684 DOI: 10.1016/j.plasmid.2021.102576

Mobile genetic elements (MGEs) are instrumental in natural prokaryotic genome editing, permitting genome plasticity and allowing microbes to accumulate genetic diversity. MGEs serve as a vast communal gene pool and include DNA elements such as plasmids and bacteriophages (phages) among others. These mobile DNA elements represent a human health risk as they can introduce new traits, such as antibiotic resistance or virulence, to a bacterial strain. Sequencing libraries targeting environmental circular MGEs, referred to as metamobilomes, may broaden our current understanding of the mechanisms behind the mobility, prevalence and content of these elements. However, metamobilomics is affected by a severe bias towards small circular elements, introduced by multiple displacement amplification (MDA). MDA is typically used to overcome limiting DNA quantities after the removal of non-circular DNA during library preparations. By examining the relationship between sequencing coverage and the size of circular MGEs in paired metamobilome datasets with and without MDA, we show that larger circular elements are lost when using MDA. This study is the first to systematically demonstrate that MDA is detrimental to detecting larger-sized plasmids if small plasmids are present. It is also the first to show that MDA can be omitted when using enzyme-based DNA fragmentation and PCR in library preparation kits such as Nextera XT® from Illumina.
Large haploblocks underlie rapid adaptation in the invasive weed Ambrosia artemisiifolia

Battlay P, Wilson J, Bieker VC, Lee C, Prapas D, Petersen B, et al.

Nat Commun, 2023 Mar 27;14(1):1717.
PMID: 36973251 DOI: 10.1038/s41467-023-37303-4

Adaptation is the central feature and leading explanation for the evolutionary diversification of life. Adaptation is also notoriously difficult to study in nature, owing to its complexity and logistically prohibitive timescale. Here, we leverage extensive contemporary and historical collections of Ambrosia artemisiifolia (an aggressively invasive weed and primary cause of pollen-induced hayfever) to track the phenotypic and genetic causes of recent local adaptation across its native and invasive ranges in North America and Europe, respectively. Large haploblocks (indicative of chromosomal inversions) contain a disproportionate share (26%) of genomic regions conferring parallel adaptation to local climates between ranges, are associated with rapidly adapting traits, and exhibit dramatic frequency shifts over space and time. These results highlight the importance of large-effect standing variants in rapid adaptation, which have been critical to A. artemisiifolia's global spread across vast climatic gradients.
SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences

Jorquera R, González C, Clausen PTLC, Petersen B, Holmes DS

Database (Oxford), 2021 01 28;2021.
PMID: 33507271 DOI: 10.1093/database/baab002

Single-exon coding sequences (CDSs), also known as 'single-exon genes' (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB that houses DNA and protein sequence information of SEGs from 10 mammalian genomes including human. SinEx DB includes their functional predictions (KOG (euKaryotic Orthologous Groups)) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information of the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignations are available for downloading. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution, function of SEGs and their associations with disorders including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/.
Improved ontology for eukaryotic single-exon coding sequences in biological databases

Jorquera R, González C, Clausen P, Petersen B, Holmes DS

Database (Oxford), 2018 01 01;2018:1-6.
PMID: 30239665 DOI: 10.1093/database/bay089

Efficient extraction of knowledge from biological data requires the development of structured vocabularies to unambiguously define biological terms. This paper proposes descriptions and definitions to disambiguate the term 'single-exon gene'. Eukaryotic Single-Exon Genes (SEGs) have been defined as genes that do not have introns in their protein coding sequences. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancer and neurological/developmental disorders and many exhibit tissue-specific transcription. Unfortunately, the term 'SEGs' is rife with ambiguity, leading to biological misinterpretations. In the classic definition, no distinction is made between SEGs that harbor introns in their untranslated regions (UTRs) versus those without. This distinction is important to make because the presence of introns in UTRs affects transcriptional regulation and post-transcriptional processing of the mRNA. In addition, recent whole-transcriptome shotgun sequencing has led to the discovery of many examples of single-exon mRNAs that arise from alternative splicing of multi-exon genes; these single-exon isoforms are being confused with SEGs despite their clearly different origin. The increasing expansion of RNA-seq datasets makes it imperative to distinguish the different SEG types before annotation errors become indelibly propagated in biological databases. This paper develops a structured vocabulary for their disambiguation, allowing a major reassessment of their evolutionary trajectories, regulation, RNA processing and transport, and provides the opportunity to improve the detection of gene associations with disorders including cancers, neurological and developmental diseases.
Comparative analyses identify genomic features potentially involved in the evolution of birds-of-paradise

Prost S, Armstrong EE, Nylander J, Thomas GWC, Suh A, Petersen B, et al.

Gigascience, 2019 May 01;8(5).
PMID: 30689847 DOI: 10.1093/gigascience/giz003

The diverse array of phenotypes and courtship displays exhibited by birds-of-paradise have long fascinated scientists and nonscientists alike. Remarkably, almost nothing is known about the genomics of this iconic radiation. There are 41 species in 16 genera currently recognized within the birds-of-paradise family (Paradisaeidae), most of which are endemic to the island of New Guinea. In this study, we sequenced genomes of representatives from all five major clades within this family to characterize genomic changes that may have played a role in the evolution of the group's extensive phenotypic diversity. We found genes important for coloration, morphology, and feather and eye development to be under positive selection. In birds-of-paradise with complex lekking systems and strong sexual dimorphism, the core birds-of-paradise, we found Gene Ontology categories for "startle response" and "olfactory receptor activity" to be enriched among the gene families expanding significantly faster compared to the other birds in our study. Furthermore, we found novel families of retrovirus-like retrotransposons active in all three de novo genomes since the early diversification of the birds-of-paradise group, which might have played a role in the evolution of this fascinating group of birds.
