Affiliations 

  • 1 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
  • 2 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. cts@sanger.ac.uk
  • 3 Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK. ylx@sanger.ac.uk
Hum Genet, 2018 Jan;137(1):73-83.
PMID: 29209947 DOI: 10.1007/s00439-017-1857-9

Abstract

We describe the variation in copy number of a ~ 10 kb region overlapping the long intergenic noncoding RNA (lincRNA) gene, TTTY22, within the IR3 inverted repeat on the short arm of the human Y chromosome, leading to individuals with 0-3 copies of this region in the general population. Variation of this CNV is common, with 266 individuals having 0 copies, 943 (including the reference sequence) having 1, 23 having 2 copies, and two having 3 copies, and was validated by breakpoint PCR, fibre-FISH, and 10× Genomics Chromium linked-read sequencing in subsets of 1234 individuals from the 1000 Genomes Project. Mapping the changes in copy number to the phylogeny of these Y chromosomes previously established by the Project identified at least 20 mutational events, and investigation of flanking paralogous sequence variants showed that the mutations involved flanking sequences in 18 of these, and could extend over > 30 kb of DNA. While either gene conversion or double crossover between misaligned sister chromatids could formally explain the 0-2 copy events, gene conversion is the more likely mechanism, and these events include the longest non-allelic gene conversion reported thus far. Chromosomes with three copies of this CNV have arisen just once in our data set via another mechanism: duplication of 420 kb that places the third copy 230 kb proximal to the existing proximal copy. Our results establish gene conversion as a previously under-appreciated mechanism of generating copy number changes in humans and reveal the exceptionally large size of the conversion events that can occur.

* Title and MeSH Headings from MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine.