We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic classification and self-classification conflict for some individuals. The 250K SNP data permitted fine-scale resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors, that have contributed to population structure.
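The principal components analysis mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the data are simulated, the function names `genotype_pca` and `simulate_two_populations` are hypothetical, and the per-SNP scaling by the binomial variance 2p(1-p) is one common convention assumed here for illustration. The sketch shows how individuals from two populations with modestly different allele frequencies can separate along the first principal component.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """Project individuals onto the top principal components.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts.
    Scaling by sqrt(2p(1-p)) is an assumed convention, not the study's method.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                     # per-SNP allele frequency
    keep = (p > 0) & (p < 1)                     # drop monomorphic SNPs
    g, p = g[:, keep], p[keep]
    z = (g - 2 * p) / np.sqrt(2 * p * (1 - p))   # center and scale each SNP
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def simulate_two_populations(n_per_pop=50, n_snps=2000, shift=0.3, seed=0):
    """Simulate genotypes for two populations (toy data, purely illustrative)."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.3, 0.7, size=n_snps)
    # Population B frequencies drift away from population A at every SNP.
    freq_b = np.clip(freq_a + rng.uniform(-shift, shift, size=n_snps), 0.05, 0.95)
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return np.vstack([pop_a, pop_b]), labels

genotypes, labels = simulate_two_populations()
pcs = genotype_pca(genotypes)
```

In this toy setting, plotting `pcs[:, 0]` against `pcs[:, 1]` and coloring points by `labels` would show two clusters; with real data, closely related populations may overlap, which is consistent with the conflicting classifications noted above.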
Humans reached present-day Island Southeast Asia (ISEA) in one of the first major human migrations out of Africa. Population movements in the millennia following this initial settlement are thought to have greatly influenced the genetic makeup of current inhabitants, yet the relative contribution of different events remains unclear. Recent studies suggest that south-to-north gene flow largely influenced present-day patterns of genetic variation in Southeast Asian populations and that late Pleistocene and early Holocene migrations from Southeast Asia are responsible for a substantial proportion of ISEA ancestry. Archaeological and linguistic evidence, by contrast, suggests that the ancestors of present-day inhabitants arrived mainly through north-to-south migrations from Taiwan that spread throughout ISEA approximately 4,000 years ago. We report a large-scale genetic analysis of human variation in the Iban population from the Malaysian state of Sarawak in northwestern Borneo, located in the center of ISEA. Genome-wide single-nucleotide polymorphism (SNP) markers analyzed here suggest that the Iban exhibit the greatest genetic similarity to Indonesian and mainland Southeast Asian populations. The most common non-recombining Y (NRY) and mitochondrial (mt) DNA haplogroups present in the Iban are associated with populations of Southeast Asia. We conclude that migrations from Southeast Asia made a large contribution to Iban ancestry, although evidence of potential gene flow from Taiwan is also seen in uniparentally inherited marker data.