Displaying publications 1 - 20 of 30 in total

  1. Abubaker A, Baharum A, Alrefaei M
    PLoS One, 2015;10(7):e0130995.
    PMID: 26132309 DOI: 10.1371/journal.pone.0130995
    This paper puts forward a new automatic clustering algorithm based on Multi-Objective Particle Swarm Optimization and Simulated Annealing, "MOPSOSA". The proposed algorithm is capable of automatic clustering which is appropriate for partitioning datasets to a suitable number of clusters. MOPSOSA combines the features of the multi-objective based particle swarm optimization (PSO) and the Multi-Objective Simulated Annealing (MOSA). Three cluster validity indices were optimized simultaneously to establish the suitable number of clusters and the appropriate clustering for a dataset. The first cluster validity index is centred on Euclidean distance, the second on the point symmetry distance, and the last cluster validity index is based on short distance. A number of algorithms have been compared with the MOPSOSA algorithm in resolving clustering problems by determining the actual number of clusters and optimal clustering. Computational experiments were carried out to study fourteen artificial and five real life datasets.
    Matched MeSH terms: Datasets as Topic*
  2. Aqra I, Herawan T, Abdul Ghani N, Akhunzada A, Ali A, Bin Razali R, et al.
    PLoS One, 2018;13(1):e0179703.
    PMID: 29351287 DOI: 10.1371/journal.pone.0179703
    Designing an efficient association rule mining (ARM) algorithm for multilevel knowledge-based transactional databases that is appropriate for real-world deployments is of paramount concern. However, dynamic decision making that needs to modify the threshold, either to minimize or maximize the output knowledge, forces the extant state-of-the-art algorithms to rescan the entire database. Consequently, the process incurs heavy computation cost and is not feasible for real-time applications. The paper efficiently addresses the problem of dynamic threshold updating for a given purpose. The paper contributes by presenting a novel ARM approach that creates an intermediate itemset and applies a threshold to extract categorical frequent itemsets with diverse threshold values, thus improving the overall efficiency as we no longer need to scan the whole database. After the entire itemset is built, we are able to obtain real support without the need to rebuild the itemset (e.g. the itemset list is intersected to obtain the actual support). Moreover, the algorithm supports extracting many frequent itemsets according to a pre-determined minimum support with an independent purpose. Additionally, the experimental results of our proposed approach demonstrate the capability to be deployed in any mining system in a fully parallel mode, consequently increasing the efficiency of the real-time association rules discovery process. The proposed approach outperforms the extant state-of-the-art and shows promising results that reduce computation cost, increase accuracy, and produce all possible itemsets.
    Matched MeSH terms: Datasets as Topic*
  3. Mohammed MF, Lim CP
    Neural Netw, 2017 Feb;86:69-79.
    PMID: 27890606 DOI: 10.1016/j.neunet.2016.10.012
    In this paper, we extend our previous work on the Enhanced Fuzzy Min-Max (EFMM) neural network by introducing a new hyperbox selection rule and a pruning strategy to reduce network complexity and improve classification performance. Specifically, a new k-nearest hyperbox expansion rule (for selection of a new winning hyperbox) is first introduced to reduce the network complexity by avoiding the creation of too many small hyperboxes within the vicinity of the winning hyperbox. A pruning strategy is then deployed to further reduce the network complexity in the presence of noisy data. The effectiveness of the proposed network is evaluated using a number of benchmark data sets. The results compare favorably with those from other related models. The findings indicate that the newly introduced hyperbox winner selection rule, coupled with the pruning strategy, is useful for undertaking pattern classification problems.
    Matched MeSH terms: Datasets as Topic/classification*
  4. Shirkhorshidi AS, Aghabozorgi S, Wah TY
    PLoS One, 2015;10(12):e0144059.
    PMID: 26658987 DOI: 10.1371/journal.pone.0144059
    Similarity or distance measures are core components used by distance-based clustering algorithms to cluster similar data points into the same clusters, while dissimilar or distant data points are placed into different clusters. The performance of similarity measures is mostly addressed in two or three-dimensional spaces, beyond which, to the best of our knowledge, there is no empirical study that has revealed the behavior of similarity measures when dealing with high-dimensional datasets. To fill this gap, a technical framework is proposed in this study to analyze, compare and benchmark the influence of different similarity measures on the results of distance-based clustering algorithms. For reproducibility purposes, fifteen publicly available datasets were used for this study, and consequently, future distance measures can be evaluated and compared with the results of the measures discussed in this work. These datasets were classified as low and high-dimensional categories to study the performance of each measure against each category. This research should help the research community to identify suitable distance measures for datasets and also to facilitate a comparison and evaluation of the newly proposed similarity or distance measures with traditional ones.
    Matched MeSH terms: Datasets as Topic*
  5. Cuttiford L, Pimsler ML, Heo CC, Zheng L, Karunaratne I, Trissini G, et al.
    J Med Entomol, 2021 Jul 16;58(4):1654-1662.
    PMID: 33970239 DOI: 10.1093/jme/tjab081
    A basic tenet of forensic entomology is that development data of an insect can be used to predict the time of colonization (TOC) by insect specimens collected from remains, and this prediction is related to the time of death and/or time of placement (TOP). However, few datasets have been evaluated to determine their accuracy or precision. The black soldier fly, Hermetia illucens (L.) (Diptera: Stratiomyidae), is recognized as an insect of forensic importance. This study examined the accuracy and precision of several development datasets for the black soldier fly by estimating the TOP of five sets of human and three sets of swine remains in San Marcos and College Station, TX, respectively. Data generated from this study indicate only one of these datasets consistently (time-to-prepupae 52%; time-to-eclosion 75%) produced TOP estimations that occurred within a day of the actual TOP of the remains. It is unknown if the precolonization interval (PreCI) of this species is long, but it has been observed that the species can colonize within 6 d after death. This assumption remains untested by validation studies. Accounting for this PreCI improved accuracy for the time-to-prepupae group, but reduced accuracy in the time-to-eclosion group. The findings presented here highlight a need for detailed, forensic-based development data for the black soldier fly that can reliably and accurately be used in casework. Finally, this study outlines the need for a basic understanding of the timing of resource utilization (i.e., duration of the PreCI) for forensically relevant taxa so that reasonable corrections may be made to TOC as related to minimum postmortem interval (mPMI) estimates.
    Matched MeSH terms: Datasets as Topic
  6. AlDahoul N, Karim HA, Momo MA, Escobar FIF, Magallanes VA, Tan MJT
    Sci Rep, 2023 Sep 02;13(1):14475.
    PMID: 37660120 DOI: 10.1038/s41598-023-41711-3
    Intestinal parasitic infections (IPIs) caused by protozoan and helminth parasites are among the most common infections in humans in low-and-middle-income countries. IPIs affect not only the health status of a country, but also the economic sector. Over the last decade, pattern recognition and image processing techniques have been developed to automatically identify parasitic eggs in microscopic images. Existing identification techniques still suffer from diagnosis errors and low sensitivity. Therefore, a more accurate and faster solution is still required to recognize parasitic eggs and classify them into several categories. A novel Chula-ParasiteEgg dataset including 11,000 microscopic images, proposed in ICIP2022, was utilized to train various methods such as convolutional neural network (CNN) based models and convolution and attention (CoAtNet) based models. The experiments conducted show high recognition performance of the proposed CoAtNet that was tuned with microscopic images of parasitic eggs. The CoAtNet produced an average accuracy of 93%, and an average F1 score of 93%. The finding opens the door to integrating the proposed solution in automated parasitological diagnosis.
    Matched MeSH terms: Datasets as Topic
  7. Yuan B, Nishiura H
    PLoS One, 2018;13(6):e0198734.
    PMID: 29924819 DOI: 10.1371/journal.pone.0198734
    BACKGROUND: Frequent international travel facilitates the global spread of dengue fever. Japan has experienced an increasing number of imported case notifications of dengue virus (DENV) infection, mostly arising from Japanese travelers visiting South and Southeast Asian countries. This led to an autochthonous dengue outbreak in Japan in 2014. The present study aimed to infer the risk of DENV infection among Japanese travelers to Asian countries, thereby obtaining an actual estimate of the number of DENV infections among travelers.

    METHODOLOGY/PRINCIPAL FINDINGS: For eight destination countries (Indonesia, Philippines, Thailand, India, Malaysia, Vietnam, Sri Lanka, and Singapore), we collected age-dependent seroepidemiological data. We also retrieved the number of imported cases, who were notified to the Japanese government, as well as the total number of travelers to each destination. Using a mathematical model, we estimated the force of infection in each destination country with seroepidemiological data while jointly inferring the reporting coverage of DENV infections among Japanese travelers from datasets of imported cases and travelers. Assuming that travelers had a risk of infection that was identical to that of the local population during travel, the reporting coverage of dengue appeared to range from 0.6% to 4.3%. The risk of infection per journey ranged from 0.02% to 0.44%.

    CONCLUSIONS/SIGNIFICANCE: We found that the actual number of imported cases of DENV infection among Japanese travelers could be more than 20 times the notified number of imported cases. This finding may be attributed to the substantial proportion of asymptomatic and under-ascertained infections.

    Matched MeSH terms: Datasets as Topic/statistics & numerical data
  8. Swami V, Furnham A, Horne G, Stieger S
    Body Image, 2020 Sep;34:155-166.
    PMID: 32593946 DOI: 10.1016/j.bodyim.2020.05.004
    Issues of construct commonality and distinguishability in body image research are typically addressed using structural equation models, but such methods can sometimes present problems of interpretation when data patterns are complex. One recently developed tool that could help in summarising complex data patterns is Item Pool Visualisation (IPV), an illustrative method that locates item pools from within the same dataset and illustrates these in the form of single or nested radar charts. Here, we demonstrate the utility of IPV in visualising data patterns vis-à-vis positive body image. Five-hundred-and-one adults from the United Kingdom completed seven widely-used measures of positive body image and data were subjected to IPV. Results demonstrated that, of the included measures, the Body Appreciation Scale-2 provided the closest and most precise measurement of a core positive body image construct. The Functionality Appreciation Scale and the Authentic Pride subscale of the Body and Appearance Self-Conscious Emotions Scale tapped more distal aspects. Our results also highlight possible limitations with the use of several other instruments as measures of positive body image. We discuss implications for research aimed at better understanding the nature of positive body image and interpreting complex data patterns in body image research more generally.
    Matched MeSH terms: Datasets as Topic
  9. Alsalem MA, Zaidan AA, Zaidan BB, Hashim M, Madhloom HT, Azeez ND, et al.
    Comput Methods Programs Biomed, 2018 May;158:93-112.
    PMID: 29544792 DOI: 10.1016/j.cmpb.2018.02.005
    CONTEXT: Acute leukaemia diagnosis is a field requiring automated solutions, tools and methods and the ability to facilitate early detection and even prediction. Many studies have focused on the automatic detection and classification of acute leukaemia and their subtypes to enable highly accurate diagnosis.

    OBJECTIVE: This study aimed to review and analyse literature related to the detection and classification of acute leukaemia. The factors that were considered to improve understanding on the field's various contextual aspects in published studies and characteristics were motivation, open challenges that confronted researchers and recommendations presented to researchers to enhance this vital research area.

    METHODS: We systematically searched all articles about the classification and detection of acute leukaemia, as well as their evaluation and benchmarking, in three main databases: ScienceDirect, Web of Science and IEEE Xplore from 2007 to 2017. These indices were considered to be sufficiently extensive to encompass our field of literature.

    RESULTS: Based on our inclusion and exclusion criteria, 89 articles were selected. Most studies (58/89) focused on the methods or algorithms of acute leukaemia classification, a number of papers (22/89) covered the developed systems for the detection or diagnosis of acute leukaemia and few papers (5/89) presented evaluation and comparative studies. The smallest portion (4/89) of articles comprised reviews and surveys.

    DISCUSSION: Acute leukaemia diagnosis, which is a field requiring automated solutions, tools and methods, entails the ability to facilitate early detection or even prediction. Many studies have been performed on the automatic detection and classification of acute leukaemia and their subtypes to promote accurate diagnosis.

    CONCLUSIONS: Research areas on medical-image classification vary, but they are all equally vital. We expect this systematic review to help emphasise current research opportunities and thus extend and create additional research fields.

    Matched MeSH terms: Datasets as Topic
  10. Kundu R, Basak H, Singh PK, Ahmadian A, Ferrara M, Sarkar R
    Sci Rep, 2021 Jul 08;11(1):14133.
    PMID: 34238992 DOI: 10.1038/s41598-021-93658-y
    COVID-19 has crippled the world's healthcare systems, setting back the economy and taking the lives of several people. Although potential vaccines are being tested and supplied around the world, it will take a long time to reach every human being, more so with new variants of the virus emerging, enforcing a lockdown-like situation on parts of the world. Thus, there is a dire need for early and accurate detection of COVID-19 to prevent further spread of the disease. The current gold-standard RT-PCR test is only 71% sensitive and is laborious to perform, which makes population-wide screening infeasible. To this end, in this paper, we propose an automated COVID-19 detection system that uses CT-scan images of the lungs to classify them into COVID and Non-COVID cases. The proposed method applies an ensemble strategy that generates fuzzy ranks of the base classification models using the Gompertz function and fuses the decision scores of the base models adaptively to make the final predictions on the test cases. Three transfer learning-based convolutional neural network models are used, namely VGG-11, Wide ResNet-50-2, and Inception v3, to generate the decision scores to be fused by the proposed ensemble model. The framework has been evaluated on two publicly available chest CT scan datasets, achieving state-of-the-art performance and justifying the reliability of the model. The source code for the present work is available on GitHub.
    Matched MeSH terms: Datasets as Topic
  11. Saha P, Mukherjee D, Singh PK, Ahmadian A, Ferrara M, Sarkar R
    Sci Rep, 2021 Apr 15;11(1):8304.
    PMID: 33859222 DOI: 10.1038/s41598-021-87523-1
    COVID-19, a viral infection that originated in Wuhan, China, has spread across the world and has currently affected over 115 million people. Although the vaccination process has already started, reaching sufficient availability will take time. Considering the impact of this widespread disease, many research attempts have been made by computer scientists to screen for COVID-19 from Chest X-Rays (CXRs) or Computed Tomography (CT) scans. To this end, we have proposed GraphCovidNet, a Graph Isomorphic Network (GIN) based model which is used to detect COVID-19 from CT-scans and CXRs of the affected patients. Our proposed model only accepts input data in the form of a graph, as we follow a GIN based architecture. Initially, pre-processing is performed to convert image data into an undirected graph to consider only the edges instead of the whole image. Our proposed GraphCovidNet model is evaluated on four standard datasets: SARS-COV-2 Ct-Scan dataset, COVID-CT dataset, combination of covid-chestxray-dataset, Chest X-Ray Images (Pneumonia) dataset and CMSC-678-ML-Project dataset. The model shows an impressive accuracy of 99% for all the datasets and its prediction capability becomes 100% accurate for the binary classification problem of detecting COVID-19 scans. Source code of this work can be found on GitHub.
    Matched MeSH terms: Datasets as Topic
  12. Ibáñez O, Vicente R, Navega DS, Wilkinson C, Jayaprakash PT, Huete MI, et al.
    Forensic Sci Int, 2015 Dec;257:496-503.
    PMID: 26060056 DOI: 10.1016/j.forsciint.2015.05.030
    As part of the scientific tasks coordinated throughout the 'New Methodologies and Protocols of Forensic Identification by Craniofacial Superimposition (MEPROCS)' project, the current study aims to analyse the performance of a diverse set of CFS methodologies and the corresponding technical approaches when dealing with a common dataset of real-world cases. Thus, a multiple-lab study on craniofacial superimposition has been carried out for the first time. In particular, 26 participants from 17 different institutions in 13 countries were asked to deal with 14 identification scenarios, some of them involving the comparison of multiple candidates and unknown skulls. In total, there were 60 craniofacial superimposition problems, divided into two sets of females and males. Each participant followed her/his own methodology and employed her/his particular technological means. For each single case they were asked to report the final identification decision (either positive or negative) along with the rationale supporting the decision and at least one image illustrating the overlay/superimposition outcome. This study is expected to provide important insights to better understand the most convenient characteristics of every method included in this study.
    Matched MeSH terms: Datasets as Topic
  13. Wan Ahmad WS, Zaki WM, Ahmad Fauzi MF
    Biomed Eng Online, 2015;14:20.
    PMID: 25889188 DOI: 10.1186/s12938-015-0014-8
    An unsupervised lung segmentation method is one of the mandatory processes in developing a Content Based Medical Image Retrieval System (CBMIRS) of CXR. The purpose of the study is to present a robust solution for lung segmentation of standard and mobile chest radiographs using a fully automated unsupervised method.
    Matched MeSH terms: Datasets as Topic
  14. Kofi AE, Hakim HM, Khan HO, Ismail SA, Ghansah A, David AA, et al.
    Int J Legal Med, 2020 Jul;134(4):1313-1315.
    PMID: 31154498 DOI: 10.1007/s00414-019-02099-w
    In this study, 268 samples for unrelated males belonging to the five major human subpopulation groups in Ghana (Akan, Ewe, Mole-Dagbon, Ga-Dangme and Guang) were genetically characterised for 23 Y chromosome short tandem repeat (STR) loci using the Powerplex® Y23 STR kit. A total of 263 complete haplotypes were recorded of which 258 were unique. The haplotype diversity, discriminating capacity and match probability for the pooled population data were 0.9998, 0.9627 and 0.0039, respectively. The pairwise genetic distance (RST) for the Ghanaian datasets and other reference populations deposited in the Y-STR Haplotype Reference Database (YHRD) were estimated and mapped using multidimensional scaling (MDS) plot. The Guang and Ewe were significantly different from the Akan, Mole-Dagbon and Ga-Dangme. However, the five Ghanaian datasets were all plotted close together with other African populations in the MDS data mapping.
    Matched MeSH terms: Datasets as Topic
  15. Cacha LA, Parida S, Dehuri S, Cho SB, Poznanski RR
    J Integr Neurosci, 2016 Dec;15(4):593-606.
    PMID: 28093025 DOI: 10.1142/S0219635216500345
    The huge number of voxels in fMRI over time poses a major challenge for effective analysis. Fast, accurate, and reliable classifiers are required for estimating the decoding accuracy of brain activities. Although machine-learning classifiers seem promising, individual classifiers have their own limitations. To address this limitation, the present paper proposes a method based on the ensemble of neural networks to analyze fMRI data for cognitive state classification for application across multiple subjects. Similarly, the fuzzy integral (FI) approach has been employed as an efficient tool for combining different classifiers. The FI approach led to the development of a classifier ensemble technique that performs better than any single classifier by reducing the misclassification, the bias, and the variance. The proposed method successfully classified the different cognitive states for multiple subjects with high accuracy of classification. Comparison of the performance improvement, while applying the ensemble neural networks method, vs. that of the individual neural network strongly points toward the usefulness of the proposed method.
    Matched MeSH terms: Datasets as Topic
  16. Du L, Pang Y
    Sci Rep, 2021 Jun 24;11(1):13275.
    PMID: 34168200 DOI: 10.1038/s41598-021-92484-6
    Influenza is an infectious disease that leads to an estimated 5 million cases of severe illness and 650,000 respiratory deaths worldwide each year. The early detection and prediction of influenza outbreaks are crucial for efficient resource planning to save patients' lives and healthcare costs. We propose a new data-driven methodology for influenza outbreak detection and prediction at very local levels. A doctor's diagnostic dataset of influenza-like illness from more than 3000 clinics in Malaysia is used in this study because these diagnostic data are reliable and can be captured promptly. A new region index (RI) of the influenza outbreak is proposed based on the diagnostic dataset. By analysing the anomalies in the weekly RI value, potential outbreaks are identified using statistical methods. An ensemble learning method is developed to predict potential influenza outbreaks. Cross-validation is conducted to optimize the hyperparameters of the ensemble model. A testing data set is used to provide an unbiased evaluation of the model. The proposed methodology is shown to be sensitive and accurate at influenza outbreak prediction, with an average of 75% recall, 74% precision, and 83% accuracy scores across five regions in Malaysia. The results are also validated by Google Flu Trends data, news reports, and surveillance data released by the World Health Organization.
    Matched MeSH terms: Datasets as Topic
  17. Feng S, Stiller J, Deng Y, Armstrong J, Fang Q, Reeve AH, et al.
    Nature, 2020 Nov;587(7833):252-257.
    PMID: 33177665 DOI: 10.1038/s41586-020-2873-9
    Whole-genome sequencing projects are increasingly populating the tree of life and characterizing biodiversity. Sparse taxon sampling has previously been proposed to confound phylogenetic inference, and captures only a fraction of the genomic diversity. Here we report a substantial step towards the dense representation of avian phylogenetic and molecular diversity, by analysing 363 genomes from 92.4% of bird families, including 267 newly sequenced genomes produced for phase II of the Bird 10,000 Genomes (B10K) Project. We use this comparative genome dataset in combination with a pipeline that leverages a reference-free whole-genome alignment to identify orthologous regions in greater numbers than has previously been possible and to recognize genomic novelties in particular bird lineages. The densely sampled alignment provides a single-base-pair map of selection, has more than doubled the fraction of bases that are confidently predicted to be under conservation and reveals extensive patterns of weak selection in predominantly non-coding DNA. Our results demonstrate that increasing the diversity of genomes used in comparative studies can reveal more shared and lineage-specific variation, and improve the investigation of genomic characteristics. We anticipate that this genomic resource will offer new perspectives on evolutionary processes in cross-species comparative analyses and assist in efforts to conserve species.
    Matched MeSH terms: Datasets as Topic
  18. Tang PW, Choon YW, Mohamad MS, Deris S, Napis S
    J Biosci Bioeng, 2015 Mar;119(3):363-8.
    PMID: 25216804 DOI: 10.1016/j.jbiosc.2014.08.004
    Metabolic engineering is a research field that focuses on the design of models for metabolism, and uses computational procedures to suggest genetic manipulation. It aims to improve the yield of particular chemical or biochemical products. Several traditional metabolic engineering methods are commonly used to increase the production of a desired target, but the products are always far below their theoretical maximums. Using numerical optimisation algorithms to identify gene knockouts may stall at a local minimum in a multivariable function. This paper proposes a hybrid of the artificial bee colony (ABC) algorithm and the minimisation of metabolic adjustment (MOMA) to predict an optimal set of solutions in order to optimise the production rate of succinate and lactate. The dataset used in this work was from the iJO1366 Escherichia coli metabolic network. The experimental results include the production rate, growth rate and a list of knockout genes. From the comparative analysis, ABCMOMA produced better results compared to previous works, showing potential for solving genetic engineering problems.
    Matched MeSH terms: Datasets as Topic
  19. Tan MP, Tan GJ, Mat S, Luben RN, Wareham NJ, Khaw KT, et al.
    Drugs Aging, 2020 Feb;37(2):105-114.
    PMID: 31808140 DOI: 10.1007/s40266-019-00731-3
    The consumption of medications with anticholinergic activity has been suggested to result in the adverse effects of mental confusion, visual disturbance, and muscle weakness, which may lead to falls. Existing published evidence linking anticholinergic drugs with falls, however, remains weak. This study was conducted to evaluate the relationship between anticholinergic cognitive burden (ACB) and the long-term risk of hospitalization with falls and fractures in a large population study. The dataset comprised information from 25,639 men and women (aged 40-79 years) recruited from 1993 to 1997 from Norfolk, United Kingdom into the European Prospective Investigation into Cancer (EPIC)-Norfolk study. The time to first hospital admission with a fall with or without fracture was obtained from the National Health Service hospital information system. Cox-proportional hazards analyses were conducted to adjust for confounders and competing risks. The fall hospitalization rate was 5.8% over a median follow-up of ~19.4 years. The unadjusted incidence rate ratio for the use of any drugs with anticholinergic properties was 1.79 (95% CI 1.66-1.93). The hazard ratios (95% CI) for ACB scores of 1, 2-3, and ≥ 4 compared with ACB = 0 for fall hospitalization were 1.20 (1.09-1.33), 1.42 (1.25-1.60), and 1.39 (1.21-1.60) after adjustment for age, gender, medical conditions, physical activity, and blood pressure. Medications with anticholinergic activity are associated with an increased risk of subsequent hospitalization with a fall over a 19-year follow-up period. The biological mechanisms underlying the long-term risk of hospitalization with a fall or fracture following baseline ACB exposure remain unclear and require further evaluation.
    Matched MeSH terms: Datasets as Topic
  20. Malaspinas AS, Westaway MC, Muller C, Sousa VC, Lao O, Alves I, et al.
    Nature, 2016 Oct 13;538(7624):207-214.
    PMID: 27654914 DOI: 10.1038/nature18299
    The population history of Aboriginal Australians remains largely uncharacterized. Here we generate high-coverage genomes for 83 Aboriginal Australians (speakers of Pama-Nyungan languages) and 25 Papuans from the New Guinea Highlands. We find that Papuan and Aboriginal Australian ancestors diversified 25-40 thousand years ago (kya), suggesting pre-Holocene population structure in the ancient continent of Sahul (Australia, New Guinea and Tasmania). However, all of the studied Aboriginal Australians descend from a single founding population that differentiated ~10-32 kya. We infer a population expansion in northeast Australia during the Holocene epoch (past 10,000 years) associated with limited gene flow from this region to the rest of Australia, consistent with the spread of the Pama-Nyungan languages. We estimate that Aboriginal Australians and Papuans diverged from Eurasians 51-72 kya, following a single out-of-Africa dispersal, and subsequently admixed with archaic populations. Finally, we report evidence of selection in Aboriginal Australians potentially associated with living in the desert.
    Matched MeSH terms: Datasets as Topic