Displaying all 17 publications

  1. Schweiker M, Abdul-Zahra A, André M, Al-Atrash F, Al-Khatri H, Alprianti RR, et al.
    Sci Data, 2019 11 26;6(1):289.
    PMID: 31772199 DOI: 10.1038/s41597-019-0272-6
    Thermal discomfort is one of the main triggers for occupants' interactions with components of the built environment, such as adjusting thermostats and/or opening windows, and is strongly related to energy use in buildings. Understanding the causes of thermal (dis-)comfort is crucial for the design and operation of any type of building. The assessment of human thermal perception through rating scales, for example in post-occupancy studies, has been applied for several decades; however, long-standing assumptions related to these rating scales have been questioned by several researchers. The aim of this study was to gain deeper knowledge of contextual influences on the interpretation of thermal perception scales and their verbal anchors by survey participants. A questionnaire was designed and subsequently applied in 21 language versions. These surveys were conducted in 57 cities in 30 countries, resulting in a dataset containing responses from 8225 participants. The database offers potential for further analysis in the areas of building design and operation, psycho-physical relationships between human perception and the built environment, and linguistic analyses.
  2. Nashwan MS, Shahid S, Chung ES
    Sci Data, 2019 07 31;6(1):138.
    PMID: 31366936 DOI: 10.1038/s41597-019-0144-0
    This study developed 0.05° × 0.05° land-only datasets of daily maximum and minimum temperatures in the densely populated Central North region of Egypt (CNE) for the period 1981-2017. Existing coarse-resolution datasets were evaluated to find the best dataset for the study area to use as the base for the new datasets. The Climate Prediction Centre (CPC) global temperature dataset was found to be the best. The CPC data were interpolated to a spatial resolution of 0.05° latitude/longitude using a linear interpolation technique, given the flat topography of the study area. The robust kernel density distribution mapping method was used to correct the bias using observations, and WorldClim v.2 temperature climatology was used to adjust the spatial variability in temperature. The validation of the CNE datasets using the probability density function skill score and hot and cold extremes tail skill scores showed remarkable improvement in replicating the spatial and temporal variability in observed temperature. Because the CNE datasets are the best available high-resolution estimate of daily temperatures, they will be beneficial for climatic and hydrological studies.
  3. Pastorello G, Trotta C, Canfora E, Chu H, Christianson D, Cheah YW, et al.
    Sci Data, 2020 07 09;7(1):225.
    PMID: 32647314 DOI: 10.1038/s41597-020-0534-3
    The FLUXNET2015 dataset provides ecosystem-scale data on CO2, water, and energy exchange between the biosphere and the atmosphere, and other meteorological and biological measurements, from 212 sites around the globe (over 1500 site-years, up to and including year 2014). These sites, independently managed and operated, voluntarily contributed their data to create global datasets. Data were quality controlled and processed using uniform methods, to improve consistency and intercomparability across sites. The dataset is already being used in a number of applications, including ecophysiology studies, remote sensing studies, and development of ecosystem and Earth system models. FLUXNET2015 includes derived-data products, such as gap-filled time series, ecosystem respiration and photosynthetic uptake estimates, estimation of uncertainties, and metadata about the measurements, presented for the first time in this paper. In addition, 206 of these sites are for the first time distributed under a Creative Commons (CC-BY 4.0) license. This paper details this enhanced dataset and the processing methods, now made available as open-source codes, making the dataset more accessible, transparent, and reproducible.
  4. Pastorello G, Trotta C, Canfora E, Chu H, Christianson D, Cheah YW, et al.
    Sci Data, 2021 02 25;8(1):72.
    PMID: 33633116 DOI: 10.1038/s41597-021-00851-9
  5. Danylo O, Pirker J, Lemoine G, Ceccherini G, See L, McCallum I, et al.
    Sci Data, 2021 03 30;8(1):96.
    PMID: 33785753 DOI: 10.1038/s41597-021-00867-1
    In recent decades, global oil palm production has shown an abrupt increase, with almost 90% produced in Southeast Asia alone. To understand trends in oil palm plantation expansion and for landscape-level planning, accurate maps are needed. Although different oil palm maps have been produced using remote sensing in the past, here we use Sentinel 1 imagery to generate an oil palm plantation map for Indonesia, Malaysia and Thailand for the year 2017. In addition to location, the age of the oil palm plantation is critical for calculating yields. Here we have used a Landsat time series approach to determine the year in which the oil palm plantations are first detected, at which point they are 2 to 3 years of age. From this, the approximate age of the oil palm plantation in 2017 can be derived.
  6. Fox Ramos AE, Le Pogam P, Fox Alcover C, Otogo N'Nang E, Cauchie G, Hazni H, et al.
    Sci Data, 2019 04 03;6(1):15.
    PMID: 30944327 DOI: 10.1038/s41597-019-0028-3
    This Data Descriptor announces the submission to public repositories of the monoterpene indole alkaloid database (MIADB), a cumulative collection of 172 tandem mass spectrometry (MS/MS) spectra from multiple research projects conducted in eight natural product chemistry laboratories since the 1960s. All data have been annotated and organized to promote reuse by the community. Being a unique collection of these complex natural products, these data can be used to guide the dereplication and targeting of new related monoterpene indole alkaloids within complex mixtures when applying computer-based approaches, such as molecular networking. Each spectrum has its own accession number from CCMSLIB00004679916 to CCMSLIB00004680087 on the GNPS. The MIADB is available for download from MetaboLights under the identifier: MTBLS142 ( https://www.ebi.ac.uk/metabolights/MTBLS142 ).
  7. Schweiker M, Abdul-Zahra A, André M, Al-Atrash F, Al-Khatri H, Alprianti RR, et al.
    Sci Data, 2020 01 06;7(1):11.
    PMID: 31907360 DOI: 10.1038/s41597-019-0348-3
    An amendment to this paper has been published and can be accessed via a link at the top of the paper.
  8. Tan JL, Simbun A, Chan KG, Ngeow YF
    Sci Data, 2020 05 05;7(1):135.
    PMID: 32371951 DOI: 10.1038/s41597-020-0475-x
    Mycobacterium tuberculosis (MTB) is commonly used as a model to study pathogenicity and multiple drug resistance in bacteria. These MTB characteristics are highly dependent on the evolution and phylogeography of the bacterium. In this paper, we describe 15 new genomes of multidrug-resistant MTB (MDRTB) from Malaysia. The assessments and annotations on the genome assemblies suggest that strain differences are due to lineages and horizontal gene transfer during the course of evolution. The genomes show mutations listed in current drug resistance databases and global MTB collections. This genome data will augment existing information available for comparative genomic studies to understand MTB drug resistance mechanisms and evolution.
  9. Baird AH, Guest JR, Edwards AJ, Bauman AG, Bouwmeester J, Mera H, et al.
    Sci Data, 2021 01 29;8(1):35.
    PMID: 33514754 DOI: 10.1038/s41597-020-00793-8
    The discovery of multi-species synchronous spawning of scleractinian corals on the Great Barrier Reef in the 1980s stimulated an extraordinary effort to document spawning times in other parts of the globe. Unfortunately, most of these data remain unpublished which limits our understanding of regional and global reproductive patterns. The Coral Spawning Database (CSD) collates much of these disparate data into a single place. The CSD includes 6178 observations (3085 of which were unpublished) of the time or day of spawning for over 300 scleractinian species in 61 genera from 101 sites in the Indo-Pacific. The goal of the CSD is to provide open access to coral spawning data to accelerate our understanding of coral reproductive biology and to provide a baseline against which to evaluate any future changes in reproductive phenology.
  10. Ravintheran SK, Sivaprakasam S, Loke S, Lee SY, Manickam R, Yahya A, et al.
    Sci Data, 2019 11 25;6(1):280.
    PMID: 31767854 DOI: 10.1038/s41597-019-0289-x
    Complete genomes of xenobiotic-degrading microorganisms provide valuable resources for researchers to understand molecular mechanisms involved in bioremediation. Despite the well-known ability of Sphingomonas paucimobilis to degrade persistent xenobiotic compounds, a complete genome sequence is lacking for this organism. In line with this, we report the first complete genome sequence of Sphingomonas paucimobilis (strain AIMST S2), an organophosphate and hydrocarbon-degrading bacterium isolated from oil-polluted soil at Kedah, Malaysia. The genome was derived from a hybrid assembly of short and long reads generated by Illumina HiSeq and MinION, respectively. The assembly resulted in a single contig of 4,005,505 bases which consisted of 3,612 CDS and 56 tRNAs. An array of genes involved in xenobiotic degradation and plant-growth promoters were identified, suggesting its potential role as an effective microorganism in bioremediation and agriculture. Having reported the first complete genome of the species, this study will serve as a stepping stone for comparative genome analysis of Sphingomonas strains and other xenobiotic-degrading microorganisms as well as gene expression studies in organophosphate biodegradation.
  11. Schepaschenko D, Chave J, Phillips OL, Lewis SL, Davies SJ, Réjou-Méchain M, et al.
    Sci Data, 2019 10 10;6(1):198.
    PMID: 31601817 DOI: 10.1038/s41597-019-0196-1
    Forest biomass is an essential indicator for monitoring the Earth's ecosystems and climate. It is a critical input to greenhouse gas accounting, estimation of carbon losses and forest degradation, assessment of renewable energy potential, and for developing climate change mitigation policies such as REDD+, among others. Wall-to-wall mapping of aboveground biomass (AGB) is now possible with satellite remote sensing (RS). However, RS methods require extant, up-to-date, reliable, representative and comparable in situ data for calibration and validation. Here, we present the Forest Observation System (FOS) initiative, an international cooperation to establish and maintain a global in situ forest biomass database. AGB and canopy height estimates with their associated uncertainties are derived at a 0.25 ha scale from field measurements made in permanent research plots across the world's forests. All plot estimates are geolocated and have a size that allows for direct comparison with many RS measurements. The FOS offers the potential to improve the accuracy of RS-based biomass products while developing new synergies between the RS and ground-based ecosystem research communities.
  12. Song YH, Chung ES, Shahid S, Kim Y, Kim D
    Sci Data, 2023 08 26;10(1):568.
    PMID: 37633988 DOI: 10.1038/s41597-023-02475-7
    Reliable projection of evapotranspiration (ET) is important for planning sustainable water management in agriculture in the context of climate change. A global dataset of monthly climate variables was generated to estimate potential ET (PET) using 14 General Circulation Models (GCMs) for four main shared socioeconomic pathways (SSPs). The generated dataset has a spatial resolution of 0.5° × 0.5°, covers the period from 1950 to 2100, and can be used to estimate historical and future PET using the Penman-Monteith method. Furthermore, this dataset can be applied to various PET estimation methods based on climate variables. This paper shows that the dataset generated to estimate future PET can reflect the greenhouse gas concentration levels of the SSP scenarios across latitude bands. Therefore, this dataset can provide vital information for users to select appropriate GCMs for estimating reasonable PETs and help determine bias correction methods to reduce the discrepancy between observations and model output based on the scale of climate variables in each GCM.
  13. Wei J, Xiao Y, Liu J, Herrera-Ulloa A, Loh KH, Xu K
    Sci Data, 2024 02 23;11(1):234.
    PMID: 38395996 DOI: 10.1038/s41597-024-03070-0
    Pampus argenteus (Euphrasen, 1788) is one of the major fishery species in coastal China. Pampus argenteus has a highly specialized morphology, and its declining fishery resources have encouraged massive research efforts on its aquacultural biology. In this study, we reported the first high-quality chromosome-level genome of P. argenteus obtained by integrating Illumina, PacBio HiFi, and Hi-C sequencing techniques. The final size of the genome was 518.06 Mb, with contig and scaffold N50 values of 20.47 and 22.86 Mb, respectively. The sequences were anchored and oriented onto 24 pseudochromosomes based on Hi-C data corresponding to the 24-chromatid karyotype of P. argenteus. A colinear relationship was observed between the P. argenteus genome and that of a closely related species (Scomber japonicus). A total of 24,696 protein-coding genes were identified from the genome, 98.9% of which were complete BUSCOs. This report represents the first case of high-quality chromosome-level genome assembly for P. argenteus and can provide valuable information for future evolutionary, conservation, and aquacultural research.
  14. Kozlov SA, Lazarev VN, Kostryukova ES, Selezneva OV, Ospanova EA, Alexeev DG, et al.
    Sci Data, 2014;1:140023.
    PMID: 25977780 DOI: 10.1038/sdata.2014.23
    A comprehensive transcriptome analysis of an expressed sequence tag (EST) database of the spider Dolomedes fimbriatus venom glands using single-residue distribution analysis (SRDA) identified 7,169 unique sequences. Mature chains of 163 different toxin-like polypeptides were predicted on the basis of well-established methodology. The number of protein precursors of these polypeptides was appreciably greater than the number of mature polypeptides. A total of 451 different polypeptide precursors, translated from 795 unique nucleotide sequences, were deduced. A homology search divided the 163 mature polypeptide sequences into 16 superfamilies and 19 singletons. The number of mature toxins in a superfamily ranged from 2 to 49, whereas the diversity of the original nucleotide sequences was greater (2-261 variants). We observed a predominance of inhibitor cysteine knot toxin-like polypeptides among the cysteine-containing structures in the analyzed transcriptome bank. Uncommon spatial folds were also found.
  15. Buchanan EM, Lewis SC, Paris B, Forscher PS, Pavlacic JM, Beshears JE, et al.
    Sci Data, 2023 02 11;10(1):87.
    PMID: 36774440 DOI: 10.1038/s41597-022-01811-7
    In response to the COVID-19 pandemic, the Psychological Science Accelerator coordinated three large-scale psychological studies to examine the effects of loss-gain framing, cognitive reappraisals, and autonomy framing manipulations on behavioral intentions and affective measures. The data collected (April to October 2020) included specific measures for each experimental study, a general questionnaire examining health prevention behaviors and COVID-19 experience, geographical and cultural context characterization, and demographic information for each participant. Each participant started the study with the same general questions and then was randomized to complete either one longer experiment or two shorter experiments. Data were provided by 73,223 participants with varying completion rates. Participants completed the survey from 111 geopolitical regions in 44 unique languages/dialects. The anonymized dataset described here is provided in both raw and processed formats to facilitate re-use and further analyses. The dataset offers secondary analytic opportunities to explore coping, framing, and self-determination across a diverse, global sample obtained at the onset of the COVID-19 pandemic, which can be merged with other time-sampled or geographic data.
  16. Alymann AA, Alymann IA, Ong SQ, Rusli MU, Ahmad AH, Salim H
    Sci Data, 2024 04 05;11(1):337.
    PMID: 38580692 DOI: 10.1038/s41597-024-03172-9
    Reliable sex identification in Varanus salvator has traditionally relied on invasive methods like genetic analysis or dissection, as less invasive techniques such as hemipenes inversion are unreliable. Given the ecological importance of this species and skewed sex ratios in disturbed habitats, a dataset that allows ecologists or zoologists to study the sex determination of the lizard is crucial. We present a new dataset containing morphometric measurements of V. salvator individuals from the skin trade, with sex confirmed by dissection post-measurement. The dataset consists of a mixture of primary and secondary data such as weight, skull size, tail length, condition, etc., and can be used in modelling studies for ecological and conservation research to monitor the sex ratio of this species. Validity was demonstrated by training and testing six machine learning models. This dataset has the potential to streamline sex determination, offering a non-invasive alternative to complement existing methods in V. salvator research and mitigating the need for invasive procedures.