Displaying publications 1 - 20 of 105 in total

  1. Appasamy SD, Hamdani HY, Ramlan EI, Firdaus-Raih M
    Nucleic Acids Res, 2016 Jan 4;44(D1):D266-71.
    PMID: 26553798 DOI: 10.1093/nar/gkv1186
    Hydrogen-bonded interactions between the base residues are a major component of RNA structure stabilization. The importance and biological relevance of large clusters of base interactions can be much more easily investigated when their occurrences have been systematically detected, catalogued and compared. In this paper, we describe the database InterRNA (INTERactions in RNA structures database; http://mfrlab.org/interrna/) that contains records of known RNA 3D motifs as well as records for clusters of bases that are interconnected by hydrogen bonds. The contents of the database were compiled from RNA structural annotations carried out by the NASSAM (http://mfrlab.org/grafss/nassam) and COGNAC (http://mfrlab.org/grafss/cognac) computer programs. An analysis of the database content and comparisons with the existing corpus of knowledge regarding RNA 3D motifs clearly show that InterRNA can extend the annotations for known motifs and provide novel interactions for further investigation.
    Matched MeSH terms: Databases, Nucleic Acid*
  2. Ahmad M, Jung LT, Bhuiyan MA
    Comput Biol Med, 2016 Feb 1;69:144-51.
    PMID: 26773936 DOI: 10.1016/j.compbiomed.2015.12.017
    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms.
    Matched MeSH terms: Databases, Nucleic Acid*
  3. Emrizal R, Hamdani HY, Firdaus-Raih M
    Int J Mol Sci, 2021 Aug 09;22(16).
    PMID: 34445259 DOI: 10.3390/ijms22168553
    The increasing number and complexity of structures containing RNA chains in the Protein Data Bank (PDB) have led to the need for automated structure annotation methods to replace or complement expert visual curation. This is especially true when searching for tertiary base motifs and substructures. Such base arrangements and motifs have diverse roles that range from contributions to structural stability to more direct involvement in the molecule's functions, such as the sites for ligand binding and catalytic activity. We review the utility of computational approaches in annotating RNA tertiary base motifs in a dataset of PDB structures, particularly the use of graph theoretical algorithms that can search for such base motifs and annotate them or find and annotate clusters of hydrogen-bond-connected bases. We also demonstrate how such graph theoretical algorithms can be integrated into a workflow that allows for functional analysis and comparisons of base arrangements and sub-structures, such as those involved in ligand binding. The capacity to carry out such automatic curations has led to the discovery of novel motifs and can give new context to known motifs as well as enable the rapid compilation of RNA 3D motifs into a database.
    Matched MeSH terms: Databases, Nucleic Acid*
  4. Axtner J, Crampton-Platt A, Hörig LA, Mohamed A, Xu CCY, Yu DW, et al.
    Gigascience, 2019 Apr 01;8(4).
    PMID: 30997489 DOI: 10.1093/gigascience/giz029
    BACKGROUND: The use of environmental DNA for species detection via metabarcoding is growing rapidly. We present a co-designed lab workflow and bioinformatic pipeline to mitigate the 2 most important risks of environmental DNA use: sample contamination and taxonomic misassignment. These risks arise from the need for polymerase chain reaction (PCR) amplification to detect the trace amounts of DNA combined with the necessity of using short target regions due to DNA degradation.

    FINDINGS: Our high-throughput workflow minimizes these risks via a 4-step strategy: (i) technical replication with 2 PCR replicates and 2 extraction replicates; (ii) using multi-markers (12S,16S,CytB); (iii) a "twin-tagging," 2-step PCR protocol; and (iv) use of the probabilistic taxonomic assignment method PROTAX, which can account for incomplete reference databases. Because annotation errors in the reference sequences can result in taxonomic misassignment, we supply a protocol for curating sequence datasets. For some taxonomic groups and some markers, curation resulted in >50% of sequences being deleted from public reference databases, owing to (i) limited overlap between our target amplicon and reference sequences, (ii) mislabelling of reference sequences, and (iii) redundancy. Finally, we provide a bioinformatic pipeline to process amplicons and conduct PROTAX assignment and tested it on an invertebrate-derived DNA dataset from 1,532 leeches from Sabah, Malaysia. Twin-tagging allowed us to detect and exclude sequences with non-matching tags. The smallest DNA fragment (16S) amplified most frequently for all samples but was less powerful for discriminating at species rank. Using a stringent and lax acceptance criterion we found 162 (stringent) and 190 (lax) vertebrate detections of 95 (stringent) and 109 (lax) leech samples.

    CONCLUSIONS: Our metabarcoding workflow should help research groups increase the robustness of their results and therefore facilitate wider use of environmental and invertebrate-derived DNA, which is turning into a valuable source of ecological and conservation information on tetrapods.

    Matched MeSH terms: Databases, Nucleic Acid*
  5. Low YY, Chin GJWL, Joseph CG, Musta B, Rodrigues KF
    Data Brief, 2020 Dec;33:106486.
    PMID: 33225029 DOI: 10.1016/j.dib.2020.106486
    The genomic data of four bacterial strains isolated from the abandoned Mamut Copper Mine, an Acid Mine Drainage (AMD) site, are presented in this report. Two of these strains belong to the genus Bacillus, while the other two belong to the genus Pseudomonas. The draft genome size of Pseudomonas sp. strain MCMY3 was 6,396,595 bp (GC: 63.3%), Bacillus sp. strain MCMY6 was 6,815,573 bp (GC: 35.2%), Bacillus sp. strain MCMY13 was 5,559,059 bp (GC: 35.5%) and Pseudomonas sp. strain MCMY15 was 7,381,777 bp (GC: 64.8%). These four genomes contained 493, 495, 495 and 579 annotated subsystems, respectively. The sequence data are available in the GenBank Sequence Read Archive under accession numbers SRX7859406, SRX7859404, SRX7859405 and SRX7293032 for strains MCMY3, MCMY6, MCMY13 and MCMY15, respectively.
    Matched MeSH terms: Databases, Nucleic Acid
  6. Lim LY, Ab Majid AH
    Data Brief, 2020 Aug;31:105903.
    PMID: 32637504 DOI: 10.1016/j.dib.2020.105903
    Tapinoma indicum is a household pest that is widely distributed in Asian countries. It is known as a nuisance pest that causes annoyance and disturbance by constructing nests and foraging in buildings for food and water. This article documents the draft genome dataset of T. indicum collected in Penang Island, Malaysia, generated by next-generation sequencing on the Illumina platform. This article presents the paired-end 150 bp genome dataset and the quality of the sequencing result. This dataset provides information for further understanding of T. indicum at the molecular level and the opportunity to develop a novel method for pest control and regulation. The dataset is available in the Sequence Read Archive (SRA) database under the accession number SRR10848807.
    Matched MeSH terms: Databases, Nucleic Acid
  7. Zakaria MR, Lam MQ, Chen SJ, Abdul Karim MH, Tokiman L, Yahya A, et al.
    Data Brief, 2020 Jun;30:105658.
    PMID: 32426431 DOI: 10.1016/j.dib.2020.105658
    Mangrovimonas sp. strain CR14 is a halophilic bacterium affiliated with the family Flavobacteriaceae that was isolated from mangrove soil samples obtained from Tanjung Piai National Park, Johor. The whole genome of strain CR14 was sequenced on an Illumina HiSeq 2500 platform (2 × 150 bp paired end). Herein, we report the genome sequence of Mangrovimonas sp. strain CR14, whose assembled genome consisted of 20 contigs with a total size of 3,590,195 bp, 3,209 coding sequences, and an average G + C content of 36.08%. Genome annotation and gene mining revealed that this bacterium demonstrated proteolytic activity that could potentially be applied in the detergent industry. These whole-genome shotgun data of Mangrovimonas sp. strain CR14 have been deposited at DDBJ/ENA/GenBank under the accession JAAFZY000000000. The version described in this paper is version JAAFZY010000000.
    Matched MeSH terms: Databases, Nucleic Acid
  8. Reijnen BT
    Zookeys, 2015.
    PMID: 25987877 DOI: 10.3897/zookeys.501.9144
    During fieldwork in Indonesia and Malaysia, eight lots containing 33 specimens belonging to the genus Crenavolva (Ovulidae) were collected. Species were initially identified as Crenavolva aureola, Crenavolva chiapponii, Crenavolva striatula and Crenavolva trailli, respectively. For Crenavolva chiapponii this is the second record. In contrast to the ecological data available from the original description of this species, it was found in shallow water on a gorgonian host coral, i.e. Acanthogorgia sp. A molecular analysis based on COI and 16S mtDNA markers, including sequence data obtained from GenBank, showed that Crenavolva chiapponii should be considered a junior synonym of Crenavolva aureola and that previously identified ovulid specimens are probably misidentified.
    Matched MeSH terms: Databases, Nucleic Acid
  9. Coetzee MP, Wingfield BD, Bloomer P, Ridley GS, Wingfield MJ
    Mycologia, 2003 Mar-Apr;95(2):285-93.
    PMID: 21156614
    Armillaria root rot is a serious disease, chiefly of woody plants, caused by many species of Armillaria that occur in temperate, tropical and subtropical regions of the world. Very little is known about Armillaria in South America and Southeast Asia, although Armillaria root rot is well known in these areas. In this study, we consider previously unidentified isolates collected from trees with symptoms of Armillaria root rot in Chile, Indonesia and Malaysia. In addition, isolates from basidiocarps resembling A. novae-zelandiae and A. limonea, originating from Chile and Argentina, respectively, were included in this study because their true identity has been uncertain. All isolates in this study were compared, based on their similarity in ITS sequences with previously sequenced Armillaria species, and their phylogenetic relationship with species from the Southern Hemisphere was considered. ITS sequence data for Armillaria also were compared with those available at GenBank. Parsimony and distance analyses were conducted to determine the phylogenetic relationships between the unknown isolates and the species that showed high ITS sequence similarity. In addition, IGS-1 sequence data were obtained for some of the species to validate the trees obtained from the ITS data set. Results of this study showed that the ITS sequences of the isolates obtained from basidiocarps resembling A. novae-zelandiae are most similar to those for this species. ITS sequences for isolates from Indonesia and Malaysia had the highest similarity to A. novae-zelandiae but were phylogenetically separated from this species. Isolates from Chile, for which basidiocarps were not found, were similar in their ITS and IGS-1 sequences to the isolate from Argentina that resembled A. limonea. These isolates, however, had the highest ITS and IGS-1 sequence similarity to authentic isolates of A. luteobubalina and were phylogenetically more closely related to this species than to A. limonea.
    Matched MeSH terms: Databases, Nucleic Acid
  10. Matra DD, Ritonga AW, Natawijaya A, Poerwanto R, Sobir, Widodo WD, et al.
    Data Brief, 2019 Feb;22:332-335.
    PMID: 30596128 DOI: 10.1016/j.dib.2018.12.031
    Baccaurea motleyana Müll. Arg. (rambai) is one of the underutilized fruits native to Indonesia, Thailand, and the Malay Peninsula, and it is mostly cultivated on the island of Java (Lim, 2012). The edible parts of the fruit are white and reddish arillodes with a sweet to acid-sweet taste. However, nucleotide and transcriptome information for this species is still scarce, and no information has been deposited in GenBank. In this data article, we performed the first de novo assembly of its transcriptome using paired-end Illumina technology. The contigs were assembled using Trinity and, after filtering and clustering, totalled 37,077. Contig lengths ranged from 201 to 4,972 bp, with an N50 of 696 bp. The contigs were annotated against several databases, including SwissProt, TrEMBL, and the NCBI nr and nt databases. The raw reads were deposited in DDBJ under the DRA accession number DRA007358. The assembled transcriptome contigs are deposited in the DDBJ TSA under accession numbers IADP01000001-IADP01037077 and can also be accessed at http://rujakbase.id.
    Matched MeSH terms: Databases, Nucleic Acid
  11. Labrooy C, Abdullah TL, Stanslas J
    Data Brief, 2018 Dec;21:1678-1685.
    PMID: 30505900 DOI: 10.1016/j.dib.2018.10.097
    This study compared morphological and molecular data for identification of Kaempferia species. Voucher specimens of each species were deposited at the Institute of Bioscience (IBS), Universiti Putra Malaysia (UPM), and ITS sequences of each species were deposited in NCBI (https://www.ncbi.nlm.nih.gov/) as GenBank accessions. DNA was extracted using a modified CTAB method, and PCR amplification was completed using Internal Transcribed Spacer (ITS4 and ITS5) markers. PCR amplification products were viewed by gel electrophoresis. Sequencing was performed, and sequence characteristics of ITS rDNA in Kaempferia are shown. Qualitative and quantitative scoring of morphological characters and measuring techniques for Kaempferia species are included. In addition, a brief review of molecular markers used in phylogenetic studies of Zingiberaceae is included in this dataset.
    Matched MeSH terms: Databases, Nucleic Acid
  12. Govender N, Senan S, Mohamed-Hussein ZA, Ratnam W
    Genom Data, 2017 Sep;13:11-14.
    PMID: 28626637 DOI: 10.1016/j.gdata.2017.05.008
    Shoot and inflorescence are central physiological and developmental tissues of plants. Flowering is one of the most important agronomic traits for improvement of crop yield. To analyze the vegetative to reproductive tissue transition in Jatropha curcas, gene expression profiles were generated from shoot and inflorescence tissues. RNA isolated from both tissues was sequenced using the Illumina HiSeq 2500 platform. Differential gene expression analysis identified key biological processes associated with vegetative to reproductive tissue transition. The present data for J. curcas may inform the design of breeding strategies, particularly with respect to reproductive tissue transition. The raw data of this study have been deposited in the NCBI's Sequence Read Archive (SRA) database with the accession number SRP090662.
    Matched MeSH terms: Databases, Nucleic Acid
  13. Samad AFA, Sajad M, Jani J, Murad AMA, Ismail I
    Data Brief, 2018 Oct;20:555-557.
    PMID: 30197911 DOI: 10.1016/j.dib.2018.08.034
    Degradome sequencing, also referred to as parallel analysis of RNA ends (PARE), modifies 5'-rapid amplification of cDNA ends (RACE) with a deep sequencing method. Deep sequencing of 5' products allows the determination of cleavage sites through the mapping of degradome fragments against small RNAs (miRNA or siRNA) on a large scale. Here, we carried out degradome sequencing in the medicinal plant Persicaria minor to identify cleavage sites in small RNA libraries in control (mock-inoculated) and Fusarium oxysporum-treated plants. The degradome library consisted of both control and treated samples, which were pooled together during library preparation and named D4. The D4 dataset has been deposited at GenBank under accession number SRX3921398, https://www.ncbi.nlm.nih.gov/sra/SRX3921398.
    Matched MeSH terms: Databases, Nucleic Acid
  14. Teh KY, Afifudeen CLW, Aziz A, Wong LL, Loh SH, Cha TS
    Data Brief, 2019 Dec;27:104680.
    PMID: 31720332 DOI: 10.1016/j.dib.2019.104680
    Interest in harvesting potential benefits from microalgae makes it necessary to investigate the many ecological niches of a single species. This dataset comprises de novo whole genome assemblies of two mangrove-isolated microalgae (from division Chlorophyta): Chlorella vulgaris UMT-M1 and Messastrum gracile SE-MC4 from Universiti Malaysia Terengganu, Malaysia. Library runs were carried out with 2 × 150 base paired-end reads, and sequencing was conducted using the Illumina Novaseq 2500 platform. Sequencing yielded raw reads amounting to ∼11 Gb in total bases for both species, which were then assembled de novo. Genome assembly resulted in genome sizes of 50.15 Mbp and 60.83 Mbp for UMT-M1 and SE-MC4, respectively. All filtered and assembled genomic data sequences have been submitted to the National Center for Biotechnology Information (NCBI) and can be located at DDBJ/ENA/GenBank under the accessions VJNP00000000 (UMT-M1) and VIYE00000000 (SE-MC4).
    Matched MeSH terms: Databases, Nucleic Acid
  15. Rovie-Ryan JJ, Gani M, Lee YP, Gan HM, Abdullah MT
    Data Brief, 2019 Aug;25:104058.
    PMID: 31211204 DOI: 10.1016/j.dib.2019.104058
    This data article presents the first complete mitochondrial genome (mitogenome) of an endangered slow loris subspecies, Nycticebus coucang insularis Robinson, 1917 from Tioman Island, Pahang. Once considered as extinct, an individual of the subspecies was captured alive from the island during the 2016 Biodiversity Inventory Programme as highlighted in the related research article entitled "Rediscovery of Nycticebus coucang insularis Robinson, 1917 (Primates: Lorisidae) at Tioman Island and its mitochondrial genetic assessment" Rovie-Ryan et al., 2018. Using MiSeq™ sequencing system, the entire mitogenome recovered is 16,765 bp in length, made up of 13 protein-coding genes, two rRNA genes, 22 tRNA genes, and one control region. The mitogenome has been deposited at DDBJ/EMBL/GenBank under the accession number NC_040292.1/MG515246.
    Matched MeSH terms: Databases, Nucleic Acid
  16. Abd Gani R, Manaf SM, Zafarina Z, Panneerchelvam S, Chambers GK, Norazmi MN, et al.
    Transfus Apher Sci, 2015 Aug;53(1):69-73.
    PMID: 25819336 DOI: 10.1016/j.transci.2015.03.009
    In this study we genotyped ABO, Rhesus, Kell, Kidd and Duffy blood group loci in DNA samples from 120 unrelated individuals representing four Malay subethnic groups living in Peninsular Malaysia (Banjar: n = 30, Jawa: n = 30, Mandailing: n = 30 and Kelantan: n = 30). Analyses were performed using commercial polymerase chain reaction-sequence specific primer (PCR-SSP) typing kits (BAG Health Care GmbH, Lich, Germany). Overall, the present study has successfully compiled blood group datasets for the four Malay subethnic groups and used the datasets for studying ancestry and health.
    Matched MeSH terms: Databases, Nucleic Acid*
  17. Mohd Salleh F, Ramos-Madrigal J, Peñaloza F, Liu S, Mikkel-Holger SS, Riddhi PP, et al.
    Gigascience, 2017 Aug 01;6(8):1-8.
    PMID: 28873965 DOI: 10.1093/gigascience/gix053
    Southeast (SE) Asia is one of the most biodiverse regions in the world, and it holds approximately 20% of all mammal species. Despite this, the majority of SE Asia's genetic diversity is still poorly characterized. The growing interest in using environmental DNA to assess and monitor SE Asian species, in particular threatened mammals, has created the urgent need to expand the available reference database of mitochondrial barcode and complete mitogenome sequences. We have partially addressed this need by generating 72 new mitogenome sequences reconstructed from DNA isolated from a range of historical and modern tissue samples. Approximately 55 gigabases of raw sequence were generated. From these data, we assembled 72 complete mitogenome sequences, with an average depth of coverage of ×102.9 and ×55.2 for modern samples and historical samples, respectively. This dataset represents 52 species, of which 30 species had no previous mitogenome data available. The mitogenomes were geotagged to their sampling location, where known, to display a detailed geographical distribution of the species. Our new database of 52 taxa will strongly enhance the utility of environmental DNA approaches for monitoring mammals in SE Asia, as it greatly increases the likelihood that metabarcoding sequencing reads can be assigned to reference sequences. This magnifies the confidence in species detections and thus allows more robust surveys and monitoring programmes of SE Asia's threatened mammal biodiversity. The extensive collections of historical samples from SE Asia in western and SE Asian museums should serve as additional valuable material to further enrich this reference database.
    Matched MeSH terms: Databases, Nucleic Acid*
  18. Che Lah EF, Yaakop S, Ahamad M, Md Nor S
    Zookeys, 2015.
    PMID: 25685009 DOI: 10.3897/zookeys.478.8037
    Blood meal analysis (BMA) from ticks allows for the identification of natural hosts of ticks (Acari: Ixodidae). The aim of this study is to identify the blood meal sources of field collected on-host ticks using PCR analysis. DNA of four genera of ticks was isolated and their cytochrome b (Cyt b) gene was amplified to identify host blood meals. A phylogenetic tree was constructed based on data of Cyt b sequences using Neighbor Joining (NJ) and Maximum Parsimony (MP) analysis using MEGA 5.05 for the clustering of hosts of tick species. Twenty out of 27 samples showed maximum similarity (99%) with GenBank sequences through a Basic Local Alignment Search Tool (BLAST) search, while 7 samples showed a similarity range of only 91-98%. The phylogenetic trees showed that the blood meal samples were derived from small rodents (Leopoldamys sabanus, Rattus tiomanicus and Sundamys muelleri), shrews (Tupaia glis) and mammals (Tapirus indicus and Prionailurus bengalensis), supported by 82-88% bootstrap values. In this study, the Cyt b gene as a molecular target produced reliable results and was very significant for the effective identification of ticks' blood meals. The assay can be used as a tool for identifying unknown blood meals of field collected on-host ticks.
    Matched MeSH terms: Databases, Nucleic Acid
  19. Ho WS, Pang SL, Abdullah J
    Physiol Mol Biol Plants, 2014 Jul;20(3):393-7.
    PMID: 25049467 DOI: 10.1007/s12298-014-0230-x
    The large-scale genomic resource for kelampayan was generated from a developing xylem cDNA library. A total of 6,622 high quality expressed sequence tags (ESTs) were generated through high-throughput 5' EST sequencing of cDNA clones. The ESTs were analyzed and assembled to generate 4,728 xylogenesis unigenes distributed in 2,100 contigs and 2,628 singletons. About 59.3 % of the ESTs were assigned with putative identifications whereas 40.7 % of the sequences showed no significant similarity to any sequences in GenBank. Interestingly, most genes involved in lignin biosynthesis and several other cell wall biosynthesis genes were identified in the kelampayan EST database. The identified genes in this study will be candidates for functional genomics and association genetic studies in kelampayan aiming at the production of high value forests.
    Matched MeSH terms: Databases, Nucleic Acid
  20. Mastor NN, Subbiah VK, Bakar WNWA, Begum K, Alam MJ, Hoque MZ
    Data Brief, 2020 Dec;33:106370.
    PMID: 33102652 DOI: 10.1016/j.dib.2020.106370
    Enterococcus gallinarum is a Gram-positive, facultatively anaerobic bacterium that is typically found in mammalian intestinal tracts. It is generally not considered pathogenic to humans and is rarely reported. Here, we present the draft genome sequence data of Enterococcus gallinarum strain EGR748 isolated from a human clinical sample, and sequenced using the Illumina HiSeq 4000 system. The estimated whole genome size of the strain was 3,730,000 bp with a G + C content of 40.43%. The de novo assembly of the genome generated 55 contigs with an N50 of 208,509 bp. In addition, the Maximum Likelihood phylogenetic analysis based on the 16S rRNA sequence data accurately clustered EGR748 with other E. gallinarum strains. The data may be useful to demonstrate the capacity of this enterococcal species to become a causal agent of nosocomial bloodstream infections. The genome dataset has been deposited at DDBJ/ENA/GenBank under the accession number JAABOR000000000.
    Matched MeSH terms: Databases, Nucleic Acid
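
Note on N50: two of the assemblies listed above report an N50 (696 bp in entry 10 and 208,509 bp in entry 20). As a reader's aid, here is a minimal Python sketch of one common way to compute N50: sort contig lengths from longest to shortest and return the length of the contig at which the running total first reaches half of the total assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from any listed study, and individual assemblers may use slightly different conventions.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is taken here as the length of the contig at which the cumulative
    sum, counted from the longest contig downwards, first reaches at least
    half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly; these lengths are made up for illustration.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # prints 80 (100 + 80 = 180 >= 150, half of 300)
```

A larger N50 generally indicates a more contiguous assembly, but the value is only comparable between assemblies of similar total size.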