A study of about 500 expressed sequence tags (ESTs), derived from a merozoite cDNA library, was initiated as an approach to generate a larger pool of gene information on Eimeria tenella. Of the ESTs, 47.7% had matches with entries in the databases, including ribosomal proteins, metabolic enzymes and proteins with other functions, of which 14.3% represented previously known E. tenella genes. Thus over 50% of the ESTs had no significant database matches. The E. tenella EST dataset contained a range of highly abundant genes comparable with that found in the EST dataset of T. gondii and may thus reflect the importance of such molecules in the biology of the apicomplexan organisms. However, comparison of the two datasets revealed very few homologies between sequences of apical organelle molecules, and provides evidence for sequence divergence between these closely-related parasites. The data presented underpin the potential value of the EST strategy for the discovery of novel genes and may allow for a more rapid increase in the knowledge and understanding of gene expression in the merozoite life cycle stage of Eimeria spp.
The Asian seabass (Lates calcarifer) is one of the most economically important aquaculture fish species in South East Asia. While the biology of the Asian seabass is widely studied, relatively little information is available at the molecular level. This lack of molecular information represents one obstacle to rapid progress in the study of immune responses, particularly under aquaculture conditions. In light of this situation, we have undertaken an expressed sequence tag (EST) project on the Asian seabass spleen, a secondary lymphoid organ, for the identification of immune-related genes. A total of 2932 ESTs were generated and grouped into 1063 unique transcripts (UTs), which consisted of 104 consensi and 959 singletons. Of these, 51.3% (545/1063) matched previously identified genes, while 48.7% (518/1063) showed no match. Of the 545 homologous UTs, 102 (9.6% of all UTs) can be putatively identified as immune-related genes. The identification of the putative immune-related genes provides a meaningful framework for the effort to comprehend the Asian seabass immune system, which may improve our understanding of its defense mechanisms and our ability to manage this fish species.
Clustering is a key step in the processing of expressed sequence tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt alignment-free distance measures, which tend to yield acceptable clustering accuracy in reasonable computational time. Although these clustering methods work satisfactorily on the majority of EST datasets, they share a common weakness: they are prone to deliver unsatisfactory clustering results when dealing with ESTs from genes of the same family. The root cause is that the distance measures they apply are not sensitive enough to separate these closely related genes.
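As an illustration of the kind of alignment-free measure referred to above, the following is a minimal sketch of a word-count (k-mer) distance in Python. It is not the method of any particular clustering tool; the function names, the word length k=3 and the toy sequences are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every overlapping word of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(seq_a, seq_b, k=3):
    """Squared Euclidean distance between the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    words = set(counts_a) | set(counts_b)
    return sum((counts_a[w] - counts_b[w]) ** 2 for w in words)


# Hypothetical toy sequences: two near-identical "family members" and one unrelated read.
member_1 = "ACGTACGTGGCA"
member_2 = "ACGTACGTGGCT"  # differs from member_1 at the last base only
unrelated = "TTTTTTTTTTTT"

close = kmer_distance(member_1, member_2)
far = kmer_distance(member_1, unrelated)
```

Two sequences that differ at a single base, as members of one gene family might, receive a distance close to zero, while the unrelated read is far away. A threshold loose enough to tolerate sequencing errors would therefore also merge such family members, which is the sensitivity problem described above.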
Vanda Mimi Palmer (VMP) is a highly sought-after fragrant orchid hybrid in Malaysia. It is economically important in the cosmetic and beauty industries and is also a well-known potted ornamental plant. To date, no work on fragrance-related genes of vandaceous orchids has been reported by other research groups, although floral fragrance and volatiles have been extensively analysed. An expressed sequence tag (EST) resource was developed for VMP, principally to mine potential fragrance-related expressed sequence tag-simple sequence repeats (EST-SSRs) for future development as markers for the identification of fragrant vandaceous orchids endemic to Malaysia. Clustering, annotation and assembly of the ESTs identified 1,196 unigenes, comprising 966 singletons and 230 contigs. The VMP dbEST was functionally classified by gene ontology (GO) into three groups: molecular functions (51.2%), cellular components (16.4%) and biological processes (24.6%), while the remaining 7.8% showed no hits with a GO identifier. A total of 112 EST-SSRs (9.4%) were mined, in which at least five units of di-, tri-, tetra-, penta- or hexa-nucleotide repeats were predicted. Di-nucleotide repeats were the most frequent among the detected SSRs, with AT/TA the most abundant dimeric type, while AAG/TTC and AGA/TCT were the most frequent trimeric types. The mined EST-SSRs are expected to be useful in the development of EST-SSR markers applicable to the screening and characterization of fragrance-related transcripts in closely related species.
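The mining criterion above (at least five units of a di- to hexa-nucleotide motif) can be sketched with a regular expression. This is an illustrative Python sketch, not the tool used in the study; the function name find_ssrs and the example sequences are hypothetical.

```python
import re


def find_ssrs(seq, min_units=5):
    """Return (motif, units, start) for each perfect SSR with a 2-6 nt motif.

    The lazy quantifier makes the regex try the shortest motif first, so a
    run of "AT" is reported as a dinucleotide rather than as "ATAT".
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        units = len(match.group(0)) // len(motif)
        hits.append((motif, units, match.start()))
    return hits


# Hypothetical example sequences.
dinucleotide = find_ssrs("GGC" + "AT" * 5 + "CCG")
trinucleotide = find_ssrs("AAG" * 6)
too_short = find_ssrs("C" + "AT" * 4 + "G")
homopolymer = find_ssrs("A" * 20)
```

The sketch only finds perfect repeats; dedicated SSR miners also handle compound and interrupted repeats, which are outside the scope of this example.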
The green microalga Ankistrodesmus convolutus Corda is a fast-growing alga that produces appreciable amounts of carotenoids and polyunsaturated fatty acids. To our knowledge, this is the first report on the construction of a cDNA library and preliminary analysis of ESTs for this species. The titers of the primary and amplified cDNA libraries were 1.1×10^6 and 6.0×10^9 pfu/ml, respectively. The percentage of recombinants was 97% in the primary library, and 337 of 415 randomly selected original cDNA clones contained inserts ranging from 600 to 1,500 bp. A total of 201 individual ESTs with sizes ranging from 390 to 1,038 bp were then analyzed, and the BLASTX scores classified 35.8% of the sequences as strong matches, 38.3% as nominal matches and 25.9% as weak matches. By putative function, 21.4% of the ESTs were related to gene expression, 14.4% to photosynthesis, 10.9% to metabolism, 5.5% to miscellaneous functions and 2.0% to stress response, and the remaining 45.8% were classified as novel genes. The EST analysis described in this paper can be an effective approach to isolating and characterizing new genes from A. convolutus, and the sequences obtained represent a significant contribution to the extensive database of sequences from green microalgae.
The Malaysian giant prawn is among the most commonly cultured species of the genus Macrobrachium. Stocks of giant prawns from four rivers in Peninsular Malaysia have been used for aquaculture over the past 25 years, which has led to repeated harvesting, restocking, and transplantation between rivers. Consequently, a stock improvement program is now important to avoid the depletion of wild stocks and the loss of genetic diversity. However, the success of such an improvement program depends on our knowledge of the genetic variation of these base populations. The aim of the current study was to estimate genetic variation and differentiation of these riverine sources using novel expressed sequence tag-microsatellite (EST-SSR) markers, which are not only informative on genetic diversity but also provide information on immune and metabolic traits. Our findings indicated that the tested stocks show inbreeding depression due to a significant deficiency in heterozygotes, with FIS estimated at 0.15538 to 0.31938. An F-statistics analysis suggested that the stocks are composed of one large panmictic population. Among the four locations, stocks from Johor, in the southern region of the peninsula, showed higher allelic and genetic diversity than the other stocks. To overcome inbreeding problems, the Johor population could be used as a base population in a stock improvement program by crossing it with the other populations. The study demonstrated that EST-SSR markers can be incorporated in future marker-assisted breeding to aid the proper management of the stocks by breeders and stakeholders in Malaysia.
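For reference, the inbreeding coefficient FIS compares observed heterozygosity (Ho) with that expected under random mating (He): FIS = 1 - Ho/He, so a deficiency of heterozygotes gives a positive value. A minimal single-locus sketch in Python follows. It uses the uncorrected estimate He = 1 - sum(p_i^2) and ignores the small-sample corrections that population genetics packages apply; the function name and the genotype data are invented for illustration and are not from the study.

```python
from collections import Counter


def f_is(genotypes):
    """Estimate FIS = 1 - Ho/He at one locus from a list of (allele, allele) pairs."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity from allele frequencies (no small-sample correction).
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    h_exp = 1.0 - sum((count / total) ** 2 for count in allele_counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic; FIS is undefined")
    return 1.0 - h_obs / h_exp


# Hypothetical genotype samples.
no_heterozygotes = f_is([("A", "A"), ("B", "B")])
hardy_weinberg = f_is([("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")])
all_heterozygotes = f_is([("A", "B"), ("A", "B")])
```

With these toy data the estimate is 1 when heterozygotes are absent, 0 when genotype proportions match random mating, and negative when heterozygotes are in excess.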
Common beans (Phaseolus vulgaris L.) are widely consumed as a source of proteins and natural products. However, their yield needs to be increased. In line with the agenda of Phaseomics (an international consortium), work on generating expressed sequence tags (ESTs) from bean pods was initiated. Altogether, 5972 ESTs were isolated. A cDNA encoding alcohol dehydrogenase (AD) was a notable transcript among the generated ESTs. Because AD is an important enzyme, this study was undertaken to understand it better.
Functional genomics has proven to be an efficient tool for identifying genes involved in various biological functions. However, functional genomic resources for the commercially important seaweed Eucheuma denticulatum are still limited. EuDBase is the first online seaweed repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stores 10,031 ESTs that were clustered and assembled into 2,275 unique transcripts (UTs) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. The data are stored in a MySQL database. The web interface is implemented in PHP and allows browsing and querying of EuDBase through a search engine. Data are searchable by BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows sequences to be identified using either a simple text query or a similarity search. EuDBase was developed to store, manage and analyze the E. denticulatum ESTs and to provide a cumulative digital resource for the global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/.
Simple sequence repeats (SSRs) derived from expressed sequence tags (ESTs) are valuable markers because they represent transcribed regions and are often transferable to related taxa. Here, we report the development and characterization of EST-SSRs from Shorea leprosula. Fifty-four sequences containing SSRs were identified in 2003 unigenes assembled from 3159 ESTs. Twenty-four EST-SSRs were developed, of which four gave multiple amplifications, five were found to be monomorphic and 15 showed polymorphism, with allele numbers ranging from two to 17 in a single Pasoh Forest Reserve population of 24 individuals. The observed and expected heterozygosities ranged from 0.05 to 0.91 and from 0.16 to 0.93, respectively. Tests of cross-species transferability of the 15 loci to 36 species within Dipterocarpaceae revealed that between four and 14 loci gave positive amplification, and 10 loci were transferable to more than 15 species.
The mechanisms of amino acid uptake and production in psychrophilic microorganisms that survive and proliferate in extremely cold environments are still not fully understood. The objective of this study was to identify genes involved in amino acid generation in the psychrophilic yeast Glaciozyma antarctica and to determine the expression of these genes in the presence and absence of amino acids in the growth medium. Gene identification was carried out by generating expressed sequence tags (ESTs) from two cDNA libraries constructed from cells cultured in complex growth medium and in minimal growth medium without amino acids. A total of 3552 cDNA clones from each library were randomly selected for sequencing, yielding 1492 unique transcripts (complex medium) and 1928 unique transcripts (minimal medium). Homology analysis identified genes encoding proteins involved in free amino acid uptake, amino acid biosynthesis and amino acid recycling, based on the pathways used by the model yeast Saccharomyces cerevisiae. Gene expression analysis using RT-qPCR showed that expression of genes encoding proteins involved in free amino acid uptake, namely permeases, was high in complex medium, whereas expression of most genes encoding proteins involved in amino acid recycling and biosynthesis was high in minimal medium. In conclusion, the genes involved in amino acid generation and uptake in psychrophilic microorganisms are conserved, as in mesophilic microorganisms, and the expression of these genes is induced by the presence or absence of free amino acids in the environment.
Chitin is a structural polysaccharide that can be degraded by chitinolytic enzymes into various derivatives that can be used in medicine, agriculture and water treatment. Identification and characterization of Trichoderma virens UKM1 genes encoding enzymes involved in the degradation of crustacean chitin were carried out through the generation of expressed sequence tags (ESTs) and gene expression analysis using DNA microarrays. Three T. virens UKM1 cDNA libraries, induced by chitin, glucosamine and chitosan respectively, were constructed. A total of 1536 cDNA clones were sequenced and 1033 good-quality ESTs were generated. Subsequently, differences in gene expression between fungal growth induced in the presence of crustacean chitin and growth without chitin were determined on the third and fifth days. A total of 1824 cDNA clones were spotted onto glass slides and hybridized with Cy3- or Cy5-labelled cDNA synthesized from mRNA isolated from the fungus grown in medium containing crustacean chitin or glucose (control). A total of 91 and 61 genes, for the third and fifth days respectively, were found to be expressed more than two-fold when the fungus used crustacean chitin as a carbon source. Several genes encoding chitinases, such as ech1 and cht3 (endochitinase), nag1 (exochitinase) and nagB (glucosamine-6-P deaminase), were found to be highly expressed on both days. In addition, genes encoding hydrophobin, serine protease and several hypothetical proteins were also highly expressed in the presence of crustacean chitin. These proteins are expected to play important roles in assisting the degradation of crustacean chitin.
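The two-fold criterion above can be expressed as a ratio of the two dye intensities for each spot. The sketch below is a minimal Python illustration, assuming background-corrected, normalized intensities and treating the chitin sample as the numerator. Which dye labelled which sample, the function name and the intensity values are all assumptions, not details from the study.

```python
import math


def upregulated(spots, min_fold=2.0):
    """Return {spot_id: log2 ratio} for spots whose chitin/control ratio exceeds min_fold.

    `spots` maps a spot id to a (chitin_intensity, control_intensity) pair.
    Spots with a non-positive intensity in either channel are skipped.
    """
    result = {}
    for spot_id, (chitin, control) in spots.items():
        if chitin <= 0 or control <= 0:
            continue
        ratio = chitin / control
        if ratio > min_fold:
            result[spot_id] = math.log2(ratio)
    return result


# Invented intensities for three hypothetical spots.
spots = {
    "spot_1": (4000.0, 1000.0),  # 4-fold higher with chitin
    "spot_2": (1500.0, 1000.0),  # 1.5-fold, below the cut-off
    "spot_3": (0.0, 800.0),      # unusable spot, skipped
}
hits = upregulated(spots)
```

Working on the log2 scale makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric, which is why microarray ratios are usually reported that way.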
P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites, such as flavonoids, that act as active ingredients. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We constructed a standard cDNA library from P. minus leaves and two normalized full-length enriched cDNA libraries from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoids. Large-scale sequencing of the P. minus cDNA libraries identified 4196 expressed sequence tags (ESTs), which were deposited in dbEST at the National Center for Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase (CHS; JG745304), flavonol synthase (FLS; JG705819) and leucoanthocyanidin dioxygenase (LDOX; JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.
This study reports on the detection of additional expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers for the oil palm. A large collection of 19243 Elaeis guineensis ESTs was assembled to give 10258 unique sequences, of which 629 ESTs were found to contain 722 SSRs with a variety of motifs. Dinucleotide repeats formed the largest group (45.6%), consisting of 66.9% AG/CT, 21.9% AT/AT, 10.9% AC/GT and 0.3% CG/CG motifs. Trinucleotide repeats were the second most abundant repeat type (34.5%), consisting of AAG/CTT (23.3%), AGG/CCT (13.7%), CCG/CGG (11.2%), AAT/ATT (10.8%), AGC/GCT (10.0%), ACT/AGT (8.8%), ACG/CGT (7.6%), ACC/GGT (7.2%), AAC/GTT (3.6%) and AGT/ACT (3.6%) motifs. Primer pairs were designed for 405 unique EST-SSRs, and 15 of these were used to genotype 105 E. guineensis and 30 E. oleifera accessions. Fourteen SSRs were polymorphic in at least one germplasm, revealing a total of 101 alleles. The high percentage (78.0%) of alleles found to be specific to either E. guineensis or E. oleifera has increased the power for discriminating the two species. The estimates of genetic differentiation detected by EST-SSRs were compared to those reported previously. The transferability across palm taxa to two Cocos nucifera and six exotic palms is also presented. The polymerase chain reaction (PCR) products of three primer pairs detected in E. guineensis, E. oleifera, C. nucifera and Jessinia bataua were cloned and sequenced. Sequence alignments showed mutations within the SSR site and the flanking regions. Phenetic analysis based on the sequence data revealed that C. nucifera is closer to oil palm than J. bataua is, consistent with the taxonomic classification.
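Motif labels such as AG/CT group a repeat with its reverse complement, because the same SSR reads as CT on the opposite strand and as GA or TC when the reading frame is shifted. A minimal Python sketch of this grouping is given below; the function names are hypothetical, and choosing the alphabetically smallest form as the representative is an assumed convention rather than one stated in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Alphabetically smallest rotation of the motif or of its reverse complement."""
    motif = motif.upper()
    candidates = []
    for strand in (motif, reverse_complement(motif)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


def motif_class(motif):
    """Label such as 'AG/CT': the canonical motif and its reverse complement."""
    canon = canonical_motif(motif)
    return f"{canon}/{reverse_complement(canon)}"


# Every shifted or reverse-strand form of a repeat maps to the same class label.
labels = [motif_class(m) for m in ("CT", "GA", "TA", "GC", "TTC")]
```

Under this convention AT and CG are their own reverse complements, which is why they appear as AT/AT and CG/CG, and any two motifs that are rotations or reverse complements of each other, such as ACT and AGT, receive the same label.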
Using a novel library of 5637 expressed sequence tags (ESTs) from the brain tissue of the Asian seabass (Lates calcarifer), we first characterized the brain transcriptome for this economically important species. The ESTs generated from the brain of L. calcarifer yielded 2410 unique transcripts (UTs), which comprise 982 consensi and 1428 singletons. Based on database similarity, 1005 UTs (41.7%) can be assigned putative functions and were grouped into 12 functional categories related to brain function. Amongst others, we have identified genes that are putatively involved in energy metabolism, ion pumps and channels, synapse-related genes, neurotransmitters and their receptors, stress-induced genes and hormone-related genes. Subsequently, we selected a putative preprocGnRH-II precursor for further characterization. The complete cDNA sequence of the gene was found to code for an 85-amino acid polypeptide that significantly matched preprocGnRH-II precursor sequences from other vertebrates and possesses structural characteristics similar to those of other species, consisting of a signal peptide (23 residues), a GnRH decapeptide (10 residues), an amidation/proteolytic-processing signal (glycine-lysine-arginine) and a GnRH-associated peptide (GAP) (49 residues). Phylogenetic analysis showed that this putative L. calcarifer preprocGnRH-II sequence is a member of the subcohort Euteleostei and divergent from the sequences of the subcohort Otocephala. These findings provide compelling evidence that the putative L. calcarifer preprocGnRH-II precursor obtained in this study is orthologous to that of other vertebrates. The functional prediction of this preprocGnRH-II precursor sequence through in silico analyses emphasizes the effectiveness of the EST approach in gene identification in L. calcarifer.
The protozoan parasite Eimeria tenella has a complex life cycle that includes two major asexual developmental stages, the merozoite and the sporozoite. The expressed sequence tag (EST) approach has been previously used to study gene expression of merozoites. We report here the generation and analysis of 556 ESTs from sporozoites. Comparative analyses of the two datasets reveal a number of transcripts that are preferentially expressed in a specific stage, including previously uncharacterised sequences. The data presented indicate the invaluable potential of the comparative EST analysis for providing information on gene expression patterns in the different developmental stages of E. tenella.
Common bean (Phaseolus vulgaris L.) is an important part of the human diet and serves as a source of natural products. Identifying and understanding the genes of P. vulgaris is important for its improvement. Characterization of expressed sequence tags (ESTs) is one approach to understanding the expressed genes. To understand gene expression in P. vulgaris pod tissue, EST generation was initiated by constructing cDNA libraries from 5-day-old and 20-day-old bean pod tissues. Altogether, 5972 cDNA clones were isolated to obtain ESTs. While processing the ESTs, we found a transcript of the calmodulin (CaM) gene. It is an important gene that encodes a calcium-binding protein and is known to be expressed in all eukaryotic cells. Hence, this study was undertaken to analyse and annotate it.
Pangolins (order Pholidota) are the only mammals covered by scales. We have recently sequenced and analyzed the genomes of two critically endangered Asian pangolin species, namely the Malayan pangolin (Manis javanica) and the Chinese pangolin (Manis pentadactyla). These complete genome sequences will serve as reference sequences for future research to address issues of species conservation and to advance knowledge in mammalian biology and evolution. To further facilitate the global research effort in pangolin biology, we developed the Pangolin Genome Database (PGD), as a future hub for hosting pangolin genomic and transcriptomic data and annotations, and with useful analysis tools for the research community. Currently, the PGD provides the reference pangolin genome and transcriptome data, gene sequences and functional information, expressed transcripts, pseudogenes, genomic variations, organ-specific expression data and other useful annotations. We anticipate that the PGD will be an invaluable platform for researchers who are interested in pangolin and mammalian research. We will continue updating this hub by including more data, annotation and analysis tools particularly from our research consortium. Database URL: http://pangolin-genome.um.edu.my.
We previously identified an expressed sequence tag clone, Der f 22, showing 41% amino acid identity to published Der f 2, and showed that the two genes are possible paralogues. The objective of this study was to characterize the genomic, proteomic and immunological functions of Der f 22 and Der f 2. The full-length sequences of Der f 2 and Der f 22 coded for mature proteins of 129 and 135 amino acids, respectively, both containing six cysteine residues. Phylogenetic analysis of known group 2 allergens and their homologues from our expressed sequence tag library showed that Der f 22 is a paralogue of Der f 2. Both Der f 2 and Der f 22 were single-gene products with one intron. Both allergens showed specific IgE binding in over 40% of the atopic patients, with limited cross-reactivity. Both allergens were detected in the gut region of D. farinae by immunostaining. Der f 22 is an important allergen with significant IgE reactivity among the atopic population, and should be considered for the diagnostic panel and evaluated as a future hypoallergen vaccine therapeutic target.
Improving the quality of the non-climacteric fruit, pineapple, is possible with information on the genes expressed during fruit ripening. This information can be obtained through the generation of partial mRNA transcript sequences known as expressed sequence tags (ESTs). ESTs are useful not only for gene discovery but also as a resource for the identification of molecular markers, such as simple sequence repeats (SSRs). This paper reports, firstly, the construction of a normalized library of the mature green pineapple fruit and, secondly, the mining of EST-SSR markers using the newly obtained pineapple ESTs as well as publicly available pineapple ESTs deposited in GenBank. Sequencing of the clones from the EST library resulted in 282 good sequences. Assembly of the sequences generated 168 unique transcripts (UTs) consisting of 34 contigs and 134 singletons with an average length of ≈500 bp. Annotation of the UTs categorized the transcripts of known proteins into the three ontologies: molecular function (34.88%), biological process (38.43%) and cellular component (26.69%). Approximately 7% (416) of the pineapple ESTs contained SSRs, with an abundance of trinucleotide SSRs (48.3%). This was followed by dinucleotide and tetranucleotide SSRs with frequencies of 46 and 57%, respectively. Of these SSR-containing ESTs, 355 (85.3%) matched known proteins, while 133 contained flanking regions for primer design. Both the sequenced ESTs and the mined EST-SSRs will be useful in the understanding of non-climacteric ripening and the screening of biomarkers linked to fruit quality traits.
Cryptocaryon irritans is a parasitic ciliate that causes cryptocaryonosis (white spot disease) in marine fish. Diagnosis of cryptocaryonosis often depends on the appearance of white spots on the surface of the fish, which are usually visible only during later stages of the disease. Identifying suitable biomarkers of this parasite would aid the development of diagnostic tools and control strategies for C. irritans. The C. irritans genome is virtually unexplored; therefore, we generated and analyzed expressed sequence tags (ESTs) of the parasite to identify genes that encode surface proteins, excretory/secretory proteins and repeat-containing proteins.