Affiliations 

  • 1 Forensic Science Program, CODTIS, Faculty of Health Science, Universiti Kebangsaan Malaysia, Selangor, Malaysia
  • 2 Fire Investigation Laboratory, Fire Investigation Division, Fire and Rescue Department of Selangor, Selangor, Malaysia
Forensic Sci Res, 2023 Sep;8(3):249-255.
PMID: 38221967 DOI: 10.1093/fsr/owad031

Abstract

Fire debris analysis aims to detect and identify any ignitable liquid residues in burnt residues collected at a fire scene. Typically, the burnt residues are analysed using gas chromatography-mass spectrometry (GC-MS) and the results are interpreted manually. The interpretation process can be laborious because of the complexity and high dimensionality of the GC-MS data. Therefore, this study compares the potential of classification and regression tree (CART) and naïve Bayes (NB) algorithms for analysing the pixel-level GC-MS data of fire debris. The data comprise 14 positive (i.e. fire debris with traces of gasoline) and 24 negative (i.e. fire debris without traces of gasoline) samples. The differences between the positive and negative samples were first inspected using the mean chromatograms and the scores plots of principal component analysis. Then, the CART and NB algorithms were independently applied to the GC-MS data. Stratified random resampling was used to prepare three sets of 200 pairs of training and testing samples (one set for each split ratio of 7:3, 8:2, and 9:1) for estimating the prediction accuracies. Although the positive and negative samples could hardly be differentiated based on the mean chromatograms and the scores plots of principal component analysis, the NB and CART predictive models both performed satisfactorily with the normalized GC-MS data, i.e. the majority achieved a prediction accuracy >70%. NB consistently outperformed CART in terms of the prediction accuracies of the testing samples and the corresponding risk of overfitting, except when only 10% of the samples were used for testing. The accuracy of CART was found to be inversely proportional to the number of testing samples, whereas NB performed rather consistently across the three split ratios. In conclusion, NB appears to be much better than CART in terms of robustness to the number of testing samples and a consistently lower risk of overfitting.
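The stratified random resampling described in the abstract can be illustrated with a minimal sketch. It uses only the class sizes given above (14 positive, 24 negative), the 200 repetitions, and the three split ratios. Everything else is an assumption made for illustration: the function names `stratified_split` and `repeated_splits`, the per-class rounding rule, and the fixed seed are hypothetical, and the abstract does not state which software or rounding the authors used. In a full workflow, a CART or NB model would be fitted on each training set and scored on the paired testing set.

```python
import random


def stratified_split(labels, test_fraction, rng):
    """Split sample indices into train/test, keeping class proportions.

    The per-class test size uses simple rounding with a minimum of one
    sample; this rule is an assumption, not taken from the study.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_fraction))
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return sorted(train_idx), sorted(test_idx)


def repeated_splits(labels, test_fraction, n_repeats=200, seed=0):
    """Generate n_repeats independent stratified train/test pairs."""
    rng = random.Random(seed)  # fixed seed only for reproducibility here
    return [stratified_split(labels, test_fraction, rng)
            for _ in range(n_repeats)]


# Class sizes from the abstract: 14 positive (1) and 24 negative (0).
labels = [1] * 14 + [0] * 24

# One set of 200 train/test pairs per split ratio.
splits = {
    ratio: repeated_splits(labels, fraction)
    for ratio, fraction in [("7:3", 0.3), ("8:2", 0.2), ("9:1", 0.1)]
}
```

With only 38 samples, the 9:1 ratio leaves very few testing samples per repetition under this rounding rule, which is one reason accuracy estimates at that ratio can be more variable.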
