Deep Learning for Forensic Identification of Source

Cole Patten, Christopher Saunders, Michael Puthawala

TL;DR

This work addresses forensic source identification under a common-but-unknown source setting for cartridge casings by learning similarity scores with contrastive neural networks. Trained on the E3 dataset and evaluated on NBIDE, the method achieves an ROC AUC of 0.892, exceeding the CMC baseline at 0.867, and shows robustness across architectural ablations. The findings suggest that contrastive learning can more efficiently support evidence interpretation and may reduce calibration needs, while highlighting avenues for incorporating additional impressions and larger datasets to further improve performance. Practically, this approach offers faster similarity computations and the potential to augment traditional forensic methods with learned similarity scores.

Abstract

We used contrastive neural networks to learn useful similarity scores between the 144 cartridge casings in the NBIDE dataset, under the common-but-unknown source paradigm. The common-but-unknown source problem is an archetype in forensics where the question is whether two objects share a common source (e.g., whether two cartridge casings were fired from the same firearm). Similarity scores are often used to interpret evidence under this paradigm. We directly compared our results to a state-of-the-art algorithm, Congruent Matching Cells (CMC). When trained on the E3 dataset of 2967 cartridge casings, contrastive learning achieved an ROC AUC of 0.892, whereas the CMC algorithm achieved 0.867. We also conducted an ablation study in which we varied the neural network architecture, specifically the network's width or depth. The ablation study showed that the contrastive network's performance is somewhat robust to the network architecture. This work was motivated in part by the use of similarity scores obtained via contrastive learning in standard evidence interpretation methods such as score-based likelihood ratios.

Paper Structure

This paper contains 22 sections, 3 equations, 56 figures.

Figures (56)

  • Figure 1: Graph depicting state-of-the-art classification accuracy of neural networks on the ImageNet database. (Source: paperswithcode)
  • Figure 2: Image showing the A) firing pin impression, B) ejector mark, and C) breech face impression. Although all regions of the cartridge casing may be utilized as forensic evidence, the CMC algorithm is designed to operate exclusively on the breech face. Therefore, the breech face of the cartridge casing is the only region considered in this work. (Source: vorburger2007surface)
  • Figure 3: (Left) The shared architecture of all neural networks used in the study. Models varied in the composition of their residual blocks and specific choices of $n$. (Right) The first 5x5 residual block in our top-performing Reference Model. In both diagrams, the bubbles to the right of the model show the dimensions of a cartridge casing scan as it passes through the network.
  • Figure 4: Raw data may be scattered throughout its ambient data space. For example, 224x224 images can be thought of as existing in $\mathbb{R}^{224\times224}$. In this space, images of dogs and cats may not be easily separated. The goal of contrastive learning is to learn an embedding function, $f$, that maps the data from its chaotic ambient space to an embedding space where there is more order. In the dogs-and-cats example, the increased order we want is for images of dogs and images of cats to form their own distinct clusters.
  • Figure 6: Each model configuration was trained for 20,000 epochs on the E3 dataset. Every 20 epochs, the ROC AUC was recorded. A sliding average was taken over every 200 epochs, and the maximum attained average is reported here. For each model configuration, this process was repeated for a total of 5 runs.
  • ...and 51 more figures
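
The reporting procedure described in the Figure 6 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `best_sliding_average` and the `window` parameter are hypothetical, and the default window of 10 records assumes that "every 200 epochs" corresponds to 10 consecutive ROC AUC values recorded 20 epochs apart.

```python
def best_sliding_average(auc_history, window=10):
    """Return the maximum sliding-window mean of recorded ROC AUC values.

    auc_history: ROC AUC values recorded at a fixed epoch interval
                 (assumed here: one value every 20 epochs).
    window:      number of consecutive records per average
                 (assumed here: 200 epochs / 20 epochs per record = 10).
    """
    if window < 1 or len(auc_history) < window:
        raise ValueError("need at least `window` recorded values")
    # Mean of every full window of consecutive records.
    means = [
        sum(auc_history[start:start + window]) / window
        for start in range(len(auc_history) - window + 1)
    ]
    return max(means)


if __name__ == "__main__":
    # Toy history (made-up numbers, not results from the paper).
    toy_history = [0.50, 0.60, 0.70, 0.80, 0.75, 0.70]
    print(best_sliding_average(toy_history, window=2))
```

Averaging before taking the maximum makes the reported number less sensitive to a single lucky evaluation than the raw peak would be, which is presumably why the caption describes a sliding average.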

Theorems & Definitions (2)

  • Definition 1: SupCon Loss
  • Definition 2: ROC AUC
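
The two definitions listed above are not reproduced on this page, so the following is only a sketch of the standard forms of these quantities, not the paper's own statements. `supcon_loss` follows the usual supervised contrastive loss (for each anchor, the mean negative log-probability of its same-label positives under a temperature-scaled softmax over all other samples), and `roc_auc` uses the rank-based interpretation (the probability that a same-source pair scores higher than a different-source pair, with ties counted as one half). The function names, the default temperature of 0.1, and the averaging over anchors are assumptions for illustration.

```python
import math


def _l2_normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss in its standard form (assumed, not the paper's)."""
    z = [_l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n) if a != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits.values())
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    if anchors == 0:
        raise ValueError("no anchor has a same-label positive")
    return total / anchors


def roc_auc(scores, same_source):
    """Probability that a same-source pair outscores a different-source pair."""
    pos = [s for s, y in zip(scores, same_source) if y]
    neg = [s for s, y in zip(scores, same_source) if not y]
    if not pos or not neg:
        raise ValueError("need at least one pair of each kind")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In the common-but-unknown source setting, `scores` would hold one similarity score per pair of casings and `same_source` would flag whether the pair came from the same firearm; an ROC AUC of 1.0 means every same-source pair outscores every different-source pair, and 0.5 corresponds to chance.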