Demonstrative Evidence and the Use of Algorithms in Jury Trials

Rachel Rogers, Susan VanderPlas

TL;DR

This work addresses how algorithmic bullet matching and demonstrative evidence influence juror perceptions of reliability, credibility, and understanding in forensic testimony. It implements a factorial online study varying examiner conclusion, algorithm testimony, and demonstrative evidence to measure effects on credibility, reliability, scientificity, understanding, probability judgments, and guilt decisions. A key finding is pervasive scale compression on Likert responses, which obscures potential effects of the algorithm and visuals, while examiner conclusions exert the strongest influence on judgment. The authors discuss design improvements and alternative response formats to better capture nuanced effects, aiming to guide the responsible integration of algorithmic evidence and statistics in courtroom settings.

Abstract

We investigate how the use of bullet comparison algorithms and demonstrative evidence may affect juror perceptions of reliability, credibility, and understanding of expert witnesses and presented evidence. The use of statistical methods in forensic science is motivated by a lack of scientific validity and error rate issues present in many forensic analysis methods. We explore what our study says about how this type of forensic evidence is perceived in the courtroom where individuals unfamiliar with advanced statistical methods are asked to evaluate results in order to assess guilt. In the course of our initial study, we found that individuals overwhelmingly provided high Likert scale ratings in reliability, credibility, and scientificity regardless of experimental condition. This discovery of scale compression - where responses are limited to a few values on a larger scale, despite experimental manipulations - limits statistical modeling but provides opportunities for new experimental manipulations which may improve future studies in this area.
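To make the notion of scale compression concrete, the following minimal sketch summarizes how much of a Likert scale a set of responses actually uses. It is an illustration only, not the authors' analysis: the 7-point scale, the function name `compression_summary`, and the sample ratings are all hypothetical assumptions, not data from the study.

```python
from collections import Counter


def compression_summary(responses, scale_points=7):
    """Summarize how concentrated Likert responses are.

    `scale_points` is an assumed scale length (hypothetical here);
    responses are integers from 1 to `scale_points`.
    """
    counts = Counter(responses)
    n = len(responses)
    # Number of distinct scale values that respondents actually used.
    distinct_used = len(counts)
    # Share of responses falling in the two highest scale values.
    top_two = counts[scale_points - 1] + counts[scale_points]
    return {
        "distinct_values_used": distinct_used,
        "share_in_top_two": top_two / n,
    }


# Made-up ratings for illustration only (not study data).
ratings = [7, 6, 7, 7, 6, 5, 7, 6, 7, 7]
summary = compression_summary(ratings)
print(summary)
```

When nearly all responses fall in the top one or two values, as in this made-up sample, there is little variation left for a statistical model to attribute to the experimental conditions.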

Paper Structure (19 sections, 11 figures, 6 tables)

Figures (11)

  • Figure 1: Bullet signatures for two lands. The left image indicates two matching lands, while the right image indicates two non-matching lands.
  • Figure 2: Comparison grids demonstrating various bullet comparisons. The top two images were from the same source, and were used as the test fire (left) and the algorithmic identification (right) in the sample testimony. The bottom two images are from different sources, and were used as the algorithmic elimination (left) and inconclusive (right) in the sample testimony.
  • Figure 3: Screenshot of the study description on the first page of the Shiny app.
  • Figure 4: Histogram of perceived reliability of the examiner's firearm comparison across all study conditions. There is evidence of significant scale compression across conditions, suggesting that in order to be able to measure and statistically model differences in perception of examiner reliability, the transcripts must contain more information which might cause participants to question the reliability of the examiner and of firearms comparisons.
  • Figure 5: Histogram of examiner credibility by conclusion, demonstrative evidence, and algorithm conditions. Inconclusive conclusions have slightly lower credibility (particularly in the absence of demonstrative evidence), but overall, the primary observation when considering this data is that there is significant scale compression.
  • ...and 6 more figures