Can we trust the Likelihood Ratio obtained with a black-box model?
A score-based likelihood ratio is a central statistical tool for evaluating the strength of evidence in forensic analysis. By comparing the observed data with the outcomes expected under competing hypotheses, typically the prosecution (accusatory) and defense hypotheses, it provides a quantitative assessment of how strongly the evidence supports each scenario.
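As a minimal illustration (the notation below is ours, not taken from the original text), the score-based likelihood ratio compares the probability density of the comparison score $s$ under the two hypotheses:

\[
\mathrm{LR}(s) \;=\; \frac{f(s \mid H_p)}{f(s \mid H_d)},
\]

where $H_p$ and $H_d$ denote the prosecution and defense hypotheses, and $f(\cdot \mid H)$ is the distribution of scores estimated under hypothesis $H$. Values above 1 lend support to $H_p$, values below 1 to $H_d$.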
In forensic analysis, the likelihood ratio can be derived from scores generated by deep-learning systems, which, being black-box models, lack transparency about their decision-making process. Even when statistical calibration techniques are applied to refine these scores, the inherent opacity of black-box systems presents a challenge: we cannot determine precisely which aspects of the data drive the likelihood estimates. Consequently, explaining the rationale behind the reported results to a judge becomes problematic.
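To make the calibration step concrete, the sketch below shows one common approach, logistic-regression calibration, for mapping raw black-box comparison scores to log-likelihood-ratios. The data, variable names, and the choice of logistic regression are illustrative assumptions, not a description of any specific system discussed above.

```python
# Sketch: logistic-regression calibration of black-box comparison scores
# into log-likelihood-ratios. Data and parameters are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical scores from a black-box comparison system:
# same-source pairs (H_p true) tend to score higher than
# different-source pairs (H_d true).
scores_same = rng.normal(loc=2.0, scale=1.0, size=500)
scores_diff = rng.normal(loc=-1.0, scale=1.0, size=500)

X = np.concatenate([scores_same, scores_diff]).reshape(-1, 1)
y = np.concatenate([np.ones_like(scores_same), np.zeros_like(scores_diff)])

# Fit a logistic model for P(H_p | score); its posterior log-odds are an
# affine function of the score: a + b * s.
clf = LogisticRegression().fit(X, y)
a, b = clf.intercept_[0], clf.coef_[0, 0]

# Posterior log-odds = prior log-odds + log LR, so subtract the prior
# log-odds implied by the training-set proportions to isolate the log LR.
prior_log_odds = np.log(len(scores_same) / len(scores_diff))  # = 0 here

def log_lr(score: float) -> float:
    """Calibrated natural-log likelihood ratio for a new comparison score."""
    return a + b * score - prior_log_odds

new_score = 1.5
print(f"log LR = {log_lr(new_score):.2f}, LR = {np.exp(log_lr(new_score)):.2f}")
```

Note that such calibration only rescales the scores into interpretable likelihood ratios; it does not reveal which features of the underlying data led the black-box model to produce a given score, which is precisely the explainability gap raised above.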