When children are victims of crimes, the interviews in which they provide legal testimony are known as forensic interviews. However, since victims are often traumatized, and potentially abused by their caregivers, they can be reluctant to come forward with accusations or disclose relevant information.
As such, a protocol has been developed to carefully extract as much relevant information about a crime as possible. Yet, what if artificial intelligence could be a useful tool to help young victims tell their stories? What if AI could support interviewers with tools to help gather information in an appropriate manner?
This is the topic of a paper presented at the 2018 ACM International Conference on Multimodal Interaction, held recently in Boulder, Colorado.
The paper was presented by Victor Ardulov and Manojkumar Prabakaran Abitha, doctoral students in the USC Viterbi School of Engineering’s Signal Analysis and Interpretation Laboratory (SAIL), along with SAIL founder Shri Narayanan. It documents a multidisciplinary effort, conducted with USC Gould School of Law professor and child witness expert Thomas D. Lyon and his team, to determine if and how computer-aided tools can accurately assess the productivity of forensic interviews. In addition, the paper documents how the researchers attempted to identify potential linguistic and paralinguistic influences, such as emotions, in the interview process.
Ardulov, who is the lead author of the paper presented at the recent ACM conference, said the purpose of the study was to garner feedback about how children tend to reply based on subtle variations in questioning.
The challenge for forensic interviewers is asking the right questions, in the right manner, at the right time, in order to ensure that victims are forthcoming with relevant and unbiased information about perpetrated crimes. This is particularly important when a child might be the sole witness to a crime. The key is to maximize productivity without re-traumatizing the child or coercing inaccurate testimony.
Scholars such as Lyon, who established USC’s Gould Child Interviewing Lab, are aware of how the rapport built between interviewer and interviewee, the tone in which questions are asked, pauses and even question order can impact how much meaningful information is shared. However, this is believed to be the first attempt to develop and apply custom software to automatically detect and categorize speech patterns in the course of forensic interviews.
For over two decades, Narayanan has been developing technologies for understanding the speech and language of children, as well as award-winning AI-based conversational interfaces for children. He says, “… linguistically-informed data science and computational techniques offer a rich set of tools for helping understand not only what a child is trying to communicate but their emotional and cognitive state of being. These are the technologies our [SAIL] lab at USC is trying to develop with our collaborators.”
Viterbi’s Narayanan met Gould’s Lyon about a decade ago at a multidisciplinary collaborative workshop among USC professors. The two only started to work on this project about a year and a half ago, with Narayanan’s doctoral candidates Ardulov and Manoj Kumar taking the lead on finding ways to quantify particular factors in speech that could affect the output of the interview. These include the frequency or length of an interviewer’s pauses, the time allotted for a child to respond, and the extent to which the pace of the interviewer’s speech mirrors the speech of the child being interviewed.
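To make these factors concrete, here is a minimal sketch of how two of them might be computed from a time-stamped transcript. It is an illustration only, not the SAIL researchers' software: the Turn record, the speaker labels, the function names and the toy timings are all hypothetical, and a Pearson correlation of speaking rates is just one simple stand-in for "mirroring."

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Turn:
    """One speaker turn in a transcript (hypothetical structure)."""
    speaker: str   # "interviewer" or "child" (assumed labels)
    start: float   # turn start time, in seconds
    end: float     # turn end time, in seconds
    words: int     # number of words spoken in the turn

    @property
    def rate(self) -> float:
        """Speaking pace in words per second."""
        return self.words / (self.end - self.start)


def response_latencies(turns):
    """Silence between the end of an interviewer turn and the child's reply."""
    return [
        nxt.start - cur.end
        for cur, nxt in zip(turns, turns[1:])
        if cur.speaker == "interviewer" and nxt.speaker == "child"
    ]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def pace_mirroring(turns):
    """Correlate each child turn's pace with the pace of the interviewer turn that follows it."""
    pairs = [
        (cur.rate, nxt.rate)
        for cur, nxt in zip(turns, turns[1:])
        if cur.speaker == "child" and nxt.speaker == "interviewer"
    ]
    child_rates, interviewer_rates = zip(*pairs)
    return pearson(child_rates, interviewer_rates)


# Invented toy timings, for illustration only (not data from the study).
turns = [
    Turn("interviewer", 0.0, 4.0, 12),
    Turn("child", 5.5, 9.5, 8),
    Turn("interviewer", 10.0, 14.0, 10),
    Turn("child", 14.5, 18.5, 12),
    Turn("interviewer", 19.0, 23.0, 14),
    Turn("child", 25.0, 27.0, 2),
    Turn("interviewer", 28.0, 32.0, 6),
]

print(response_latencies(turns))  # seconds of silence before each child reply
print(pace_mirroring(turns))      # closer to 1.0 means the interviewer tracks the child's pace
```

A real system would first need speech recognition and speaker labeling to produce such turns, and pauses inside a single turn would require word-level timestamps, which this sketch does not model.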
Lyon became interested in Narayanan’s work with the expectation that “technology can pick up on subtleties of an interview, qualities that are harder to pick up and count.”
Findings of the Presented Paper
Two hundred anonymized forensic interviews that Lyon collected from child abuse cases were transcribed from audio files and then coded for a variety of dimensions. The researchers from the SAIL Lab, which has previously developed tools to automatically analyze speech (such as who spoke and for how long) and rich behavioral aspects (such as emotions), as well as how people interact with one another, developed custom models for each interview. Once this was done, the researchers looked for patterns in the interviews and in the interaction between the interviewer and the interviewee.
In general, the researchers’ findings are consistent with previous studies in the field of legal psychology. Interviews are normally conducted in two phases: a rapport-building phase unrelated to the crime or abuse, and then a second phase focused on the alleged abuse. In this study, the way that children in these interviews responded was highly correlated with their age. For younger children, the emotional content of the interviewer’s words had an impact on how much information they were willing to share during the interview. Older children were more influenced by the way interviewers vocalized their words (the pitch and loudness).
The hope is that a computer aid for interviews could take various forms. It could serve as a means to train forensic interviewers, either as a virtual assistant that informs interviewers during an interaction or as a simulated child interview.
Both of these approaches hinge on the availability of large datasets of question-and-answer interactions and rigorous mathematical models of how children respond to, and are influenced by, interviewer inputs. It is akin, Ardulov says, to how Google autocompletes the phrases you input and offers suggestions based on the huge number of historical inputs.
Lyon imagines that these models could be great tools for those who work as child advocates. “It could provide additional information to structure and refine protocols,” he says.
Lyon says, “Imagine an automated transcription of an interview whereby an interviewer holding an iPad gets the highlighted words or phrases that might inform his/her next question and guide the interview.”
He adds that this would be a way for interviewers not to have to use notes, and the software itself could point out possible contradictions and inconsistencies.
In order to do this, the next phase of research would be to create more sophisticated models whereby the researchers look at specific interactions, or particular sequences of questions to understand what yields the most relevant information from a child.