First RADAR-CNS paper on acceptability of speech recording presented at major speech conference

The peer-reviewed study showed that RADAR-CNS participants were more comfortable completing a task that involved reading a script than a task that involved free speech, and that individuals with more severe depression felt less comfortable recording their speech.

The use of smartphones to record people’s speech is gaining interest as a form of remote measurement technology to collect valuable data that can be used in research and to inform healthcare.

Major depressive disorder (MDD) is a mood disorder that causes a persistent feeling of sadness and loss of interest and each year affects about 7% of people in Europe. Symptoms can fluctuate, and continuous monitoring could help patients and healthcare workers to better understand an individual’s experience and identify possible patterns and risk factors for the worsening of symptoms.

Surveying acceptability

The study, which was presented at the conference by first author Dr Jude Dineley (view recording), used a survey to explore how comfortable people with MDD in Spain and the UK felt about completing two different speech tasks on a smartphone app. The first task asked participants to read passages of text out loud, and the second task prompted participants to talk about something they may be looking forward to in the following week. From these tasks, researchers can detect changes in various features of an individual’s speech, such as pitch and pause rate, which could potentially provide insight into their current mental state.

Survey links were sent to 384 participants and 209 people responded. The main aim of the survey was to assess the acceptance of smartphone-based speech recording, along with the facilitators of and barriers to it, in order to evaluate its feasibility as a remote monitoring technology for people with MDD and to provide insight into how to refine the approach.

Study findings

The analysis showed that, overall, participants were more comfortable completing the scripted speech task than the free speech task. For both speech tasks, researchers found depression severity to be a significant predictor of discomfort in completing the task. This could be because increased fatigue and impaired thought processes in those with more severe depression make the task more challenging. The country where the participant lived also predicted level of comfort, with participants from Spain expressing more discomfort than those from the UK.

In total, 104 (50%) of survey responders reported encountering at least one barrier that prevented them from completing speech recordings on at least one occasion. The most commonly reported barriers were not seeing the app notifications for the speech tasks, low mood and forgetfulness.

Dr Jude Dineley, first author from the Chair of Embedded Intelligence for Health Care and Wellbeing, at the University of Augsburg said: “Our study is the first detailed look at the acceptability of smartphone-based speech recording amongst those with a formal diagnosis of major depressive disorder. The results provide an important insight into how well participants engage with the approach and what could be done to improve their experience. Ultimately, we want to improve design of future research and the quality of data we get from it.

“The next steps are to look deeper into these initial findings and expand our analysis. We now also have survey responses from our MDD participants in the Netherlands and from our three MS study sites. This extra data is really valuable, as we can get a better insight into how acceptability compares in different countries and disorders, as well as for different levels of disorder severity.”

Dr Nicholas Cummins, Lecturer in AI for speech analysis from the Institute of Psychiatry, Psychology & Neuroscience, King’s College London said: “Understanding participant acceptance of recording speech samples is often overlooked in speech research. Taking these factors into account when designing collection protocols will help us gather larger amounts of high-quality data. The findings from this study, as well as ones from our expanded analysis, will therefore be really important in guiding how we design future speech m-Health studies.”

The Interspeech conference is the world’s largest technical conference focused on speech processing and application. This year its format is hybrid, taking place online and physically in Brno in the Czech Republic. The peer-reviewed study was presented on Tuesday 31st August from 13:30-15:30 CEST (12:30-14:30 BST) in the virtual Diverse Modes of Speech Acquisition and Processing session and will be published in the proceedings of Interspeech 2021.

Dineley, J. et al. (2021) Preprint. Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder

This article was originally posted on: https://www.radar-cns.org/newsroom/first-radar-cns-paper-acceptability-speech-recording-presented-major-speech-conference