Researchers claim new AI can detect depression from a person’s speech

31 Aug 2018

A team of MIT researchers is claiming that its latest AI can accurately predict depression in a person by processing raw text and speech data.

Artificial intelligence (AI) is expected to have a major impact on medical diagnosis in the years and decades to come. One example is DeepMind’s latest system, which can spot more than 50 eye diseases by searching through scanned images of a patient’s eyes.

One area where there has been less development is mental health diagnosis, possibly because of the sheer complexity of the human mind.

However, in a paper presented at a recent conference, MIT researchers revealed a neural network model that they claim can analyse raw text and audio data from interviews to discover speech patterns indicative of depression.

The team said that as a result, when presented with a new subject, the model can accurately predict whether that person is depressed without the person having to be asked any specific questions.

Picks up from natural conversation

The researchers hope that this system will allow AI developers to create tools to detect signs of depression in natural conversation.

The team suggested that, in the future, mobile apps could monitor a person’s text and voice conversations for signs of mental distress and send alerts.

Tuka Alhanai, first author of the research, said that it is through a person’s speech that we get our first hint of someone’s emotional state.

“If you want to deploy [depression-detection] models in a scalable way … you want to minimise the amount of constraints you have on the data you’re using,” she said.

“You want to deploy it in any regular conversation and have the model pick up, from the natural interaction, the state of the individual.”

To build the AI, the researchers worked from a dataset of 142 interactions from the Distress Analysis Interview Corpus, which contains audio, text and video interviews between patients with mental health issues and virtual agents controlled by humans.

More difficult to detect from audio

Of all the subjects in the dataset, 20pc were labelled as depressed. In testing, the AI correctly predicted whether a subject had been labelled as depressed 71pc of the time.

One of the key findings from the research, the team said, was that the AI needed considerably more data to detect depression from audio than from text. With text, it could detect depression within an average of seven question-answer sequences, compared with 30 when it analysed audio.

“That implies that the patterns in words people use that are predictive of depression happen in [a] shorter time span in text than in audio,” Alhanai said.

Looking to the future, aside from making the AI more accurate, the team said it wants to test it with other cognitive conditions such as dementia.