Health
Episode 5 - AI & Facebook to screen for depression
In Episode 5 of The Intersection podcast, hosts Maxime and Jean explore a groundbreaking study on using Facebook data to screen for depression. They discuss the challenges of diagnosing major depressi...
Interactive Transcript
spk_0
Welcome to The Intersection, the podcast about artificial intelligence and healthcare.
spk_0
Hi everybody, this is Maxime.
spk_0
Thank you for tuning in today to episode five of The Intersection podcast.
spk_0
We are discussing a PNAS paper titled "Facebook language predicts depression in medical
spk_0
records."
spk_0
So I assume everybody knows about Facebook.
spk_0
So let's ask Jean, here with me today, to tell us a little bit about depression before
spk_0
talking about the machine learning that has been done and the results.
spk_0
Yeah, so hi Maxime.
spk_0
Well, what people call depression is actually called major depressive disorder in medicine.
spk_0
And it's actually a pretty common and serious medical illness that can negatively affect
spk_0
how you feel, the way you think, and how you act.
spk_0
So how is depression typically diagnosed today?
spk_0
What are the symptoms?
spk_0
Well, you have to look at a few symptoms and the diagnosis relies on the combination
spk_0
of symptoms, right?
spk_0
Having one of them does not mean that you are depressed; you need to have several
spk_0
of them, and over a sustained period of time.
spk_0
I see.
spk_0
And how is it assessed that I present those symptoms?
spk_0
Is that through medical records, or is it through talking with a therapist and
spk_0
with a medical doctor?
spk_0
Well, there are scales, but basically the first step would be to talk with your doctor,
spk_0
like your PCP, and then with a psychiatrist, for example.
spk_0
But let's focus on symptoms first.
spk_0
Okay.
spk_0
So the symptoms include feeling sad or having a depressed mood.
spk_0
Having a loss of interest or pleasure in activities that you used to enjoy.
spk_0
Changes in appetite, like weight loss or weight gain unrelated to dieting. Trouble sleeping,
spk_0
or even sleeping too much.
spk_0
A loss of energy or increased fatigue.
spk_0
An increase in purposeless physical activity, for example pacing, or slowed movements and
spk_0
speech.
spk_0
So this is typically reported by other people, not yourself.
spk_0
Feeling worthless or guilty.
spk_0
Having difficulty thinking and, even, in the most extreme cases, having thoughts of
spk_0
death or suicide.
spk_0
So it's a whole spectrum of things.
spk_0
It's not only one of those symptoms.
spk_0
I see.
spk_0
And I can already see that it might be quite difficult to actually screen for
spk_0
depression, because most of the symptoms that you just described, even I, who
spk_0
do not consider myself depressed right now, experience on an almost weekly basis.
spk_0
I assume the screening is quite challenging.
spk_0
Exactly.
spk_0
That's why it's important to keep in mind the concept of having several of those and
spk_0
also having several of those for at least two weeks in a row.
spk_0
So everyone, each of us, has some down and up times.
spk_0
So it's important that if these downs keep coming back and keep getting longer and longer,
spk_0
then you might need to seek help.
spk_0
I see.
spk_0
And also, it means you have to go to the therapist, which is quite an important step, right?
spk_0
And then you have to provide an accurate assessment of whether you have
spk_0
experienced those symptoms in the past two weeks, right?
spk_0
Which is not something that is necessarily trivial either because those are quite qualitative
spk_0
symptoms.
spk_0
Yeah, absolutely.
spk_0
And you should also keep in mind that some medical conditions, typically thyroid problems,
spk_0
a brain tumor, or a vitamin deficiency, could actually mimic the symptoms of depression.
spk_0
So before you actually diagnose depression, you really need to rule out any other medical
spk_0
condition that could mimic that, right?
spk_0
So it's not quite a default diagnosis, but it's not the first diagnosis that you
spk_0
should make.
spk_0
Thank you.
spk_0
And as for depression, I guess, what is the scope of the disease?
spk_0
Well, it's actually a disease, right?
spk_0
That's how it's called.
spk_0
So it's a disease with very serious consequences and it's actually pretty frequent.
spk_0
So for example, one in six people, so that's basically 16 to 17%, will experience depression
spk_0
at some point in their life.
spk_0
Okay, so one in six people basically means that if I consider a family, let's say a family
spk_0
unit of six people, in every family unit there is on average one person who is affected by
spk_0
that disorder.
spk_0
So it's a pretty, pretty important topic I assume.
spk_0
Right, and you should also remember that depression can strike at any time of your life,
spk_0
but on average, the first episodes appear during the late teens to the mid-20s.
spk_0
And unfortunately, women are also more likely than men to experience depression.
spk_0
I see.
spk_0
And so what can you do when you have depression?
spk_0
How important is it to be screened?
spk_0
And then, I guess, to be diagnosed, in terms of the outcome, right?
spk_0
Right, because depression is actually a very treatable illness, it's actually very important
spk_0
to get screened.
spk_0
Because once you get screened and diagnosed, then you can get the appropriate treatment,
spk_0
right?
spk_0
So treatments usually include psychotherapy, medication, or in the most extreme cases,
spk_0
electroconvulsive therapy.
spk_0
But there are also a number of things that people can do to help reduce the symptoms of
spk_0
depression.
spk_0
For example, for many people, regular exercise helps create positive feelings and improves
spk_0
mood. Getting enough quality sleep on a regular basis, eating a healthy diet, avoiding alcohol:
spk_0
all of this can help a lot in that regard.
spk_0
And machine learning is actually also used with all of these activities that you mentioned,
spk_0
right?
spk_0
Machine learning is used to help track the amount of calories in your diet, machine learning
spk_0
is used to make better sleep recommendations, and so on.
spk_0
That's not the topic of today's article, but it's interesting to see, right, the
spk_0
extent of the impact that machine learning can have on that particular problem.
spk_0
So let's focus on the scope of that article, which is basically screening.
spk_0
And it's extremely important in the case of depression because most people with depression
spk_0
do not get the care they need, even though treatment exists.
spk_0
For example, only 20% of people treated for depression have received what the National
spk_0
Institute of Mental Health defines as minimally adequate treatment.
spk_0
So that leaves 80% of people not getting the right treatment.
spk_0
So you mean four out of five patients do not get the right treatment?
spk_0
Because they are underdiagnosed, or because they simply do not get, even if they are diagnosed,
spk_0
the right treatment.
spk_0
Hence the importance of screening, better diagnoses, and better treatments.
spk_0
That makes sense.
spk_0
And this is actually exactly what this article is trying to tackle, right?
spk_0
Everybody, or I guess most people, use Facebook.
spk_0
People are very active on Facebook.
spk_0
People post, and through their posts they reveal things about the way they feel.
spk_0
They reveal things about their friends and so on.
spk_0
And so in fact, what this article is trying to do is to build a somewhat non-invasive
spk_0
or effortless prediction model of depression using content from the Facebook profile of a given
spk_0
individual.
spk_0
So let me say a bit more about the inputs that this prediction model takes, right?
spk_0
So there are five inputs.
spk_0
One is the textual content of the Facebook posts.
spk_0
So what do you write in your posts?
spk_0
The second one is how long your posts are.
spk_0
The third one is how frequently you post.
spk_0
The fourth one is whether there are particular patterns in when you post, right?
spk_0
More at night, more in the day and so on.
spk_0
And then the last one, which doesn't really have to do with Facebook but is an additional
spk_0
kind of input, is demographics, right?
spk_0
So some variables about your background and so on.
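To make the five inputs listed above concrete, here is a minimal sketch of how one user's posts could be summarized. It is an illustration only, not the paper's pipeline: the `Post` class, the `build_inputs` function, the field names, and the 22:00 to 05:59 "night" window are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    text: str
    created_at: datetime


def build_inputs(posts, age, is_female):
    """Summarize one user's posts into the five kinds of input (hypothetical layout)."""
    if not posts:
        raise ValueError("need at least one post")
    times = [p.created_at for p in posts]
    # Whole days between first and last post, at least 1 to avoid dividing by zero.
    span_days = max((max(times) - min(times)).days, 1)
    # "Night" is an assumed window: 22:00 to 05:59.
    night_posts = sum(1 for t in times if t.hour >= 22 or t.hour < 6)
    word_counts = [len(p.text.split()) for p in posts]
    return {
        "text": " ".join(p.text for p in posts),               # 1. what you write
        "mean_words_per_post": sum(word_counts) / len(posts),  # 2. how long posts are
        "posts_per_day": len(posts) / span_days,               # 3. how often you post
        "night_fraction": night_posts / len(posts),            # 4. when you post
        "age": age,                                            # 5. demographics
        "is_female": is_female,
    }


posts = [
    Post("I miss you", datetime(2020, 1, 1, 2, 0)),
    Post("Great day at the park today", datetime(2020, 1, 5, 14, 0)),
]
features = build_inputs(posts, age=30, is_female=True)
print(features)
```

The raw text is kept as a single string here because the language input needs a further step (such as topic extraction) before a model can use it.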
spk_0
Okay, and so now we know the inputs.
spk_0
What is the actual output of the model?
spk_0
So this model outputs a probability, between 0 and 1, that an individual is experiencing
spk_0
depression, based on these inputs.
spk_0
And the way that this model is trained, right,
spk_0
since we have inputs and we have outputs, is using supervised learning.
spk_0
So in other words, the parameters of the model are adjusted to maximize the accuracy of
spk_0
the model prediction on a cohort of patients for whom the presence or absence of depression
spk_0
has been previously established and is available in their medical records.
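As a rough illustration of the supervised training just described, the sketch below fits a one-feature logistic regression by gradient descent. The toy data, learning rate, and function names are assumptions, and the fit minimizes log loss (the usual criterion for logistic regression) rather than accuracy directly; the episode does not say how the authors fitted their model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability between 0 and 1 that the label is 1 for feature vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(X, y, lr=0.5, epochs=2000):
    """Adjust the parameters to reduce the average log loss on labeled examples."""
    weights, bias = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, label in zip(X, y):
            error = predict_proba(weights, bias, x) - label
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy cohort: one made-up feature per person, label 1 = diagnosis on record.
X = [[0.0], [0.2], [0.8], [1.0]]
y = [0, 0, 1, 1]
weights, bias = train(X, y)
p_low = predict_proba(weights, bias, [0.0])
p_high = predict_proba(weights, bias, [1.0])
print(p_low, p_high)
```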
spk_0
Okay, and let me just maybe take that opportunity to ask: what is the difference between
spk_0
supervised and unsupervised learning?
spk_0
Yes, so that's an important point.
spk_0
So in supervised learning, we are basically training a model on a data set for which we know
spk_0
the input and we know the output and we are basically trying to learn a function that maps,
spk_0
let's say, X to Y: input X, output Y.
spk_0
Right.
spk_0
In unsupervised learning, the simple definition would be to say that we are trying to learn
spk_0
a mapping from X to Y, but we don't know the ground truth Y for the training samples X.
spk_0
So in unsupervised learning, we are basically trying to find patterns that define a population
spk_0
or any kind of outcome without actually knowing the outcome.
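The contrast can be sketched on the same toy numbers: the supervised function below is given labels, while the unsupervised one has to find two groups on its own. Both functions and the data are made up for illustration; the unsupervised part is a plain one-dimensional k-means with two clusters, chosen here only because it is simple.

```python
def fit_supervised(xs, ys):
    """Supervised: the labels ys say which group each x belongs to."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def fit_unsupervised(xs, n_iter=10):
    """Unsupervised: 1-D k-means with k=2 looks for two groups without labels."""
    low, high = min(xs), max(xs)  # start the two centroids at the extremes
    for _ in range(n_iter):
        near_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        near_high = [x for x in xs if abs(x - low) > abs(x - high)]
        if near_low:
            low = sum(near_low) / len(near_low)
        if near_high:
            high = sum(near_high) / len(near_high)
    return low, high


xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
ys = [0, 0, 0, 1, 1, 1]  # only the supervised fit gets to see these
supervised = fit_supervised(xs, ys)
clusters = fit_unsupervised(xs)
print(supervised, clusters)
```

On this data both approaches find the same two group centers, but only the supervised one knows which group corresponds to which label.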
spk_0
Exactly.
spk_0
And one of the things that makes unsupervised learning quite important and in fact, even some
spk_0
pioneers in AI say that it might be the future is that you don't need to have labels.
spk_0
Right.
spk_0
So you can gather much bigger data sets.
spk_0
Yeah, and as we already discussed, labels, especially in healthcare, are extremely costly,
spk_0
extremely rare.
spk_0
So being able to do prediction without any label would be a huge, huge step up.
spk_0
Exactly.
spk_0
And this is actually to come back to the article.
spk_0
That's a nice example of this because this article extracts the ground truth label from
spk_0
the medical records of the patients, which means that for every patient in
spk_0
the training set, you have to have access to the medical records, you have
spk_0
to extract that label, you have to verify that the label is accurate, and you also have
spk_0
a lot of other logistical issues with this.
spk_0
Right.
spk_0
And this is actually the main strength of the article, because we know that
spk_0
there were already other studies published about depression prediction from Facebook.
spk_0
But these previous studies were based more on self-reported labels from patients.
spk_0
So patients saying, I am depressed, I am not depressed, which is very different from actually
spk_0
using the ground truth from a real medical record.
spk_0
That's right.
spk_0
So yeah, in fact, this article uses an in-house data set from one academic center in the
spk_0
US that has about 700 patients.
spk_0
And in that group of 700 patients, there are about 100 patients who have had a diagnosis
spk_0
of depression in their medical records.
spk_0
So again, like we discussed during the last episode, a very significant class imbalance
spk_0
in that dataset.
spk_0
Definitely.
spk_0
Here you have an imbalance where, basically, out of every seven patients,
spk_0
only one has a positive label, in that sense.
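A short calculation shows why this imbalance matters. Using the rounded figures from the episode (about 700 patients, about 100 with a diagnosis), a model that always answers "not depressed" would be right about 86% of the time while finding none of the depressed patients. The variable names are illustrative.

```python
n_patients = 700   # "about 700" in the episode
n_depressed = 100  # "about 100" with a diagnosis on record

prevalence = n_depressed / n_patients

# A useless model that always answers "not depressed".
true_negatives = n_patients - n_depressed
accuracy = true_negatives / n_patients
recall = 0 / n_depressed  # it finds none of the depressed patients

print(f"prevalence: {prevalence:.1%}")
print(f"always-negative accuracy: {accuracy:.1%}, recall: {recall:.0%}")
```

This is why accuracy alone is a poor yardstick on such a data set.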
spk_0
Okay.
spk_0
So maybe you can tell us about the performance of that model. Is it good?
spk_0
Yes.
spk_0
So the performance, the performance is all right.
spk_0
I think that the goal of this paper is not so much to establish an amazing performance
spk_0
in the sense that the performance actually is of the same order as the performance of
spk_0
the algorithms that have been reported previously.
spk_0
What I found most interesting in this article is actually the assessment of which
spk_0
inputs are the most important when it comes to predicting depression.
spk_0
Right.
spk_0
And in that regard, again, we are getting at the interpretability of models.
spk_0
Knowing which features are important for the prediction.
spk_0
Exactly.
spk_0
And it turns out that the model that is used in that study is in fact quite simple.
spk_0
It's a logistic regression model in a few words.
spk_0
And the fact that this model is simple actually simplifies its interpretation.
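One reason a logistic regression is easy to read is that each coefficient converts directly into an odds ratio. The sketch below uses invented coefficients (none of these numbers come from the paper) and assumes the features were put on a comparable scale beforehand, since otherwise the magnitudes cannot be compared.

```python
import math

# Hypothetical coefficients, invented for illustration only.
coefficients = {
    "topic_loneliness": 0.40,
    "posts_per_day": 0.05,
    "mean_words_per_post": -0.10,
}

# exp(coefficient) is the factor by which the odds of the positive class
# change when that feature increases by one unit, all else being equal.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

# Rank features by the size of their effect, regardless of sign.
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]), reverse=True)

for name in ranked:
    print(f"{name}: odds ratio {odds_ratios[name]:.2f}")
```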
spk_0
And so the main finding of this study is that the feature that by far is the most relevant
spk_0
in predicting depression is language, the Facebook language that is used in the posts
spk_0
of the patient.
spk_0
And I can tell you a little bit about how the authors actually have assessed this.
spk_0
Right.
spk_0
So they have taken, for example, one Facebook post from one individual in the data set.
spk_0
And from that Facebook post, they extract language topics using the latent Dirichlet allocation
spk_0
method.
spk_0
They extract language topics that this Facebook post basically exhibits.
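For readers who have not seen latent Dirichlet allocation before, here is a toy version using collapsed Gibbs sampling in plain Python. It is a sketch under stated assumptions: the example posts, the number of topics, and the hyperparameters `alpha` and `beta` are invented, and the authors' actual implementation and settings are not described in the episode.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        topics = [rng.randrange(n_topics) for _ in doc]
        assignments.append(topics)
        for word, k in zip(doc, topics):
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v, k = word_id[word], assignments[d][i]
                # Remove the current assignment, then resample it.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed share of each topic in each document; every row sums to 1.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


posts = [
    "tears cry alone miss tears".split(),
    "hospital pain sick hurt hospital".split(),
    "cry miss alone tears".split(),
    "pain sick hospital hurt".split(),
]
topic_shares = lda_gibbs(posts, n_topics=2)
for shares in topic_shares:
    print([round(s, 2) for s in shares])
```

Each post ends up described by its share of every topic, and those shares are the kind of numeric language feature that a downstream model such as a logistic regression can consume.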
spk_0
Right.
spk_0
And just for the sake of discussion, you mentioned that the algorithm is basically logistic
spk_0
regression.
spk_0
Yes.
spk_0
Is that actually machine learning?
spk_0
Can we discuss that?
spk_0
Yeah, that's a tricky question.
spk_0
I think it depends really on what is the definition of machine learning.
spk_0
But if you think of even some of the most complicated networks today, deep neural networks.
spk_0
For example, let's say deep convolutional neural networks in computer vision, the building
spk_0
block of those networks.
spk_0
Right.
spk_0
When you look at a single neuron, this is essentially a logistic regression.
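The claim can be checked in a few lines: a single neuron with a sigmoid activation computes exactly the probability that a logistic regression produces. The numbers and function names below are illustrative assumptions, and the equivalence holds for the sigmoid activation specifically; many modern networks use other activations, such as ReLU, in their hidden layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """One artificial neuron: weighted sum of the inputs, then an activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def logistic_regression_proba(features, coefficients, intercept):
    """Logistic regression: linear log-odds passed through the logistic function."""
    log_odds = intercept + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


x = [0.5, -1.0, 2.0]
w = [0.8, 0.3, -0.2]
b = 0.1

neuron_output = neuron(x, w, b, sigmoid)
regression_output = logistic_regression_proba(x, w, b)
print(neuron_output, regression_output)
```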
spk_0
Exactly.
spk_0
So I would say that the answer to that question is not necessarily that important.
spk_0
It's a matter of language.
spk_0
But my answer will be that I think it's okay to call this machine learning.
spk_0
Okay.
spk_0
And maybe we can also stress the fact that logistic regression is much easier to understand
spk_0
because it is less complex than a deep neural network.
spk_0
So that is in that particular setting, an advantage of that technique.
spk_0
Exactly.
spk_0
And actually, to come back to my answer, I think that rather than trying
spk_0
to say that logistic regression is not machine learning, and that machine learning is something
spk_0
else that has emerged in the past decade,
spk_0
it's actually a better answer to say that it is machine learning.
spk_0
And to recognize that machine learning has been around for actually a long time.
spk_0
The only thing that really is evolving and changing is the complexity of the model
spk_0
architecture that we use.
spk_0
Right.
spk_0
So just to give a little bit of context, there is some kind of war being waged between statisticians
spk_0
and machine learning scientists.
spk_0
But what you're saying, basically, is that we need to go a little bit beyond that and focus
spk_0
on the interest and the importance of the question that is asked in a given study.
spk_0
I would agree with that.
spk_0
And so to come back to this study, so a simple, as we said, a simple logistic regression model
spk_0
that takes a few different inputs to predict the probability of experiencing depression.
spk_0
The main finding of this article is that the most important feature is the language in your
spk_0
Facebook posts.
spk_0
And it turns out that the authors are able to extract the topics that
spk_0
are found to be most strongly associated with a future depression diagnosis.
spk_0
And those in fact are pretty common sense.
spk_0
So for example, there is depressed mood and feelings.
spk_0
So words like "tears" and "cry." The topic of loneliness,
spk_0
if you write, for example, the word "miss" in your posts. Topics of hostility, with a word
spk_0
like "hate." Somatic complaints,
spk_0
if you write about "hurt" and "sick," and any words that also refer to medicine, like "hospital"
spk_0
and "pain."
spk_0
Okay, great.
spk_0
Anything else to add?
spk_0
Yeah, I think maybe we can close by reminding everyone that this is an important medical
spk_0
issue.
spk_0
Depression is a very debilitating illness that affects too many people in the US and worldwide.
spk_0
And given the number of Facebook users around the world, the potential of screening
spk_0
individuals through their Facebook activity is a very appealing prospect that we hope to
spk_0
see fulfill its promise in the future.
spk_0
Yeah, I really, really agree with you.
spk_0
But just to give the other side of the coin here, we also know, and there have been studies
spk_0
published about that,
spk_0
that Facebook users are more prone to depression than other people.
spk_0
So we need also to keep that in mind and it could also be a potential bias of that study.
spk_0
But still, I agree with you.
spk_0
Yeah, you mean, yeah, if you step back even more and look at the net effect, you mean
spk_0
of having Facebook?
spk_0
This goes beyond the scope of this study and definitely would be worth investigating.
spk_0
What I take away from this article personally is the potential not only of screening
spk_0
for depression, you know, through Facebook activity, but also the potential of reacting to
spk_0
the prediction that will be made by such a model.
spk_0
Yeah, actionable outcome.
spk_0
Exactly.
spk_0
For example, you know, if this model were to find out that a Facebook user is depressed
spk_0
and maybe, in the worst case, you know, is depressed with thoughts of death and suicide,
spk_0
there could be a reaction from Facebook, for example, by suggesting positive
spk_0
messages or ads for mental health resources and so on.
spk_0
The second thing that I take away is that data sharing is often seen as harmful.
spk_0
There are a lot of articles in the media about the dangers of data sharing, but really what
spk_0
this article shows is that if it's not used maliciously, it also has a real positive potential
spk_0
when used to tackle issues related to human health.
spk_0
Yes, exactly.
spk_0
What we need to keep in mind is that not everything is black or white.
spk_0
It's a balance.
spk_0
And it's also the responsibility of the people who own that data and we are talking about
spk_0
Facebook, but we could also talk about medical data from hospitals.
spk_0
It is the responsibility of the people owning that data to actually do something with it for
spk_0
the better, for the greater good of everyone.
spk_0
So it's not as easy as do not use that data or use that data, but we need to try to think
spk_0
about how can we use that data in the best way possible.
spk_0
I see we are entering a new age and somehow we are not yet ready for it, in the sense that
spk_0
we don't yet have the right laws, or maybe even the right attitude, the right way of thinking about
spk_0
what to do with this data, what is right, what is not, and how to basically handle this.
spk_0
Right.
spk_0
Okay.
spk_0
So that's it for today.
spk_0
We hope you liked it.
spk_0
If you enjoyed it, subscribe and we'll see you next time.
spk_0
Bye bye.
spk_0
Thank you.
spk_0
See you soon.
Topics Covered
artificial intelligence
healthcare
depression
major depressive disorder
machine learning
Facebook language
mental health screening
psychotherapy
electroconvulsive therapy
symptoms of depression
diagnosis of depression
treatment for depression
supervised learning
unsupervised learning
predictive modeling