Stevie Chancellor, Eric P. S. Baumer, Munmun De Choudhury

The paper [1] examines Human-Centered Machine Learning (HCML) from a theoretical standpoint, setting out to question the role of the "human" in this relatively new domain. To narrow an otherwise broad scope, the authors focus on social media and its use in inferring people's mental health, two compelling and widely discussed areas. The study also considers how humans are represented in this context, the biases present and possible alternatives, and who benefits and who is harmed in the process. The authors describe in detail five ways in which they think humans are represented and comment on the relationships between these representations. They further touch upon how these representations scale and dwell on the consequences for different categories of stakeholders. To conclude, they emphasize the need for guidelines to prevent adverse implications, along with the need to keep "humans in the loop" and to always adopt a human-centered approach to ML in the future.

On reading the title of this paper, one is intrigued, and somewhat confused, by the rhetorical question in its first half. The authors answer it in their findings, where they categorize the representations of humans through various entities, including disorders, social media, scientific, ML, and person. Together, these are used to identify inconsistencies in current ML methods. The second half of the title, however, does give an idea of the topics the research focuses on. It also creates an expectation of a more detailed discussion of how social media and mental health interact with each other via the user. This aspect, I believe, was lacking in the study. The research addresses the interaction between humans and their mental health, and between people and their use of social media, and it stops there; no connection is established between the two. Moreover, the paper deals with mental disorders and their symptoms rather than mental health in general, and that should be reflected in the title.

The paper begins with a very detailed description of the choice of research field, the methodology, and so on, which establishes high expectations for the study: the authors have backgrounds in the domains under discussion, so there should be no bias towards either side of the conversation, psychological or machine learning, and they do try to maintain a balance between the two wherever possible. This highly textual and theoretical paper would have made its point more effectively had it discussed some examples of how social media posts are processed and how a person's mental health is judged from them. Such illustrations could also be used to draw a comparison between ML practices and HCML practices.

The research method followed in this paper was a literature review, paired with inductive coding of 55 papers published across multiple disciplines and conferences. The work is thorough, but as a study of and commentary on existing work it is not novel. A major limitation of the keyword-search-based approach the authors adopted is that it leaves out terms such as general wellbeing and sentiment analysis of content posted on social media. The approach to mental health in this study is limited to disorders and clinical patients; the rest of the population, who do not face any major illness, is not considered.

Further, not all posts on social media contain keywords that imply mental ill-health. The approach appears very binary: either there is an illness, or the person is perfectly healthy. Some text may point to depression, suicidal tendencies, or worse, yet not show up in the search. More commentary on the research methods of the papers studied could have addressed these two shortcomings.

The paper also glosses over the possible biases in the selected studies that may hamper their results, as well as biases in current ML practices that have adverse ethical or societal impacts. The former deserves more discussion, since HCML is a relatively new field and there are bound to be more mistakes in past practices. Since people are treated as mere data points whose data is 'extracted' and who are 'dehumanized', firm emphasis is laid on being cautious and having laws for any ML practice that involves or impacts humans, which raises the question of accountability and of who is responsible. This study should be extended to examine the points mentioned above for a more holistic view.

Reference

  1. Stevie Chancellor, Eric P. S. Baumer, and Munmun De Choudhury. 2019. Who is the “Human” in Human-Centered Machine Learning: The Case of Predicting Mental Health from Social Media. Proceedings of the ACM on Human-Computer Interaction 3, CSCW: 147:1–147:32. https://doi.org/10.1145/3359249