
Bully Watch AI

Project Description
Designing AI Explanations for Social Media Toxic Content Moderation
Group Members
Houjiang Liu, Didi Zhou, Alex Boltz, Xueqing Wang
Instructor
Min Kyung Lee
Timeline
Jan 1 to May 8, 2022
Report
Problem Space
Social media platforms rely heavily on AI to remove or suppress potentially toxic content, but these AI practices are seldom explained to users.
Furthermore, while developers adopt different explainable NLP techniques to improve AI fairness in content moderation, few are applied to help social media users decide whether they should trust and allow AI to remove or suppress potentially toxic content.
Moreover, biases exist in the natural language processing (NLP) models currently used for content moderation, especially in toxicity detection (Hosseini et al. 2017).
We adopt a Research-through-Design approach to explore different explanations in toxicity detection and evaluate their usefulness through user studies.


Research Questions
1. What explanations are needed for social media users when toxic content moderation happens?
2. What explainable NLP techniques can be applied for design to increase the transparency of toxic content moderation?
Phase I
Research through Design
Exploring AI use cases in content moderation and applying different explainable NLP techniques
Use cases
Scenario 1:
Users read their feeds and try to report potentially toxic content; the AI first gives a prediction and asks whether they still want to report it (Twitter).

Scenario 2:
Users may post potentially toxic comments; the AI first notifies them and asks whether they want to keep the content (Instagram).
Scenario 3:
Users may view a post written in African American English (AAE) as toxic and report it; the AI notifies them that bias may exist and asks whether they want to learn more.

Designing different explanations for AI decisions
- Visualize saliency maps (see the sketch below)
- Generate natural language explanations
- Use existing linguistic rules
- Provide raw examples
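
A minimal sketch of the saliency-map idea, assuming the Hugging Face model "unitary/toxic-bert" and a simple gradient-norm heuristic; both are illustrative stand-ins rather than the exact detector or attribution method used in the prototypes.

```python
# Sketch: gradient-based token saliency for a toxicity classifier.
# Assumptions: the model name "unitary/toxic-bert" and the gradient-norm
# importance heuristic are illustrative; swap in whichever off-the-shelf
# detector and attribution method a prototype actually uses.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "unitary/toxic-bert"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def token_saliency(text: str):
    """Return (token, importance) pairs for the top predicted label."""
    enc = tokenizer(text, return_tensors="pt")
    # Embed the tokens ourselves so we can take gradients w.r.t. the embeddings.
    embeds = model.get_input_embeddings()(enc["input_ids"]).detach()
    embeds.requires_grad_(True)
    out = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"])
    out.logits[0].max().backward()  # gradient of the top class score
    # L2 norm of each token's embedding gradient as a crude importance score.
    importance = embeds.grad[0].norm(dim=-1)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return list(zip(tokens, importance.tolist()))

for token, score in token_saliency("You are such an idiot"):
    print(f"{token:>12}  {score:.3f}")
```

In a prototype, these per-token scores would be rendered as highlight intensities over the flagged post rather than printed as numbers.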

Exploring off-the-shelf AI to detect toxic content
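
A minimal sketch of this step, assuming the publicly available Hugging Face model "unitary/toxic-bert" and a 0.5 flagging threshold; the model choice, its label names, and the threshold are assumptions for illustration, not the exact setup we evaluated.

```python
# Sketch: screening posts with an off-the-shelf toxicity classifier.
# Assumptions: model name, label names, and the 0.5 threshold are illustrative.
from transformers import pipeline

toxicity = pipeline("text-classification", model="unitary/toxic-bert")

posts = [
    "Have a great day!",
    "You are such an idiot.",
]

for post, result in zip(posts, toxicity(posts)):
    # `result` holds the top-scoring label, e.g. {"label": "toxic", "score": 0.97}.
    flagged = result["score"] >= 0.5
    print(f"{post!r} -> {result['label']} ({result['score']:.2f}), flagged: {flagged}")
```

Scores like these drive the scenarios above: the prototype surfaces the prediction and lets the user decide what to do with it.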


Selecting Toxic Content

Building interactive prototype
Scenario 1: Users Read Feeds and Report Potentially Toxic Content


Scenario 2: Users Post Potentially Toxic Content


Scenario 3: Users View an AAE Post as Toxic and Report It

Phase II
Evaluating High-Fi Prototypes with User Studies
User Interview

Key Findings
The Dilemmas of Algorithmic Content Moderation
Participants found it necessary for social media platforms to take an active role in content moderation, but they were also concerned about the extent of that moderation.
- “I agree to use this platform. There's just some certain guarantees, like safety, for example, I can't go walk into an airport and like, start making serious threats. There's just like consequences for those actions, and you like to violate it? So do you think it is necessary? I think, yeah, they should take an active role.” - P2
Participants saw a combined use of humans and AI as the ideal form of content moderation, but most felt that AI should only provide suggestions while humans make the final decisions.
- “My first go to is like some sort of mix where you can have, like, algorithms be some sort of guardrail, but also ensure that people who are building the algorithms and then people who are working coat like side by side with it as platform like employees, composed like a diverse mix of people, so they can kind of play out.” - P1
The Utility of Designing AI Explanations for Users
Participants were able to truly visualize the heuristics of algorithmic content moderation.
- “It definitely made me reflective on how I how I view content, moderation, other people's posts, and what makes me comfortable in terms of misinformation or truthful information.” - P1
We received detailed, specific feedback in terms of features, wording tweaks, and cognition.
- “I wish I could give feedback to the platform and be like this is something you guys need to take like a closer look at.” - P1
In an interactive setting, users learned more deeply about how algorithms work and about their biases.
- “It is important for me to understand how algorithms work and how they were developed” - P5
Participants learned specifics about algorithmic bias, and in many cases the experience eased their concerns.
- “I think it's just a really good combination of the algorithm and informing the user. It eliminated my concerns.” - P2
But obviously, not in all cases.
- “I just think that now I would just be more suspicious of an algorithm.” - P3
When Algorithms Fail
Participants understood that algorithms have faults and limitations.
- “There’s certain things that the algorithm wouldn't catch” - P5
Some of these limitations aren't fixable.
- “You can never win when it comes to these types of issues of racial biases or like biased algorithms are like problematic.” - P1
However, users still believed in the necessity of algorithms and in trying to mitigate these issues, despite the complexity that failures bring.
- “Like it's (the algorithm) made by a person. There's an old saying, “garbage in garbage out” right? … But you need to try to make sure it is as transparent as possible… But I think that's impossible. So it's not going to be transparent. But that doesn't mean that's the right way to do it. I think you still should go in that direction, but I was just gonna be difficult… So, I think it is necessary, but there is a certain complex complexity behind it.” - P4
Design Implications
User Needs and Solutions

- Need: Moderation is necessary, but humans should make the final decisions.
  Solution: Allow customizable preference settings.

- Need: AAE bias detection and explanation is very useful, but the information could be conveyed more clearly.
  Solution: Be concise and make it visually enticing.

- Need: Explanations are useful but wordy and hard to understand.
  Solution: Use more visual cues and simpler language.
Refined Design

AAE Flow
