
Bully Watch AI

Project Description
Designing AI Explanations for Social Media Toxic Content Moderation
Group Members
Houjiang Liu, Didi Zhou, Alex Boltz, Xueqing Wang
Instructor
Min Kyung Lee
Timeline
Jan 1 to May 8, 2022
Report
Problem Space
Social media platforms rely heavily on AI to remove or suppress potentially toxic content, but these AI practices are seldom explained to users.
Furthermore, while developers adopt different explainable NLP techniques to improve AI fairness in content moderation, few are applied to help social media users decide whether they should trust and allow AI to remove or suppress potentially toxic content.
Moreover, biases exist in the natural language processing (NLP) models currently used for content moderation, especially in toxicity detection (Hosseini et al. 2017).
We adopt a Research-through-Design approach to explore different explanations in toxicity detection and evaluate their usefulness through user studies.


Research Questions
1. What explanations are needed for social media users when toxic content moderation happens?
2. What explainable NLP techniques can be applied for design to increase the transparency of toxic content moderation?
Phase I
Research through Design
Exploring AI use cases in content moderation and applying different explainable NLP techniques
Use cases
Scenario 1:
Users read their feeds and try to report potentially toxic content; the AI first gives a prediction and asks whether they still want to report it (Twitter).

Scenario 2:
Users may post potentially toxic comments; the AI first notifies them and asks whether they want to keep the content (Instagram).
Scenario 3:
Users may view a post written in African American English (AAE) as toxic and report it; the AI notifies them that bias may exist and asks whether they want to learn more.

Designing different explanations for AI decisions
- Visualize saliency maps (see the sketch below)
- Generate natural language explanations
- Use existing linguistic rules
- Provide raw examples
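
A minimal sketch of the saliency-map idea, assuming the Hugging Face model "unitary/toxic-bert" and a simple gradient-norm heuristic; both are illustrative stand-ins rather than the exact detector or attribution method used in the prototypes.

```python
# Sketch: gradient-based token saliency for a toxicity classifier.
# Assumptions: the model name "unitary/toxic-bert" and the gradient-norm
# importance heuristic are illustrative; swap in whichever off-the-shelf
# detector and attribution method a prototype actually uses.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "unitary/toxic-bert"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def token_saliency(text: str):
    """Return (token, importance) pairs for the top predicted label."""
    enc = tokenizer(text, return_tensors="pt")
    # Embed the tokens ourselves so we can take gradients w.r.t. the embeddings.
    embeds = model.get_input_embeddings()(enc["input_ids"]).detach()
    embeds.requires_grad_(True)
    out = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"])
    out.logits[0].max().backward()  # gradient of the top class score
    # L2 norm of each token's embedding gradient as a crude importance score.
    importance = embeds.grad[0].norm(dim=-1)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return list(zip(tokens, importance.tolist()))

for token, score in token_saliency("You are such an idiot"):
    print(f"{token:>12}  {score:.3f}")
```

In a prototype, these per-token scores would be rendered as highlight intensities over the flagged post rather than printed as numbers.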

Exploring off-the-shelf AI to detect toxic content
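
A minimal sketch of this step, assuming the publicly available Hugging Face model "unitary/toxic-bert" and a 0.5 flagging threshold; the model choice, its label names, and the threshold are assumptions for illustration, not the exact setup we evaluated.

```python
# Sketch: screening posts with an off-the-shelf toxicity classifier.
# Assumptions: model name, label names, and the 0.5 threshold are illustrative.
from transformers import pipeline

toxicity = pipeline("text-classification", model="unitary/toxic-bert")

posts = [
    "Have a great day!",
    "You are such an idiot.",
]

for post, result in zip(posts, toxicity(posts)):
    # `result` holds the top-scoring label, e.g. {"label": "toxic", "score": 0.97}.
    flagged = result["score"] >= 0.5
    print(f"{post!r} -> {result['label']} ({result['score']:.2f}), flagged: {flagged}")
```

Scores like these drive the scenarios above: the prototype surfaces the prediction and lets the user decide what to do with it.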


Selecting Toxic Content

Building interactive prototype
Scenario 1: Users Read Feeds and Report Potentially Toxic Content


Scenario 2: Users Post Potentially Toxic Content


Scenario 3: Users View an AAE Post as Toxic and Report It

Phase II
Evaluating High-Fi Prototypes with User Studies
User Interview

Key Findings
The Dilemmas of Algorithmic Content Moderation
Participants found it necessary for social media platforms to take an active role in content moderation, but they were also concerned about the extent of that moderation.
- “I agree to use this platform. There's just some certain guarantees, like safety, for example, I can't go walk into an airport and like, start making serious threats. There's just like consequences for those actions, and you like to violate it? So do you think it is necessary? I think, yeah, they should take an active role.” - P2
Participants saw a combined use of humans and AI as the ideal form of content moderation, but most felt that AI should only provide suggestions while humans make the final decisions.
- “My first go to is like some sort of mix where you can have, like, algorithms be some sort of guardrail, but also ensure that people who are building the algorithms and then people who are working coat like side by side with it as platform like employees, composed like a diverse mix of people, so they can kind of play out.” - P1
The Utility of Designing AI Explanations for Users
Participants were able to truly visualize the heuristics of algorithmic content moderation.
- “It definitely made me reflective on how I how I view content, moderation, other people's posts, and what makes me comfortable in terms of misinformation or truthful information.” - P1
We received detailed, specific feedback in terms of features, wording tweaks, and cognition.
- “I wish I could give feedback to the platform and be like this is something you guys need to take like a closer look at.” - P1
In an interactive setting, users learned more deeply about how algorithms work and about their biases.
- “It is important for me to understand how algorithms work and how they were developed” - P5
Participants learned specifics about algorithmic bias, and in many cases the experience eased their concerns.
- “I think it's just a really good combination of the algorithm and informing the user. It eliminated my concerns.” - P2
But obviously, not in all cases.
- “I just think that now I would just be more suspicious of an algorithm.” - P3
When Algorithms Fail
Participants understood that algorithms have faults and limitations.
- “There’s certain things that the algorithm wouldn't catch” - P5
Some of these limitations aren't fixable.
- “You can never win when it comes to these types of issues of racial biases or like biased algorithms are like problematic.” - P1
However, users still believed in the necessity of algorithms and in trying to mitigate these issues, despite the complexity that failures bring.
- “Like it's (the algorithm) made by a person. There's an old saying, “garbage in garbage out” right? … But you need to try to make sure it is as transparent as possible… But I think that's impossible. So it's not going to be transparent. But that doesn't mean that's the right way to do it. I think you still should go in that direction, but I was just gonna be difficult… So, I think it is necessary, but there is a certain complex complexity behind it.” - P4
Design Implications
User Needs and Solutions

- Need: Moderation is necessary, but humans should make the final decisions.
  Solution: Allow customizable preference settings.

- Need: AAE bias detection and explanation is very useful, but the information could be conveyed more clearly.
  Solution: Be concise and make it visually enticing.

- Need: Explanations are useful but wordy and hard to understand.
  Solution: Use more visual cues and simpler language.
Refined Design

AAE Flow
