Peterson-Salahuddin, C. (2024). Repairing the harm: Toward an algorithmic reparations approach to hate speech content moderation. Big Data & Society, 11(2).
Link: https://doi.org/10.1177/20539517241245333
Open access: Yes
Notes: Platforms’ approaches to content moderation (editorial, flagging, and algorithmic alike) have been justifiably criticized for prioritizing the scale of online violence over its complexities and impacts. In this context, and building on scholarship that proposes content moderation models drawn from restorative and transformative justice, Peterson-Salahuddin outlines the need for a reparative approach organized around three key principles. First, it would be designed for redress rather than removal, facilitating reparations from the offender to the victims. Second, it would be designed for context rather than words, identifying the nature of the harm to avoid over- or under-blocking. Third, it would be designed for equity rather than equality, responding to larger systemic power imbalances. Through these design choices, “instead of being punitive, a reparative approach to algorithmic content moderation would use algorithms to establish reparations to address the deeper harm(s) the offending post perpetuated” (p. 7).
Abstract: Content moderation algorithms influence how users understand and engage with social media platforms. However, when identifying hate speech, these automated systems often contain biases that can silence or further harm marginalized users. Recently, scholars have offered both restorative and transformative justice frameworks as alternative approaches to platform governance to mitigate harms caused to marginalized users. As a complement to these recent calls, in this essay, I take up the concept of reparation as one substantive approach social media platforms can use alongside and within these justice frameworks to take actionable steps toward addressing, undoing and proactively preventing the harm caused by algorithmic content moderation. Specifically, I draw on established legal and legislative reparations frameworks to suggest how social media platforms can reconceptualize algorithmic content moderation in ways that decrease harm to marginalized users when identifying hate speech. I argue that the concept of reparations can reorient how researchers and corporate social media platforms approach content moderation, away from capitalist impulses and efficiency and toward a framework that prioritizes creating an environment where individuals from marginalized communities feel safe, protected and empowered.
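The three principles summarized in the notes above can be read as stages of a moderation pipeline. The following is a purely illustrative sketch, not anything proposed in the paper itself: every type name, group label, weight, and threshold below is hypothetical, and the contextual harm score is assumed to come from some upstream classifier that judges the nature of the speech act rather than matching keywords.

```python
from dataclasses import dataclass
from enum import Enum

# All names, groups, weights, and thresholds here are invented for
# illustration; the paper proposes principles, not an implementation.

MARGINALIZED_GROUPS = {"example_community_a", "example_community_b"}

class Action(Enum):
    NO_ACTION = "no_action"
    REDRESS = "redress"  # reparative workflow instead of removal

@dataclass
class Post:
    text: str
    author_id: str
    target_group: str | None  # community the post is directed at, if any

@dataclass
class Decision:
    action: Action
    rationale: str

def equity_weight(target_group: str | None) -> float:
    # Principle: equity over equality. The same words aimed at a
    # marginalized community are weighted more heavily than words
    # aimed at a dominant one (the weights are invented).
    return 1.5 if target_group in MARGINALIZED_GROUPS else 1.0

def moderate(post: Post, contextual_harm_score: float,
             threshold: float = 0.7) -> Decision:
    # Principle: context over words. `contextual_harm_score` is assumed
    # to come from an upstream classifier that distinguishes, e.g.,
    # reclaimed in-group language from an attack, instead of flagging
    # posts off a keyword blocklist (which risks over/under-blocking).
    score = contextual_harm_score * equity_weight(post.target_group)
    if score >= threshold:
        # Principle: redress over removal. Route the offender and the
        # harmed party into a reparations workflow (apology, repair,
        # education) rather than simply deleting the post.
        return Decision(Action.REDRESS, "open reparative process")
    return Decision(Action.NO_ACTION, "no contextual harm identified")

if __name__ == "__main__":
    post = Post(text="...", author_id="u1",
                target_group="example_community_a")
    print(moderate(post, contextual_harm_score=0.6))
```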