Corazza, M., Menini, S., Cabrio, E., Tonelli, S., & Villata, S. (2019). Cross-platform evaluation for Italian hate speech detection. CLiC-it 2019 – 6th Annual Conference of the Italian Association for Computational Linguistics, Bari, Italy.
Open access: Yes
Notes: In this conference paper, Corazza et al. report a comparative evaluation of Italian hate speech datasets drawn from four platforms (Facebook, Twitter, Instagram and WhatsApp). Their findings are important for two reasons. First, they expand knowledge of hate speech detection to languages other than English. Second, they highlight the importance of cross-platform integration when analyzing hate speech, noting that “language used on social platforms has peculiarities that might not be present in generic corpora, and that it is therefore advisable to use domain-specific resources.”
Abstract: Despite the number of approaches recently proposed in NLP for detecting abusive language on social networks, the issue of developing hate speech detection systems that are robust across different platforms is still an unsolved problem. In this paper we perform a comparative evaluation on datasets for hate speech detection in Italian, extracted from four different social media platforms, i.e. Facebook, Twitter, Instagram and WhatsApp. We show that combining such platform-dependent datasets to take advantage of training data developed for other platforms is beneficial, although their impact varies depending on the social network under consideration.