{"id":25221,"date":"2024-09-16T16:37:15","date_gmt":"2024-09-16T10:52:15","guid":{"rendered":"https:\/\/www.revoscience.com\/en\/?p=25221"},"modified":"2024-09-16T16:37:22","modified_gmt":"2024-09-16T10:52:22","slug":"ai-powered-tool-detected-hate-speech-in-southeast-asian-languages","status":"publish","type":"post","link":"https:\/\/www.revoscience.com\/en\/ai-powered-tool-detected-hate-speech-in-southeast-asian-languages\/","title":{"rendered":"AI-powered tool detected hate speech in Southeast Asian languages"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"675\" height=\"421\" src=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-675x421.jpg\" alt=\"\" class=\"wp-image-25222\" style=\"aspect-ratio:16\/9;object-fit:cover\" title=\"\" srcset=\"https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-675x421.jpg 675w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-642x400.jpg 642w, https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg 740w\" sizes=\"auto, (max-width: 675px) 100vw, 675px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Researchers developed SGHateCheck, the first functional test specifically tailored to evaluate hate speech in the multilingual environments of Singapore and the broader Southeast Asia.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The internet, and particularly social media, has grown exponentially over the last decades. The nature of social media allows anyone to go online and create content they find interesting, whether appropriate or not. 
One form of inappropriate content is hate speech\u2014offensive or threatening speech targeting certain people based on their ethnicity, religion, sexual orientation, and the like.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Hate speech detection models are computational systems that identify and classify online comments as hate speech. \u201cThese models are crucial in moderating online content and mitigating the spread of harmful speech, particularly on social media,\u201d said Assistant Professor Roy Lee from the Singapore University of Technology and Design (SUTD). Evaluating the performance of hate speech detection models is important, but traditional evaluation using held-out test sets often fails to properly assess the model\u2019s performance due to inherent bias within the datasets.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To overcome this limitation, HateCheck and Multilingual HateCheck (MHC) were introduced as functional tests that capture the complexity and diversity of hate speech by simulating real-world scenarios. In their paper \u201c<a href=\"https:\/\/doi.org\/10.18653\/v1\/2024.woah-1.24\" target=\"_blank\" rel=\"noopener\">SGHateCheck: Functional tests for detecting hate speech in low-resource languages of Singapore<\/a>\u201d, Asst Prof Lee and his team build on the frameworks of HateCheck and MHC to develop SGHateCheck, an artificial intelligence (AI)-powered tool that can distinguish between hateful and non-hateful comments in the specific context of Singapore and Southeast Asia.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Creating an evaluation tool specifically for the region\u2019s linguistic and cultural context was necessary. This is because current hate speech detection models and datasets are mostly based on Western contexts, which do not accurately represent specific social dynamics and issues in Southeast Asia. 
\u201cSGHateCheck aims to address these gaps by providing functional tests tailored to the region\u2019s specific needs, ensuring more accurate and culturally sensitive detection of hate speech,\u201d said Asst Prof Lee.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Unlike HateCheck and MHC, SGHateCheck uses large language models (LLMs) to translate and paraphrase test cases into Singapore\u2019s four main languages\u2014English, Mandarin, Tamil, and Malay. Native annotators then refine these test cases to ensure cultural relevance and accuracy. The end result is over 11,000 test cases meticulously annotated as hateful or non-hateful, which allows for a more nuanced platform to evaluate hate speech detection models.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Moreover, while MHC includes many languages, it does not have the same level of regional specificity as SGHateCheck. A comprehensive list of functional tests tailored to the region\u2019s distinct linguistic features (for example, Singlish) paired with expert guidance ensures that SGHateCheck tests are useful and relevant. \u201cThis regional focus allows SGHateCheck to more accurately capture and evaluate the manifestations of hate speech that may not be adequately addressed by broader, more general frameworks,\u201d emphasized Asst Prof Lee.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The team also found that LLMs trained on monolingual datasets are often biased towards non-hateful classifications. On the other hand, LLMs trained on multilingual datasets have a more balanced performance and can more accurately detect hate speech across various languages due to their exposure to a broader range of language expressions and cultural contexts. This underscores the importance of including culturally diverse and multilingual training data for applications in multilingual regions.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">SGHateCheck was specifically developed to solve a real-world issue in Southeast Asia. 
It is poised to play a significant role by enhancing the detection and moderation of hate speech in online environments in the region, helping to foster a more respectful and inclusive online space. Social media, online forums and community platforms, and news and media websites are just some of the many areas where the implementation of SGHateCheck will be valuable.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fortunately, a new content moderation application that uses SGHateCheck is already on Asst Prof Lee\u2019s list of future plans. He also aims to expand SGHateCheck to include other Southeast Asian languages such as Thai and Vietnamese.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">SGHateCheck demonstrates how SUTD\u2019s ethos of integrating cutting-edge technological advancements with thoughtful design principles can lead to impactful real-world solutions. Through the use of design, AI, and technology, SGHateCheck was developed to analyze local languages and social dynamics in order to meet a specific societal need.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u201cBy focusing on creating a hate speech detection tool that is not only technologically sophisticated but also culturally sensitive, the study underscores the importance of human-centered approach in technological research and development,\u201d said Asst Prof Lee.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers developed SGHateCheck, the first functional test specifically tailored to evaluate hate speech in the multilingual environments of Singapore and the broader Southeast 
Asia.<\/p>\n","protected":false},"author":2,"featured_media":25222,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[314,14,47],"tags":[],"class_list":["post-25221","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-society","category-innovation","category-it"],"featured_image_urls":{"full":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-200x200.jpg",200,200,true],"medium":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-642x400.jpg",642,400,true],"medium_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-675x421.jpg",675,421,true],"1536x1536":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"2048x2048":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"ultp_layout_landscape_large":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"ultp_layout_landscape":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"ultp_layout_portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-600x461.jpg",600,461,true],"ultp_layout_square":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-600x461.jpg",600,461,true],"newspaper-x-single-post":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",740,461,false],"newspaper-x-recent-post-big":["https:\/\/www.revoscience.com\/en\/wp-content\/uploa
ds\/2024\/09\/hatespeech-symbol-550x360.jpg",550,360,true],"newspaper-x-recent-post-list-image":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol-95x65.jpg",95,65,true],"web-stories-poster-portrait":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",640,399,false],"web-stories-publisher-logo":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",96,60,false],"web-stories-thumbnail":["https:\/\/www.revoscience.com\/en\/wp-content\/uploads\/2024\/09\/hatespeech-symbol.jpg",150,93,false]},"author_info":{"info":["RevoScience"]},"category_info":"<a href=\"https:\/\/www.revoscience.com\/en\/category\/society\/\" rel=\"category tag\">Society<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/innovation\/\" rel=\"category tag\">Innovation<\/a> <a href=\"https:\/\/www.revoscience.com\/en\/category\/news\/it\/\" rel=\"category tag\">IT<\/a>","tag_info":"IT","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25221","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/comments?post=25221"}],"version-history":[{"count":1,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25221\/revisions"}],"predecessor-version":[{"id":25223,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/posts\/25221\/revisions\/25223"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media\/25222"}],"wp:attachment":[{"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/media?parent=25221"}],"wp:term":[{"taxonomy":"cat
egory","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/categories?post=25221"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.revoscience.com\/en\/wp-json\/wp\/v2\/tags?post=25221"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}