Application of K-Nearest Neighbour Method to Detect Hate Speech in Twitter Posts

Authors

  • Yuda Septiawan, Institut Informatika dan Bisnis Darmajaya
  • Chairani Chairani

Keywords:

Sentiment analysis, K-Nearest Neighbor

Abstract

This research evaluates the effectiveness of the K-Nearest Neighbor (KNN) method in detecting hate speech on social media platforms, particularly Twitter. The tweet data were collected using the Twitter API and labelled with the SentiStrength method to determine sentiment polarity. KNN was then applied to classify the tweets by sentiment under two train-test split ratios, 90:10 and 80:20. The test results show that the performance of KNN in detecting hate speech is less than optimal. With the 90:10 split, the algorithm achieves an accuracy of 60.94% and an F-measure of 62%. With the 80:20 split, accuracy rises to 63.02% and the F-measure to 63%. Based on these results, it can be concluded that the KNN method has relatively low accuracy and an inconsistent ability to classify hate speech data on Twitter, especially in terms of the balance between precision and recall.
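
The evaluation procedure summarised in the abstract (KNN classification, a 90:10 and an 80:20 train-test split, accuracy and F-measure) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the feature representation, the distance metric, or the value of k, so the bag-of-words vectors, cosine similarity, and k=3 used here are assumptions, and the tweets and labels in `TOY_TWEETS` are invented placeholders rather than the study's data.

```python
import math
import random
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (assumed feature representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed metric)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, query, k=3):
    """Majority vote among the k training items most similar to the query.

    train is a list of (vector, label) pairs.
    """
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def split(rows, train_fraction, seed=0):
    """Shuffle reproducibly, then cut into training and test portions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

def evaluate(y_true, y_pred, positive="hate"):
    """Return accuracy, precision, recall and F-measure for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure

# Invented placeholder tweets; the labels are NOT SentiStrength output.
TOY_TWEETS = [
    ("i hate those people so much", "hate"),
    ("they are disgusting and stupid", "hate"),
    ("get out of here you stupid people", "hate"),
    ("i hate you and your disgusting friends", "hate"),
    ("so much hate for those idiots", "hate"),
    ("what a lovely sunny morning", "neutral"),
    ("watching the football match tonight", "neutral"),
    ("coffee with friends this morning", "neutral"),
    ("new phone arrived today", "neutral"),
    ("traffic is slow on the way home tonight", "neutral"),
]

if __name__ == "__main__":
    for fraction in (0.9, 0.8):  # the two split ratios named in the abstract
        train_rows, test_rows = split(TOY_TWEETS, fraction)
        train = [(vectorize(text), label) for text, label in train_rows]
        predicted = [knn_predict(train, vectorize(text)) for text, _ in test_rows]
        actual = [label for _, label in test_rows]
        acc, prec, rec, f1 = evaluate(actual, predicted)
        print(f"split {fraction:.0%}: accuracy={acc:.2f} "
              f"precision={prec:.2f} recall={rec:.2f} F-measure={f1:.2f}")
```

The scores this sketch prints depend entirely on the toy data and will not match the figures reported above. It also shows why the 90:10 split yields noisier estimates: the smaller the test portion, the more each individual misclassification shifts accuracy and F-measure.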

Published

2024-11-21