Irony and Sarcasm Detection in Turkish Texts: A Comparative Study of Transformer-Based Models and Ensemble Learning


ESER M., BİLGİN M.

APPLIED SCIENCES-BASEL, vol.15, no.23, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 15 Issue: 23
  • Publication Date: 2025
  • DOI Number: 10.3390/app152312498
  • Journal Name: APPLIED SCIENCES-BASEL
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Affiliated with Bursa Uludağ University: Yes

Abstract

Irony and sarcasm are forms of expression that emphasize the inconsistency between what is said and what is meant. Correctly classifying such expressions is an important text mining problem, especially on user-centered platforms such as social media. Due to the increasing prevalence of implicit expressions, this topic has become a significant area of research in Natural Language Processing (NLP). However, the simultaneous detection of ironic and sarcastic expressions is highly challenging, as both types of implicit sentiment often convey closely related meanings. To address the detection of irony and sarcasm, this study compares the performance of transformer-based models and an ensemble learning method on Turkish texts, using five textual datasets (monogram, bigram, trigram, quadrigram, and omnigram) that share the same textual content but differ in context length. To improve classification performance, an ensemble learning approach based on the Artificial Rabbit Optimization (ARO) algorithm was implemented, combining the outputs of the models to produce final predictions. The experimental results indicate that as the context width of the datasets increases, the models achieve better predictions, leading to improvements across all performance metrics. The ensemble learning method outperformed the individual models on all metrics, with performance increasing as the context expanded, achieving the highest success on the omnigram dataset with 76.71% accuracy, 74.64% precision, 73.29% sensitivity, and 73.96% F-score. This study demonstrates that both model architecture and data structure are decisive factors in text classification performance, showing that ensemble methods can make significant contributions to the effectiveness of deep learning solutions in low-resource languages.
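
The abstract does not spell out how the ARO-driven ensemble combines model outputs. A minimal sketch of the general idea of optimizer-tuned weighted soft voting is shown below; it uses a plain random search as a stand-in for the ARO metaheuristic, and all names (`model_probs`, `ensemble_accuracy`) and data are hypothetical illustrations, not the authors' implementation.

```python
# Illustrative sketch only: weighted soft-voting ensemble whose weights are
# tuned by a simple random search (a stand-in for the ARO metaheuristic
# referenced in the abstract). All names and data here are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical validation outputs: three transformer models, 200 samples,
# 2 classes (ironic/sarcastic vs. literal), plus ground-truth labels.
model_probs = rng.dirichlet(np.ones(2), size=(3, 200))   # shape (models, samples, classes)
y_true = rng.integers(0, 2, size=200)

def ensemble_accuracy(weights: np.ndarray) -> float:
    """Validation accuracy of the weighted soft-voting ensemble."""
    weights = weights / weights.sum()                      # normalize to a convex combination
    combined = np.tensordot(weights, model_probs, axes=1)  # shape (samples, classes)
    return float((combined.argmax(axis=1) == y_true).mean())

# Random-search stand-in for the optimizer: sample candidate weight vectors
# and keep the one that maximizes validation accuracy.
best_w, best_acc = None, -1.0
for _ in range(500):
    w = rng.random(3)
    acc = ensemble_accuracy(w)
    if acc > best_acc:
        best_w, best_acc = w / w.sum(), acc

print("best weights:", np.round(best_w, 3), "val accuracy:", round(best_acc, 3))
```

In an ARO-style setup, the random sampling loop would be replaced by the algorithm's population-based update rules, but the objective (validation performance of the weighted combination) and the final prediction step remain the same.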