A Machine Learning Approach of Text Classification for High- and Low-Resource Languages

A large amount of textual data has been published online over the last decade owing to advances in information and communication technologies. Organizing and classifying large amounts of textual data automatically remains an open challenge, especially for languages with limited resources available online. In this study, two types of approaches are adopted for the experiments. The first is a traditional strategy that uses six classical state-of-the-art classification models (decision tree (DT), logistic regression (LR), support vector machine (SVM), k-nearest neighbour (k-NN), Naive Bayes (NB), and random forest (RF)) along with two ensemble methods (AdaBoost and gradient boosting (GB)); the second is our proposed voting-based ensembling scheme. Models are trained on a 75-25 split, where 75% of the data is used for training and 25% for testing. The classification models are evaluated on accuracy, precision, recall, and F1-score. The experimental outcomes show that, for the traditional approach, gradient boosting performed best for the limited-resource language with a 98.08% F1-score, while SVM performed best (97.34% F1-score) for the resource-rich language.
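The evaluation setup described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say whether the proposed voting scheme uses hard or soft voting, so hard (majority) voting is assumed here, and the function names `split_75_25`, `majority_vote`, and `binary_metrics`, as well as the toy labels, are hypothetical.

```python
import random
from collections import Counter


def split_75_25(samples, seed=0):
    """Shuffle the samples and return (75% train, 25% test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def majority_vote(per_model_predictions):
    """Hard voting (an assumption): per sample, keep the most common label.

    Ties are resolved in favour of the label seen first among the models.
    """
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*per_model_predictions)
    ]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data only: these labels and predictions are invented for illustration.
train, test = split_75_25(range(8))
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 0, 0, 0]
model_c = [0, 0, 1, 1, 0, 0, 1, 1]

voted = majority_vote([model_a, model_b, model_c])
single_scores = binary_metrics(y_true, model_a)
ensemble_scores = binary_metrics(y_true, voted)
print(single_scores)
print(ensemble_scores)
```

In this toy case each individual model makes two errors, but the errors fall on different samples, so the majority vote corrects them. That is the general motivation for voting ensembles, not a reproduction of the reported results.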

Keywords

Deep learning, Implied threat detection, Machine learning, Natural language processing

Source

Computational Intelligence

WoS Quartile

Q3

Volume

41

Issue

4

Citation

Raza, M. O., Mahoto, N. A., Shaikh, A., Pathan, N., Alshahrani, H., & Elmagzoub, M. A. (2025). A Machine Learning Approach of Text Classification for High‐ and Low‐Resource Languages. Computational Intelligence, 41(4). https://doi.org/10.1111/coin.70114

Link

https://doi.org/10.1111/coin.70114
https://hdl.handle.net/20.500.12436/9492

Collection

Computer Engineering Department Collection
Scopus Indexed Publications Collection
WoS Indexed Publications Collection
