A Machine Learning Approach of Text Classification for High- and Low-Resource Languages

Publisher

Wiley

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

A large amount of textual data has been published online over the last decade because of advances in information and communication technologies. Organizing and classifying such large volumes of text automatically remains an open challenge, especially for a language that has limited resources available online. In this study, two types of approaches are adopted for the experiments. The first is a traditional strategy that uses six classical, state-of-the-art classification models (decision tree (DT), logistic regression (LR), support vector machine (SVM), k-nearest neighbour (k-NN), Naive Bayes (NB), and random forest (RF)) along with two ensemble methods (AdaBoost and gradient boosting (GB)). The second is our proposed voting-based ensembling scheme. Models are trained on a 75-25 split, where 75% of the data is used for training and 25% for testing. The classification models are evaluated on accuracy, precision, recall, and F1-score. The experimental results show that, for the traditional approach, gradient boosting performed best for the limited-resource language with a 98.08% F1-score, while SVM performed better (97.34% F1-score) for the resource-rich language.
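
As an illustration of the evaluation setup described in the abstract, the following is a minimal sketch of a 75-25 split, a voting step, and the four reported metrics. The abstract does not specify how the proposed voting scheme combines its base models, so hard majority voting is assumed here. Binary labels are also an assumption, made for brevity. All function names and the toy label lists are hypothetical and are not taken from the paper or its data.

```python
from collections import Counter
import random


def train_test_split_75_25(items, seed=0):
    """Shuffle a copy of `items` and split it into 75% train / 25% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def majority_vote(predictions_per_model):
    """Hard voting (assumed): per sample, pick the label most models predicted.

    `predictions_per_model` holds one equal-length list of labels per model.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration only (not results from the study).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 0, 0, 0]
model_c = [0, 0, 1, 1, 0, 1, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
ensemble_scores = binary_metrics(y_true, ensemble)
```

In this toy case each base model makes its errors on different samples, so the majority vote corrects all of them. That is the usual motivation for a voting ensemble, although it says nothing about how the scheme in the paper actually performs.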

Keywords

Deep learning, Implied threat detection, Machine learning, Natural language processing

Source

Computational Intelligence

Volume

41

Issue

4

Citation

Raza, M. O., Mahoto, N. A., Shaikh, A., Pathan, N., Alshahrani, H., & Elmagzoub, M. A. (2025). A Machine Learning Approach of Text Classification for High‐ and Low‐Resource Languages. Computational Intelligence, 41(4). https://doi.org/10.1111/coin.70114
