A Machine Learning Approach of Text Classification for High- and Low-Resource Languages

dc.authorwosidFWU-2100-2022
dc.authorwosidFOF-9383-2022
dc.authorwosidS-4815-2016
dc.authorwosidOEB-8153-2025
dc.authorwosidGQQ-1607-2022
dc.authorwosidLLM-4686-2024
dc.contributor.authorRaza, Muhammad Owais
dc.contributor.authorMahoto, Naeem Ahmed
dc.contributor.authorShaikh, Asadullah
dc.contributor.authorPathan, Nazia
dc.contributor.authorAlshahrani, Hani Mohammed
dc.contributor.authorElmagzoub, Mohamed A.
dc.contributor.department-temp
dc.date.accessioned2026-05-05T13:48:56Z
dc.date.issued2025
dc.departmentMühendislik ve Doğa Bilimleri Fakültesi
dc.description.abstractOver the last decade, a large amount of data has been published online in textual format because of advances in information and communication technologies. Organizing and classifying large amounts of textual data automatically remains an open challenge, especially for a language that has limited resources available online. In this study, two types of approaches are adopted for experiments. The first is a traditional strategy that uses six classical state-of-the-art classification models ((1) decision tree (DT), (2) logistic regression (LR), (3) support vector machine (SVM), (4) k-nearest neighbour (k-NN), (5) Naive Bayes (NB), and (6) random forest (RF)) along with two ensemble methods ((1) AdaBoost and (2) gradient boosting (GB)); the second is our proposed voting-based ensembling scheme. Models are trained on a 75-25 split, where 75% of the data is used for training and 25% for testing. The classification models are evaluated on accuracy, precision, recall, and F1-score. The experimental outcomes show that, for the traditional approach, gradient boosting performed best for the limited-resource language with a 98.08% F1-score, while SVM performed best (97.34% F1-score) for the resource-rich language.
dc.identifier.citationRaza, M. O., Mahoto, N. A., Shaikh, A., Pathan, N., Alshahrani, H., & Elmagzoub, M. A. (2025). A Machine Learning Approach of Text Classification for High‐ and Low‐Resource Languages. Computational Intelligence, 41(4). https://doi.org/10.1111/coin.70114
dc.identifier.doi10.1111/coin.70114
dc.identifier.endpage17
dc.identifier.issn0824-7935
dc.identifier.issn1467-8640
dc.identifier.issue4
dc.identifier.startpage1
dc.identifier.urihttps://doi.org/10.1111/coin.70114
dc.identifier.urihttps://hdl.handle.net/20.500.12436/9492
dc.identifier.volume41
dc.identifier.wos001550494700001
dc.identifier.wosqualityQ3
dc.indekslendigikaynakWeb of Science
dc.language.isoen
dc.publisherWiley
dc.relation.ispartofComputational Intelligence
dc.relation.publicationcategoryArticle - International Refereed Journal - Student
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.subjectDeep learning
dc.subjectImplied threat detection
dc.subjectMachine learning
dc.subjectNatural language processing
dc.titleA Machine Learning Approach of Text Classification for High- and Low-Resource Languages
dc.typeArticle
dspace.entity.typePublication
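
The abstract mentions a 75-25 train/test split, a voting-based ensemble, and evaluation by accuracy, precision, recall, and F1-score. The sketch below illustrates those three ideas using only the Python standard library. It is not the authors' implementation: the record does not say whether the voting is hard or soft, so hard (majority) voting is assumed here, and the function names, the toy labels, and the three hypothetical model outputs are invented purely for illustration.

```python
import random
from collections import Counter


def split_75_25(items, seed=0):
    """Shuffle a copy of `items` and return (train, test) in a 75/25 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def hard_vote(predictions_per_model):
    """Majority vote per sample across models (assumed voting rule)."""
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy ground truth and three hypothetical base-model outputs (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 0, 0, 0]
model_c = [0, 0, 1, 1, 0, 0, 1, 0]

ensemble = hard_vote([model_a, model_b, model_c])
ensemble_scores = binary_metrics(y_true, ensemble)
model_a_scores = binary_metrics(y_true, model_a)
train_ids, test_ids = split_75_25(range(8))
```

In this toy setup each base model makes at least one mistake, but the mistakes fall on different samples, so the majority vote recovers every label. That is the usual motivation for voting ensembles; it says nothing about the figures reported in the article.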

Files

Original bundle

Name:
Raza-2025-A-machine-learning-approach-of-text.pdf
Size:
5.27 MB
Format:
Adobe Portable Document Format
Description:
Article file

License bundle

Name:
license.txt
Size:
1.17 KB
Format:
Item-specific license agreed upon to submission