A Machine Learning Approach of Text Classification for High- and Low-Resource Languages
| dc.authorwosid | FWU-2100-2022 | |
| dc.authorwosid | FOF-9383-2022 | |
| dc.authorwosid | S-4815-2016 | |
| dc.authorwosid | OEB-8153-2025 | |
| dc.authorwosid | GQQ-1607-2022 | |
| dc.authorwosid | LLM-4686-2024 | |
| dc.contributor.author | Raza, Muhammad Owais | |
| dc.contributor.author | Mahoto, Naeem Ahmed | |
| dc.contributor.author | Shaikh, Asadullah | |
| dc.contributor.author | Pathan, Nazia | |
| dc.contributor.author | Alshahrani, Hani Mohammed | |
| dc.contributor.author | Elmagzoub, Mohamed A. | |
| dc.contributor.department-temp | ||
| dc.date.accessioned | 2026-05-05T13:48:56Z | |
| dc.date.issued | 2025 | |
| dc.department | Mühendislik ve Doğa Bilimleri Fakültesi | |
| dc.description.abstract | A large amount of data have been published online in textual format for the last decade because of the advancement of information and communication technologies. This is an open challenge to organize and classify large amounts of textual data automatically, especially for a language that has limited resources available online. In this study, two types of approaches are adopted for experiments. First one is a traditional strategy that uses six (06) classical state-of-the-art classification models (1. decision tree (DT), 2. logistic regression (LR), 3. support vector machine (SVM), 4. k-nearest neighbour (k-NN), 5. Naive Bayes (NB), and 6. random forest (RF)) along with two (02) ensemble methods (1. Adaboost and 2. gradient boosting (GB)) and second modeling technique is our proposed voting based ensembling scheme. Models are trained on a 75-25 split where 75% of data is used for training and 25% for testing. The evaluation of the classification models is carried out based on accuracy, precision, recall, and F1-score indexes. The experimental outcomes witnessed that for the traditional approach, gradient boosting outperformed for the limited resource language with 98.08% F1-score, while SVM performed better (97.34% F1-score) for the resource-rich language. | |
| dc.identifier.citation | Raza, M. O., Mahoto, N. A., Shaikh, A., Pathan, N., Alshahrani, H., & Elmagzoub, M. A. (2025). A Machine Learning Approach of Text Classification for High‐ and Low‐Resource Languages. Computational Intelligence, 41(4). https://doi.org/10.1111/coin.70114 | |
| dc.identifier.doi | 10.1111/coin.70114 | |
| dc.identifier.endpage | 17 | |
| dc.identifier.issn | 0824-7935 | |
| dc.identifier.issn | 1467-8640 | |
| dc.identifier.issue | 4 | |
| dc.identifier.startpage | 1 | |
| dc.identifier.uri | https://doi.org/10.1111/coin.70114 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12436/9492 | |
| dc.identifier.volume | 41 | |
| dc.identifier.wos | 001550494700001 | |
| dc.identifier.wosquality | Q3 | |
| dc.indekslendigikaynak | Web of Science | |
| dc.language.iso | en | |
| dc.publisher | Wiley | |
| dc.relation.ispartof | Computational Intelligence | |
| dc.relation.publicationcategory | Article - International Refereed Journal - Student | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.subject | Deep learning | |
| dc.subject | Implied threat detection | |
| dc.subject | Machine learning | |
| dc.subject | Natural language processing | |
| dc.title | A Machine Learning Approach of Text Classification for High- and Low-Resource Languages | |
| dc.type | Article | |
| dspace.entity.type | Publication |
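
The abstract mentions a 75-25 train/test split, a voting-based ensemble, and F1-score evaluation without giving details. The following is only a minimal, dependency-free sketch of those three ideas, not the authors' implementation: the helper names (`split_75_25`, `majority_vote`, `f1_score`) are hypothetical, and hard majority voting is an assumption, since the record does not say whether the proposed scheme uses hard or soft voting.

```python
import random
from collections import Counter


def split_75_25(items, seed=0):
    """Shuffle a copy of `items` and split it into 75% train / 25% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]


def majority_vote(predictions):
    """Hard voting (an assumption): `predictions` holds one label list per model.

    For each sample, the label predicted by the most models wins. Ties go to
    the label that appears first among the models' predictions.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def f1_score(y_true, y_pred, positive):
    """F1-score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In practice each inner list passed to `majority_vote` would come from one trained classifier (for example the DT, LR, or SVM models named in the abstract) predicting on the held-out 25%.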









