EvaRAG: Evaluating Advanced RAG Techniques With Indexing and Distance Metrics
| dc.authorscopusid | 59149323900 | |
| dc.authorscopusid | 57791962400 | |
| dc.contributor.author | Elkıran, Harun | |
| dc.contributor.author | Rasheed, Jawad | |
| dc.date.accessioned | 2026-04-08T12:48:46Z | |
| dc.date.issued | 2025 | |
| dc.department | Graduate Education Institute | |
| dc.department | Faculty of Engineering and Natural Sciences | |
| dc.description.abstract | Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) with external knowledge. Yet the performance of RAG pipelines is sensitive to design choices across retrieval, similarity metrics, indexing, and reranking. Despite growing adoption, little systematic work has explored the trade-offs between retrieval quality, semantic accuracy, computational efficiency, and cost in RAG systems. This study addresses this gap by conducting a comprehensive evaluation of RAG configurations across multiple dimensions. We propose a benchmarking framework that systematically varies retrievers (Fusion, HyDe, Hierarchical, SCaNN), indexing methods (HNSW, IVF, Flat), similarity metrics (Cosine, Inner Product, L2), and rerankers (BGE, minilm) over datasets of three scales (small, medium, and large). Performance is assessed through coverage, recall, MRR, and nDCG, while semantic quality is measured using correctness, faithfulness, and relevance. Efficiency is quantified via latency, throughput, and computational cost. Our experiments reveal that HNSW–IP–Fusion–minilm achieves the strongest semantic performance, with Coverage Retrieval of 0.942, Correctness of 0.909, and Faithfulness of 0.970, making it ideal for accuracy-critical tasks. Conversely, IVF–L2–Hierarchical demonstrates the lowest latency (1.736 ns) and cost, making it suitable for real-time deployments. Reranker analysis shows modest but consistent gains for minilm over BGE, while HyDe excels in precision at the expense of efficiency. Notably, no single configuration dominates; the optimal design depends on the application's needs, whether that is maximizing semantic accuracy, minimizing latency, or striking a balance between the two. By demonstrating concrete trade-offs, this work provides a practical foundation for scaling RAG pipelines across diverse domains, including information retrieval, enterprise search, and knowledge-intensive reasoning. | |
| dc.identifier.citation | Elkiran, H., & Rasheed, J. (2025). EvaRAG: Evaluating Advanced RAG Techniques With Indexing and Distance Metrics. IEEE Access, 13, 215724-215747. | |
| dc.identifier.doi | 10.1109/ACCESS.2025.3646665 | |
| dc.identifier.endpage | 215747 | |
| dc.identifier.issn | 2169-3536 | |
| dc.identifier.orcid | 0000-0002-5834-6210 | |
| dc.identifier.orcid | 0000-0003-3761-1641 | |
| dc.identifier.scopus | 2-s2.0-105025908643 | |
| dc.identifier.startpage | 215724 | |
| dc.identifier.uri | https://doi.org/10.1109/ACCESS.2025.3646665 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12436/9326 | |
| dc.identifier.volume | 13 | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
| dc.relation.ispartof | IEEE Access | |
| dc.relation.publicationcategory | Article - International Refereed Journal - Student | |
| dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/openAccess | |
| dc.subject | Data retrieval | |
| dc.subject | Large language model | |
| dc.subject | Natural language processing | |
| dc.subject | Question answering systems | |
| dc.subject | RAG | |
| dc.title | EvaRAG: Evaluating Advanced RAG Techniques With Indexing and Distance Metrics | |
| dc.type | Article | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | f9b9b46c-d923-42d3-b413-dd851c2e913a | |
| relation.isAuthorOfPublication.latestForDiscovery | f9b9b46c-d923-42d3-b413-dd851c2e913a |
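The abstract names three similarity metrics (Cosine, Inner Product, L2) and three ranking measures (recall, MRR, nDCG). The sketch below shows one common textbook definition of each, to make those terms concrete. It is not taken from the paper: the function names are hypothetical, relevance is assumed to be binary, and the authors' exact formulas (for example graded relevance or a different cutoff) may differ.

```python
import math


def inner_product(a, b):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product of the two vectors after length normalization."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return inner_product(a, b) / norm if norm else 0.0


def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents found in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """MRR: average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """nDCG@k with binary gains (assumption: a document is relevant or not)."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical query: three retrieved ids, two of which are relevant.
ranked = ["d3", "d1", "d7"]
relevant = {"d1", "d7"}
scores = {
    "recall@2": recall_at_k(ranked, relevant, 2),
    "rr": reciprocal_rank(ranked, relevant),
    "ndcg@3": ndcg_at_k(ranked, relevant, 3),
}
```

For unit-length vectors the three similarity metrics produce the same ranking, because the squared L2 distance equals 2 minus twice the inner product; they diverge only when embeddings are not normalized.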
Files
Original bundle
- Name: EvaRAG_Evaluating_Advanced_RAG_Techniques_With_Indexing_and_Distance_Metrics.pdf
- Size: 5.4 MB
- Format: Adobe Portable Document Format
- Description: Article file
License bundle
- Name: license.txt
- Size: 1.17 KB
- Format: Item-specific license agreed upon to submission