Systematic Evaluation of Similarity Metrics for Retrieval, Reranking, and Completion in Retrieval Augmented Generation Systems

dc.authorscopusid: 59149323900
dc.authorscopusid: 57791962400
dc.contributor.author: Elkıran, Harun
dc.contributor.author: Rasheed, Jawad
dc.contributor.department-temp
dc.date.accessioned: 2026-04-14T20:49:17Z
dc.date.issued: 2025
dc.department: Faculty of Engineering and Natural Sciences
dc.description: 2025 IEEE International Conference on Emerging Trends in Engineering and Computing (ETECOM) / IEEE -- ISBN:979-833156616-6 -- 2025.
dc.description.abstract: Two major problems with large language models (LLMs) are hallucinations and out-of-context responses. Retrieval Augmented Generation (RAG) has emerged as a promising approach to these problems because it grounds the output of LLMs in external knowledge. The effectiveness of a RAG pipeline depends on several factors, including the choice of similarity metric. This paper presents a systematic evaluation of a comprehensive RAG pipeline that uses the Milvus vector database with HNSW indexing in conjunction with OpenAI's embedding models and GPT-based completion. We conducted a comparative analysis of three widely used similarity metrics (Cosine, Inner Product, and L2) under identical conditions. The results show that retrieval and reranking performance is highly sensitive to the choice of similarity metric. Cosine and Inner Product consistently achieve substantially higher recall (R@10 = 0.9092-0.925), Mean Reciprocal Rank (MRR = 0.7806-0.7930), and nDCG (nDCG@10 = 0.8121-0.8252) than L2. In contrast, completion-stage metrics such as token usage, cost, and latency remain largely unaffected by the choice of metric. These results underscore the crucial role of retrieval similarity functions in determining RAG effectiveness.
dc.identifier.citation: Elkiran, H., & Rasheed, J. (2025). Systematic evaluation of similarity metrics for retrieval, reranking, and completion in retrieval augmented generation systems. In 2025 IEEE International Conference on Emerging Trends in Engineering and Computing (ETECOM) (pp. 1–5). IEEE. https://doi.org/10.1109/ETECOM66111.2025.11319066
dc.identifier.doi: 10.1109/ETECOM66111.2025.11319066
dc.identifier.endpage: 5
dc.identifier.isbn: 979-833156616-6
dc.identifier.orcid: 0000-0002-5834-6210
dc.identifier.orcid: 0000-0003-3761-1641
dc.identifier.scopus: 2-s2.0-105033366763
dc.identifier.startpage: 1
dc.identifier.uri: https://doi.org/10.1109/ETECOM66111.2025.11319066
dc.identifier.uri: https://hdl.handle.net/20.500.12436/9402
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.relation.ispartof: 2025 IEEE International Conference on Emerging Trends in Engineering and Computing, ETECOM 2025
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.subject: HNSW Indexing
dc.subject: Reranking
dc.subject: Retrieval-Augmented Generation
dc.subject: Similarity Metrics
dc.subject: Vector Databases
dc.title: Systematic Evaluation of Similarity Metrics for Retrieval, Reranking, and Completion in Retrieval Augmented Generation Systems
dc.type: Conference Object
dspace.entity.type: Publication
relation.isAuthorOfPublication: f9b9b46c-d923-42d3-b413-dd851c2e913a
relation.isAuthorOfPublication.latestForDiscovery: f9b9b46c-d923-42d3-b413-dd851c2e913a
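
Illustrative note on the quantities named in the abstract. The record does not include the paper's code, so the following is only a minimal sketch of the textbook definitions of the three similarity metrics and of Recall@k, reciprocal rank (averaged over queries to give MRR), and nDCG@k with binary relevance. The function names, the toy vectors, and the metric labels "COSINE", "IP", and "L2" are assumptions chosen for illustration. They are not taken from the paper and do not reproduce its Milvus/HNSW pipeline or its results.

```python
import math

def inner_product(a, b):
    # Sum of element-wise products; larger means more similar.
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Inner product of the two vectors after length normalization.
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )

def l2_distance(a, b):
    # Euclidean distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_documents(query, docs, metric):
    # Return document ids ordered from best to worst match under the metric.
    if metric == "L2":
        return sorted(docs, key=lambda d: l2_distance(query, docs[d]))
    score = cosine if metric == "COSINE" else inner_product
    return sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)

def recall_at_k(ranked_ids, relevant_ids, k):
    # Fraction of the relevant documents that appear in the top k.
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    # 1 / rank of the first relevant document, or 0 if none is retrieved.
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k):
    # Binary-relevance nDCG: discounted gain divided by the ideal gain.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0

# Hypothetical toy data: unnormalized 2-D vectors, one relevant document.
query = [1.0, 0.0]
docs = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [3.0, 3.0]}
relevant = {"a"}

for metric in ("COSINE", "IP", "L2"):
    ranking = rank_documents(query, docs, metric)
    print(metric, ranking,
          recall_at_k(ranking, relevant, 1),
          reciprocal_rank(ranking, relevant),
          ndcg_at_k(ranking, relevant, 10))
```

On this toy data the three metrics order the documents differently, because the vectors are not length-normalized: cosine ignores vector length, inner product rewards it, and L2 penalizes distance in absolute terms. This shows only that the choice of metric can change a ranking. It says nothing about which metric performs better in the evaluated system.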
