Systematic Evaluation of Similarity Metrics for Retrieval, Reranking, and Completion in Retrieval Augmented Generation Systems

Elkıran, Harun; Rasheed, Jawad

doi:10.1109/ETECOM66111.2025.11319066

Date

2025

Authors

Elkıran, Harun

Rasheed, Jawad

Publisher

Institute of Electrical and Electronics Engineers Inc.

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Two major problems with large language models (LLMs) are hallucinations and out-of-context responses. Retrieval Augmented Generation (RAG) has emerged as a promising way to address them by grounding LLM output in external knowledge. The effectiveness of a RAG pipeline depends on several factors, including the choice of similarity metric. This paper presents a systematic evaluation of a comprehensive RAG pipeline that combines the Milvus vector database with HNSW indexing, OpenAI's embedding models, and GPT-based completion. We compare three widely used similarity metrics (Cosine, Inner Product, and L2) under identical conditions. The results show that retrieval and reranking performance are highly sensitive to the similarity metric. Cosine and Inner Product consistently achieve substantially higher recall (R@10 = 0.9092-0.925), Mean Reciprocal Rank (MRR = 0.7806-0.7930), and nDCG (nDCG@10 = 0.8121-0.8252) than L2. In contrast, completion-stage measures such as token usage, cost, and latency remain largely unaffected by the choice of metric. These results underscore the crucial role of the retrieval similarity function in determining RAG effectiveness.
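The abstract names three similarity metrics and three retrieval measures without giving formulas. The following is a minimal sketch of their common textbook definitions, assuming binary relevance judgments; it is not the authors' implementation and does not use Milvus or OpenAI APIs. All function names and the toy vectors are hypothetical and chosen only for illustration.

```python
import math

# --- Similarity metrics (textbook definitions; not taken from the paper) ---

def inner_product(a, b):
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Inner product of the length-normalized vectors; larger means more similar."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank(query, docs, score, higher_is_better=True):
    """Return document ids ordered from best to worst under `score`."""
    return sorted(docs, key=lambda d: score(query, docs[d]), reverse=higher_is_better)

# --- Retrieval measures (binary relevance is an assumption) ---

def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of relevant documents that appear in the top k."""
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0 if none is retrieved.
    MRR is the mean of this value over all queries."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """Discounted cumulative gain in the top k, divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Toy, unnormalized vectors (hypothetical, not data from the paper).
query = [1.0, 0.0]
docs = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [3.0, 3.0]}
relevant = {"a"}

by_cosine = rank(query, docs, cosine)                            # ['a', 'c', 'b']
by_ip = rank(query, docs, inner_product)                         # ['c', 'a', 'b']
by_l2 = rank(query, docs, l2_distance, higher_is_better=False)   # ['a', 'b', 'c']
```

On these toy vectors the three metrics produce three different orderings, because Inner Product rewards the long vector `c` while Cosine ignores vector length. This only illustrates that the choice of metric can change a ranking; it says nothing about the magnitude of the differences reported in the paper.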

Description

2025 IEEE International Conference on Emerging Trends in Engineering and Computing (ETECOM) / IEEE -- ISBN:979-833156616-6 -- 2025.

Keywords

HNSW Indexing, Reranking, Retrieval-Augmented Generation, Similarity Metrics, Vector Databases

Source

2025 IEEE International Conference on Emerging Trends in Engineering and Computing, ETECOM 2025

Citation

Elkiran, H., & Rasheed, J. (2025). Systematic evaluation of similarity metrics for retrieval, reranking, and completion in retrieval augmented generation systems. In 2025 IEEE International Conference on Emerging Trends in Engineering and Computing (ETECOM) (pp. 1–5). IEEE. https://doi.org/10.1109/ETECOM66111.2025.11319066

Link

https://doi.org/10.1109/ETECOM66111.2025.11319066
https://hdl.handle.net/20.500.12436/9402

Collection

Computer Engineering Department Collection
Scopus Indexed Publications Collection
