A Hybrid Vision Transformer with Intra-Attention Architecture for Enhanced Medical Image Retrieval

The rapid growth of medical imaging techniques and the expansion of medical image repositories have created a strong need for accurate retrieval techniques that can efficiently find relevant images. This paper proposes a hybrid Vision Transformer (ViT) architecture with an intra-attention mechanism for enhanced image retrieval. The approach integrates the Convolutional Block Attention Module (CBAM) directly with the multi-head self-attention (MSA) of the ViT, enabling more adaptive and fine-grained feature refinement. Unlike traditional fusion-based methods, the model dynamically reweights feature representations by applying spatial and channel-wise attention at multiple transformer stages. With spatial attention applied at each MSA stage, the ViT learns to focus on medically significant image regions, while channel attention lets it prioritize the most informative features and suppress irrelevant information. Experimental results show that the proposed method outperforms standalone ViT features and other existing methods in efficiency, precision, and recall. These findings suggest that embedding CBAM within ViT’s self-attention layers can enhance retrieval accuracy while maintaining interpretability, making it a promising solution for medical image analysis.
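
The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea: CBAM-style channel and spatial attention reweighting the output of a self-attention stage. Everything in it is an assumption made for illustration, not the authors' implementation: the function names, the single attention head (instead of multi-head), the placement of CBAM after attention and before the residual connection, the omission of a class token, the 3x3 spatial kernel (the original CBAM uses 7x7), and the random weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(tokens, w1, w2):
    """tokens: (N, C). Returns one weight in (0, 1) per channel, shape (C,)."""
    avg = tokens.mean(axis=0)                      # average-pool over tokens
    mx = tokens.max(axis=0)                        # max-pool over tokens
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2   # shared two-layer MLP (ReLU)
    return sigmoid(mlp(avg) + mlp(mx))

def spatial_attention(tokens, grid, kernel):
    """tokens: (grid*grid, C). Returns one weight in (0, 1) per token."""
    avg = tokens.mean(axis=1).reshape(grid, grid)  # average-pool over channels
    mx = tokens.max(axis=1).reshape(grid, grid)    # max-pool over channels
    stacked = np.stack([avg, mx])                  # (2, grid, grid)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stacked, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((grid, grid))
    for i in range(grid):                          # naive same-size 2D convolution
        for j in range(grid):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out).reshape(-1)

def cbam_tokens(tokens, grid, w1, w2, kernel):
    """Channel attention first, then spatial attention, as in CBAM."""
    x = tokens * channel_attention(tokens, w1, w2)[None, :]
    return x * spatial_attention(x, grid, kernel)[:, None]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention (simplification of MSA)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def hybrid_block(x, grid, params):
    """Assumed placement: CBAM refines the attention output before the residual."""
    attended = self_attention(x, params["wq"], params["wk"], params["wv"])
    refined = cbam_tokens(attended, grid, params["w1"], params["w2"], params["kernel"])
    return x + refined

# Toy input: a 4x4 grid of patch tokens with 8 channels (hypothetical sizes).
rng = np.random.default_rng(0)
grid, channels = 4, 8
params = {
    "wq": rng.normal(size=(channels, channels)),
    "wk": rng.normal(size=(channels, channels)),
    "wv": rng.normal(size=(channels, channels)),
    "w1": rng.normal(size=(channels, 2)),          # reduction to 2 hidden units
    "w2": rng.normal(size=(2, channels)),
    "kernel": rng.normal(size=(2, 3, 3)),
}
tokens = rng.normal(size=(grid * grid, channels))
output = hybrid_block(tokens, grid, params)
```

In this sketch the channel weights rescale every token's feature vector, and the spatial weights rescale whole tokens according to their position on the patch grid, which mirrors the two roles the abstract assigns to channel and spatial attention.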

Description

IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS) / IEEE -- ISBN:979-8-3315-1481-5, 979-8-3315-1480-8 -- 2025.

Source

IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS)

Citation

Sucharitha, G., Rasheed, J., & Potluri, S. (2025). A Hybrid Vision Transformer with Intra-Attention Architecture for Enhanced Medical Image Retrieval. 1–6. https://doi.org/10.1109/avss65446.2025.11149925

Link

https://doi.org/10.1109/avss65446.2025.11149925
https://hdl.handle.net/20.500.12436/9623

Collection

Computer Engineering Department Collection
WoS Indexed Publications Collection
