A Hybrid Vision Transformer with Intra-Attention Architecture for Enhanced Medical Image Retrieval

dc.authorwosid: DXR-9356-2022
dc.authorwosid: AAY-5207-2020
dc.authorwosid: ADZ-9019-2022
dc.contributor.author: Sucharitha, G.
dc.contributor.author: Rasheed, Jawad
dc.contributor.author: Potluri, Sirisha
dc.contributor.department-temp:
dc.date.accessioned: 2026-06-17T11:24:23Z
dc.date.issued: 2025
dc.department: Faculty of Engineering and Natural Sciences
dc.description: IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS) / IEEE -- ISBN:979-8-3315-1481-5, 979-8-3315-1480-8 -- 2025.
dc.description.abstract: The rapid growth in medical imaging techniques and the expansion of medical image repositories have created a strong need for accurate image retrieval techniques that can efficiently retrieve relevant images. This paper proposes a Hybrid Vision Transformer (ViT) architecture with an intra-attention mechanism for enhanced image retrieval. The approach integrates the Convolutional Block Attention Module (CBAM) directly with the multi-head self-attention (MSA) of the ViT, enabling more adaptive and fine-grained feature refinement. Unlike traditional fusion-based methods, the model dynamically reweights feature representations by leveraging spatial and channel-wise attention at multiple transformer stages. With spatial attention applied at each MSA stage, the ViT learns to focus more on medically significant image regions, while channel attention enables it to prioritize the most informative features and suppress irrelevant information. Experimental results demonstrate the advantage of the proposed method over standalone ViT features and other existing methods in terms of efficiency, precision, and recall. These findings suggest that embedding CBAM within the ViT's self-attention layers can enhance retrieval accuracy while maintaining interpretability, making it a promising solution for medical image analysis.
dc.identifier.citation: Sucharitha, G., Rasheed, J., & Potluri, S. (2025). A Hybrid Vision Transformer with Intra-Attention Architecture for Enhanced Medical Image Retrieval. 1–6. https://doi.org/10.1109/avss65446.2025.11149925
dc.identifier.doi: 10.1109/avss65446.2025.11149925
dc.identifier.endpage: 6
dc.identifier.isbn: 979-8-3315-1481-5
dc.identifier.isbn: 979-8-3315-1480-8
dc.identifier.issn: 2643-6205
dc.identifier.orcid: 0000-0003-3761-1641
dc.identifier.startpage: 1
dc.identifier.uri: https://doi.org/10.1109/avss65446.2025.11149925
dc.identifier.uri: https://hdl.handle.net/20.500.12436/9623
dc.identifier.wos: WOS:001588601200066
dc.indekslendigikaynak: Web of Science
dc.language.iso: en
dc.publisher: IEEE
dc.relation.ispartof: IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS)
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.title: A Hybrid Vision Transformer with Intra-Attention Architecture for Enhanced Medical Image Retrieval
dc.type: Conference Object
dspace.entity.type: Publication
relation.isAuthorOfPublication: f9b9b46c-d923-42d3-b413-dd851c2e913a
relation.isAuthorOfPublication.latestForDiscovery: f9b9b46c-d923-42d3-b413-dd851c2e913a
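
The abstract describes channel-wise and spatial attention from CBAM reweighting the features inside a ViT. The record does not give the paper's layer design, so the following is only a minimal sketch of CBAM-style gating applied to a matrix of token features, not the authors' implementation. The function names, the toy dimensions, the random weights, and the 1x1 mix used in place of CBAM's 7x7 convolution are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mlp(vec, w1, w2):
    """Shared two-layer MLP with ReLU, as in CBAM's channel branch."""
    hidden = [max(0.0, sum(v * w for v, w in zip(vec, row))) for row in w1]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]


def channel_attention(tokens, w1, w2):
    """One gate per channel, from average- and max-pooling over tokens."""
    channels = list(zip(*tokens))
    avg = [sum(c) / len(c) for c in channels]
    mx = [max(c) for c in channels]
    return [sigmoid(a + m) for a, m in zip(mlp(avg, w1, w2), mlp(mx, w1, w2))]


def spatial_attention(tokens, w_avg, w_max, bias=0.0):
    """One gate per token, from average- and max-pooling over channels.

    Assumption: a 1x1 mix of the two pooled values stands in for the
    7x7 convolution that CBAM applies to the pooled spatial maps.
    """
    return [sigmoid(w_avg * sum(t) / len(t) + w_max * max(t) + bias)
            for t in tokens]


def cbam_refine(tokens, w1, w2, w_avg=1.0, w_max=1.0):
    """Apply channel gating first, then spatial gating, to the tokens."""
    ch = channel_attention(tokens, w1, w2)
    scaled = [[x * g for x, g in zip(t, ch)] for t in tokens]
    sp = spatial_attention(scaled, w_avg, w_max)
    return [[x * g for x in t] for t, g in zip(scaled, sp)]


# Toy input: 4 tokens with 6 channels each; weights are random placeholders.
rng = random.Random(0)
num_tokens, dim, reduced = 4, 6, 2
tokens = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_tokens)]
w1 = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(reduced)]
w2 = [[rng.uniform(-1, 1) for _ in range(reduced)] for _ in range(dim)]
refined = cbam_refine(tokens, w1, w2)
```

Because every gate lies strictly between 0 and 1, the refinement can only shrink feature magnitudes; where such a step sits relative to each self-attention layer is a design choice that this record does not specify.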

Files

Original bundle

Name: Sucharitha-2025-A-hybrid-vision-transformer-with-in.pdf
Size: 674.29 KB
Format: Adobe Portable Document Format
Description: Proceedings file

License bundle

Name: license.txt
Size: 1.17 KB
Format: Item-specific license agreed upon to submission
Description: