Durham Research Online

IEViT: An Enhanced Vision Transformer Architecture for Chest X-ray Image Classification

Okolo, Gabriel Iluebe and Katsigiannis, Stamos and Ramzan, Naeem (2022) 'IEViT: An Enhanced Vision Transformer Architecture for Chest X-ray Image Classification', Computer Methods and Programs in Biomedicine, 226, p. 107141.

Abstract

Background and Objective: Chest X-ray imaging is a relatively cheap and accessible diagnostic tool that can assist in the diagnosis of various conditions, including pneumonia, tuberculosis, COVID-19, and others. However, the requirement for expert radiologists to view and interpret chest X-ray images can be a bottleneck, especially in remote and deprived areas. Recent advances in machine learning have made possible the automated diagnosis of chest X-ray scans. In this work, we examine the use of a novel Transformer-based deep learning model for the task of chest X-ray image classification.

Methods: We first examine the performance of the Vision Transformer (ViT) state-of-the-art image classification machine learning model for the task of chest X-ray image classification, and then propose and evaluate the Input Enhanced Vision Transformer (IEViT), a novel enhanced Vision Transformer model that can achieve improved performance on chest X-ray images associated with various pathologies.

Results: Experiments on four chest X-ray image data sets containing various pathologies (tuberculosis, pneumonia, COVID-19) demonstrated that the proposed IEViT model outperformed ViT for all the data sets and variants examined, achieving an F1-score between 96.39% and 100%, and an improvement over ViT of up to +5.82% in terms of F1-score across the four examined data sets. IEViT’s maximum sensitivity (recall) ranged between 93.50% and 100% across the four data sets, with an improvement over ViT of up to +3%, whereas IEViT’s maximum precision ranged between 97.96% and 100% across the four data sets, with an improvement over ViT of up to +6.41%.

Conclusions: Results showed that the proposed IEViT model outperformed all ViT’s variants for all the examined chest X-ray image data sets, demonstrating its superiority and generalisation ability. Given the relatively low cost and the widespread accessibility of chest X-ray imaging, the use of the proposed IEViT model can potentially offer a powerful, but relatively cheap and accessible method for assisting diagnosis using chest X-ray images.
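
The precision, sensitivity (recall), and F1-score figures quoted in the abstract are standard classification metrics. As a minimal sketch that is not taken from the paper, the Python function below computes them from confusion-matrix counts using their textbook definitions. The function name `precision_recall_f1` and the example counts are illustrative assumptions, not values or code from the study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Hypothetical helper for illustration only."""
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall (sensitivity): fraction of actual positives that were detected.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1-score: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Made-up counts, chosen only to show the call; not results from the paper.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
print(f"precision={precision:.2%} recall={recall:.2%} F1={f1:.2%}")
```

Because F1 is a harmonic mean, it stays close to the lower of precision and recall, which is why a model has to do well on both to reach a high F1-score.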

Item Type: Article
Full text: (VoR) Version of Record
Available under License: Creative Commons Attribution Non-commercial No Derivatives 4.0.
Status: Peer-reviewed
Publisher Web site: https://doi.org/10.1016/j.cmpb.2022.107141
Publisher statement: © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Date accepted: 14 September 2022
Date deposited: 10 November 2022
Date of first online publication: 23 September 2022
Date first made open access: 10 November 2022
