
Durham Research Online

Multi-class 3D object detection within volumetric 3D computed tomography baggage security screening imagery

Wang, Q. and Bhowmik, N. and Breckon, T.P. (2021) 'Multi-class 3D object detection within volumetric 3D computed tomography baggage security screening imagery.', 19th IEEE International Conference on Machine Learning and Applications (ICMLA 2020), Miami, Florida, 14-17 December 2020.


Automatic detection of prohibited objects within passenger baggage is important for aviation security. X-ray Computed Tomography (CT) based 3D imaging is widely used in airports for aviation security screening, whilst prior work on automatic prohibited item detection has focused primarily on 2D X-ray imagery. Whilst some prior work has demonstrated the feasibility of extending deep convolutional neural network (CNN) based automatic prohibited item detection from 2D X-ray imagery to volumetric 3D CT baggage security screening imagery, it focuses on the detection of one specific type of object (e.g., either bottles or handguns). As a result, multiple models are needed if more than one type of prohibited item must be detected in practice. In this paper, we consider the detection of multiple object categories of interest using one unified framework. To this end, we formulate a more challenging multi-class 3D object detection problem within 3D CT imagery and propose a viable solution (3D RetinaNet) to tackle it. To enhance detection performance, we investigate a variety of strategies including data augmentation and varying backbone networks. Experimentation is carried out to provide both quantitative and qualitative evaluations of the proposed approach to multi-class 3D object detection within 3D CT baggage security screening imagery. Experimental results demonstrate that the combination of 3D RetinaNet and a series of favorable strategies can achieve a mean Average Precision (mAP) of 65.3% over five object classes (i.e. bottles, handguns, binoculars, glock frames, iPods). The overall performance is affected by the poor performance on glock frames and iPods, due to the lack of data and their resemblance to baggage clutter.
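For readers unfamiliar with the metric, the reported mAP is simply the per-class Average Precision (AP) averaged over the object classes evaluated. A minimal sketch of that final averaging step is below; the five class names come from the abstract, but the per-class AP values are hypothetical placeholders, not the paper's reported per-class results.

```python
# Minimal sketch: mean Average Precision (mAP) as the arithmetic mean of
# per-class AP values. Class names follow the abstract; the AP numbers
# below are hypothetical illustrations, NOT figures from the paper.

def mean_average_precision(per_class_ap):
    """Average per-class AP values to obtain mAP."""
    return sum(per_class_ap.values()) / len(per_class_ap)

per_class_ap = {
    "bottle": 0.90,       # hypothetical value
    "handgun": 0.85,      # hypothetical value
    "binoculars": 0.80,   # hypothetical value
    "glock frame": 0.35,  # hypothetical value
    "iPod": 0.37,         # hypothetical value
}

print(f"mAP: {mean_average_precision(per_class_ap):.3f}")
```

This also illustrates the abstract's closing observation: because mAP weights every class equally, two weak classes (here the hypothetical glock frame and iPod entries) pull the overall mean down even when the remaining classes score well.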

Item Type: Conference item (Paper)
Full text: (AM) Accepted Manuscript
Publisher Web site:
Publisher statement: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Date accepted: 16 September 2020
Date deposited: 27 October 2020
Date of first online publication: 23 February 2021
Date first made open access: 08 November 2022

