
Durham Research Online

VID-Trans-ReID: Enhanced Video Transformers for Person Re-identification

Alsehaim, A. and Breckon, T.P. (2022) 'VID-Trans-ReID: Enhanced Video Transformers for Person Re-identification.', BMVC 2022: The 33rd British Machine Vision Conference London, UK, 21-24 Nov 2022.


Video-based person Re-identification (Re-ID) has received increasing attention recently due to its important role within surveillance video analysis. Video-based Re-ID expands upon earlier image-based methods by extracting person features temporally across multiple video image frames. The key challenge within person Re-ID is extracting a robust feature representation that is invariant to the challenges of pose and illumination variation across multiple camera viewpoints. Whilst most contemporary methods use a CNN-based methodology, recent advances in vision transformer (ViT) architectures boost fine-grained feature discrimination via the use of multi-head attention without any loss of feature robustness. To specifically enable ViT architectures to effectively address the challenges of video person Re-ID, we propose two novel module constructs, Temporal Clip Shift and Shuffled (TCSS) and Video Patch Part Feature (VPPF), that boost the robustness of the resultant Re-ID feature representation. Furthermore, we combine our proposed approach with current best practices spanning both image- and video-based Re-ID, including camera view embedding. Our proposed approach outperforms existing state-of-the-art work on the MARS, PRID2011, and iLIDS-VID Re-ID benchmark datasets, achieving 96.36%, 96.63%, and 94.67% rank-1 accuracy respectively, and achieving 90.25% mAP on MARS.
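The abstract reports results as rank-1 accuracy and mean average precision (mAP). As a reading aid, the following is a minimal sketch of how these two retrieval metrics are commonly computed from a query-to-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map`, its arguments, and the toy data are assumptions made for illustration, and the same-camera gallery filtering used by standard Re-ID benchmark protocols is omitted for brevity.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Illustrative rank-1 accuracy and mAP (hypothetical helper).

    dist[i][j] is the distance between query i and gallery item j;
    query_ids and gallery_ids hold the person identity of each item.
    """
    rank1_hits = 0
    average_precisions = []
    for q, row in enumerate(dist):
        # Rank gallery items from smallest to largest distance.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == query_ids[q] for j in order]
        if not any(matches):
            continue  # query identity absent from gallery: not scored
        # Rank-1: the top-ranked gallery item has the query's identity.
        rank1_hits += int(matches[0])
        # Average precision: mean of precision at each correct match.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n = len(average_precisions)
    if n == 0:
        return 0.0, 0.0
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: two queries against a three-item gallery.
toy_dist = [[0.1, 0.5, 0.9],
            [0.8, 0.2, 0.3]]
toy_rank1, toy_map = rank1_and_map(toy_dist, [1, 2], [1, 2, 1])
```

In this sketch a query counts towards rank-1 only when its nearest gallery item shares its identity, whereas mAP also rewards ranking every remaining correct match early.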

Item Type: Conference item (Paper)
Full text: (VoR) Version of Record
Publisher statement: © 2022. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.
Date accepted: 30 September 2022
Date deposited: 13 October 2022
Date of first online publication: 21 November 2022
Date first made open access: 24 November 2022
