Durham Research Online
Cotton pan-genome retrieves the lost sequences and genes during domestication and selection.

Li, J. and Yuan, D. and Wang, P. and Wang, Q. and Sun, M. and Liu, Z. and Si, H. and Xu, Z. and Ma, Y. and Zhang, B. and Pei, L. and Tu, L. and Zhu, L. and Chen, L.-L. and Lindsey, K. and Zhang, X. and Jin, S. and Wang, M. (2021) 'Cotton pan-genome retrieves the lost sequences and genes during domestication and selection.', Genome Biology, 22, p. 119.


Background: Millennia of directional human selection have reshaped the genomic architecture of cultivated cotton relative to wild counterparts, but we have limited understanding of the selective retention and fractionation of genomic components.

Results: We construct a comprehensive genomic variome based on 1961 cottons and identify 456 Mb and 357 Mb of sequence with domestication and improvement selection signals and 162 loci, 84 of which are novel, including 47 loci associated with 16 agronomic traits. Using pan-genome analyses, we identify 32,569 and 8851 non-reference genes lost from Gossypium hirsutum and Gossypium barbadense reference genomes respectively, of which 38.2% (39,278) and 14.2% (11,359) of genes exhibit presence/absence variation (PAV). We document the landscape of PAV selection accompanied by asymmetric gene gain and loss and identify 124 PAVs linked to favorable fiber quality and yield loci.

Conclusions: This variation repertoire points to genomic divergence during cotton domestication and improvement, which informs the characterization of favorable gene alleles for improved breeding practice using a pan-genome-based approach.

Item Type: Article
Full text: (VoR) Version of Record
Available under License - Creative Commons Attribution 4.0.
Publisher statement: Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Date accepted: 14 April 2021
Date deposited: 30 April 2021
Date of first online publication: 23 April 2021
Date first made open access: 30 April 2021
