Durham Research Online
A sparse regression approach to modelling the relation between galaxy stellar masses and their host haloes

Icaza-Lizaola, M and Bower, Richard G and Norberg, Peder and Cole, Shaun and Schaller, Matthieu and Egan, Stefan (2021) 'A sparse regression approach to modelling the relation between galaxy stellar masses and their host haloes.', Monthly Notices of the Royal Astronomical Society, 507 (3). pp. 4584-4602.


Sparse regression algorithms have been proposed as the appropriate framework to model the governing equations of a system from data, without needing prior knowledge of the underlying physics. In this work, we use sparse regression to build an accurate and explainable model of the stellar mass of central galaxies given properties of their host dark matter (DM) halo. Our data set comprises 9521 central galaxies from the EAGLE hydrodynamic simulation. By matching the host haloes to a DM-only simulation, we collect the halo mass and specific angular momentum at present time and for their main progenitors in 10 redshift bins from z = 0 to z = 4. The principal component of our governing equation is a third-order polynomial of the host halo mass, which models the stellar-mass–halo-mass relation. The scatter about this relation is driven by the halo mass evolution and is captured by second- and third-order correlations of the halo mass evolution with the present halo mass. An advantage of sparse regression approaches is that unnecessary terms are removed. Although we include information on halo specific angular momentum, these parameters are discarded by our methodology. This suggests that halo angular momentum has little connection to galaxy formation efficiency. Our model has a root mean square error (RMSE) of 0.167 log10(M*/M⊙), and accurately reproduces both the stellar mass function and central galaxy correlation function of EAGLE. The methodology appears to be an encouraging approach for populating the haloes of DM-only simulations with galaxies, and we discuss the next steps that are required.
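
The idea of a sparse fit that keeps a low-order polynomial and discards an uninformative input can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than taken from EAGLE, the variable `m` merely stands in for a scaled halo mass and `j` for a specific angular momentum that is unrelated to the target by construction, and the L1-penalised (LASSO) coordinate-descent solver is one common sparse regression technique, not necessarily the algorithm used in the paper.

```python
import random


def standardize(col):
    """Rescale a column to zero mean and unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / std for v in col]


def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso(columns, y, alpha, sweeps=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Assumes every column is standardized and y is centred.
    """
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)  # residual y - Xw, kept up to date incrementally
    for _ in range(sweeps):
        for k, col in enumerate(columns):
            # Covariance of feature k with the residual that ignores feature k.
            rho = sum(c * (r + w[k] * c) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha)
            if new_w != w[k]:
                delta = new_w - w[k]
                resid = [r - delta * c for c, r in zip(col, resid)]
                w[k] = new_w
    return w


# Synthetic stand-in data (hypothetical, not EAGLE).
rng = random.Random(0)
n_samples = 2000
m = [rng.uniform(-2.0, 2.0) for _ in range(n_samples)]  # "halo mass" proxy
j = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]     # unrelated "spin" proxy
y = [1.0 * a - 0.5 * a**2 + 0.1 * a**3 + rng.gauss(0.0, 0.1) for a in m]

names = ["m", "m^2", "m^3", "j"]
raw_columns = [m, [a**2 for a in m], [a**3 for a in m], j]
columns = [standardize(col) for col in raw_columns]
y_mean = sum(y) / n_samples
y_centred = [v - y_mean for v in y]

weights = lasso(columns, y_centred, alpha=0.05)
coefs = dict(zip(names, weights))
for name, value in coefs.items():
    print(f"{name:>4s}: {value:+.3f}")
```

In this toy setting the L1 penalty is expected to retain the polynomial terms in `m` and drive the coefficient of `j` to zero, which mirrors in miniature how a sparse method can drop an input that carries no information about the target. The penalty strength `alpha` is an arbitrary choice here; in practice it would be selected by validation.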

Item Type: Article
Full text: (VoR) Version of Record
Available under License - Creative Commons Attribution 4.0.
Publisher statement: © The Author(s) 2021. Published by Oxford University Press on behalf of Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Date accepted: 12 August 2021
Date deposited: 24 November 2021
Date of first online publication: 19 August 2021
Date first made open access: 24 November 2021
