Cosmic Inflation and Genetic Algorithms

Large classes of standard single-field slow-roll inflationary models consistent with the required number of e-folds, the current bounds on the spectral index of scalar perturbations, the tensor-to-scalar ratio, and the scale of inflation can be efficiently constructed using genetic algorithms. The setup is modular and can be easily adapted to include further phenomenological constraints. A semi-comprehensive search for sextic polynomial potentials results in O(300,000) viable models for inflation. The analysis of this dataset reveals a preference for models with a tensor-to-scalar ratio in the range 0.0001 < r < 0.0004. We also consider potentials that involve cosine and exponential terms. In the last part we explore more complex methods of search relying on reinforcement learning and genetic programming. While reinforcement learning proves more difficult to use in this context, the genetic programming approach has the potential to uncover a multitude of viable inflationary models with new functional forms.


I. INTRODUCTION
The idea of cosmic inflation, a period of exponential growth for the scale factor of the Universe believed to have taken place between 10^-36 and 10^-32 seconds after the Big Bang singularity, was originally introduced to explain the initial conditions for the hot Big Bang model, which would otherwise lead to a number of conundrums such as the horizon problem, the flatness problem and the monopole (and other relics) problem [1][2][3][4][5]. Moreover, the theory provides a plausible account for the origin of structure in the Universe and explains the observed cosmic microwave background anisotropies by providing a mechanism for the generation of density perturbations [6][7][8][9][10].
The inflationary epoch is typically modelled as a quasi-exponential expansion of the very early universe caused by the slow roll of a scalar field φ, called the inflaton, in its evolution towards the minimum of a nearly flat potential V(φ). In this slow-roll regime, all the derivatives of the scalar field are negligible in comparison with the potential V(φ), interpreted as an approximately constant vacuum energy. The effective energy density and pressure of a homogeneous scalar field φ(t) hence satisfy, in the slow-roll regime, p_φ ≈ −ρ_φ, approximating the equation of state associated with a cosmological constant term. The expression for V(φ) should in principle be derived from a fundamental theory such as string theory; however, this task entails myriad subtleties and difficulties. This paper therefore takes a phenomenological approach and considers certain suitably parametrised classes of functions V(φ).
For a viable phenomenology, the inflaton potential must vanish at its global minimum and possess a nearly flat region adjacent to the minimum that induces the standard slow-roll dynamics. Within this region, the starting point for inflation must be sufficiently far from the global minimum to generate a large enough expansion to solve the cosmological problems of the Big Bang model. The minimum required amount of inflation is about 70 e-folds, which corresponds to an expansion by a factor of 10^30. All estimates of the minimal number of e-folds suffer from uncertainties of up to 10 e-folds, depending on the details of reheating.
More stringent constraints on the inflationary potential come from measurements of the CMB anisotropies, which give information about the primordial density perturbations and gravitational waves. The results included in the 2018 Planck cosmological release [11] are compatible with a simple power-law spectrum for the scalar perturbations with exponent n_s − 1 and a measured value of the spectral index n_s = 0.9649 ± 0.0042 at the pivot scale k_* = 0.05 Mpc^-1. The collaboration has also set tight constraints on the amount of inflationary gravitational waves, with an upper limit on the tensor-to-scalar ratio of r < 0.061. The bound on r can be rephrased as an upper bound on the energy scale of inflation V_* when the pivot scale exited the Hubble radius, given by V_* < (1.6 · 10^16 GeV)^4 [11]. However, the energy scale of inflation is fixed by the normalisation of the present-day power spectrum of density perturbations, (V_*/ε_*)^{1/4} = 6.7 · 10^16 GeV with a theoretical uncertainty of about 10% [12], where ε_* is one of the slow-roll parameters. In general, inflationary models predict a wide range of values for the number of e-folds, n_s, r and the energy scale of inflation, and many of them can be ruled out on this basis.
Finding inflationary potentials, possibly falling within some class motivated by an underlying theory, that satisfy all these conditions is a nontrivial task. For example, it involves performing an integral to determine the number of e-folds. Such a task is not amenable to direct optimisation techniques such as gradient descent, simulated annealing, or more direct forms of machine learning: it is not a simple regression problem, and it is not easy to express the problem in a single loss function. Moreover, finding good solutions is made more difficult by the fact that functions may have "local optima", namely regions that satisfy some but not all of the conditions for inflationary potentials, which may be widely separated from good solutions by large "barren" regions. Solving such convoluted systems is much more naturally expressed as the optimisation of a "fitness" within some "environment", comprising the set of operations that must be performed on a proposed function in order to tell whether or not it is a good solution. This suggests using either reinforcement learning (RL) (for a review see [13]) or genetic algorithms (GAs) [14][15][16][17][18][19][20][21][22][23] as a means of finding solutions.
The purpose of this paper is to show that large classes of viable inflationary potentials can indeed be constructed using GAs. Although GAs have been used, somewhat sporadically, in the context of particle physics and string theory [24][25][26][27][28][29][30][31][32][33][34][35][36], to our knowledge, they have not yet been used for the purpose of building cosmological models. In this paper we will be focusing on this specific goal of determining viable inflationary potentials. We will consider potentials whose functional dependence includes polynomials, exponentials and trigonometric functions as building blocks.
Within each class of functions, parametrised by a set of coefficients, we will search for those coefficient values that lead to viable inflationary models. The numerical ranges for the coefficients are discretised, turning the search into a discrete optimisation problem that is amenable to heuristic methods such as GAs or RL. GAs prove to be extremely efficient in identifying many viable potentials, while we find that RL turns out to be less successful. Thus GAs appear to provide a useful tool for inflationary model building.
Although GAs are the main focus of this paper, it is natural to ask whether more sophisticated forms of symbolic regression could be employed for the same purpose. In the final section we comment on the possible implementation of more elaborate algorithms that incorporate the evolutionary version of symbolic regression, namely genetic programming (GP). This could in principle simultaneously optimise for the functional dependence of the potential and its numerical coefficients. We present arguments suggesting that GP should be effective for the task of constructing simple inflationary potentials using a set of allowed operands if it is implemented in conjunction with the GA using speciation, that is the GP works at the level of a population of functions, while the GA is used to determine the coefficients in each function by optimising its fitness.

A. Binary encoding of inflationary potentials
In this study we will concentrate on three different classes of potentials: (1) polynomial, (2) polynomial + cos(cφ), and (3) polynomial + e^{cφ} × polynomial, where all the polynomials are chosen to have a fixed degree. Potentials of this type have been previously studied in the literature in various theoretical contexts.
The polynomial coefficients, as well as the numerical factor c in the argument of the exponential or the cosine represent the parameters of the model. In the study that we present in Section III, we will specify certain ranges of possible values for each parameter and then discretise them. The set of all the discrete parameter values forms our space of states {s}. The polynomial coefficients must be allowed to vary over several orders of magnitude and, accordingly, their discretisation will be spaced on a logarithmic scale. In view of the genetic algorithm implementation to be discussed below, the number of partitions is always chosen to be a power of 2 such that, for any given parameter, each discrete value can be assigned a binary label. Finally, all the labels are concatenated together into one long binary string that encodes the inflationary model. All the models belonging to the same functional class are encoded on strings of equal lengths.
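The encoding just described can be sketched in a few lines of Python. This is an illustrative stand-in, not the authors' Mathematica implementation; the particular coefficient range and helper names are our own assumptions.

```python
def log_grid(lo, hi, n):
    """n logarithmically spaced values between lo and hi (n a power of 2)."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

def encode_param(index, bits):
    """Binary label for the index of a discretised parameter value."""
    return format(index, f"0{bits}b")

def encode_state(indices, bits):
    """Concatenate the per-parameter binary labels into one genotype string."""
    return "".join(encode_param(i, bits) for i in indices)

# e.g. six coefficients, 64 grid values each -> 6 bits per parameter,
# giving a 36-bit genotype; the numerical range below is purely illustrative
grid = log_grid(1e-14, 1e-8, 64)
genotype = encode_state([0, 63, 10, 5, 20, 33], 6)
```

Because 64 = 2^6, each discretised parameter occupies exactly 6 bits, and all states in the same functional class map to strings of equal length.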

B. The genetic algorithm
The starting point of a GA is the creation of an initial random population with a certain number, N_pop, of individuals. This is done by generating N_pop binary strings of the appropriate length, referred to as genotypes, each representing an inflationary potential, referred to as the phenotype, in the class of functions under consideration. This population is then evolved using the three main ingredients specific to GAs: selection, breeding and mutation. The optimal size of the population is correlated with the length of the binary string, and in our case N_pop will be of order a few hundred individuals.
Selection proceeds by ranking the individuals according to their fitness value. In our case the fitness value will be a measure of how compatible a given model of inflation is with the current cosmological observables; its detailed form is discussed below. Subsequently, individuals are selected for breeding with a probability that depends linearly on their ranking by fitness, such that the probability of the k-th individual being selected for breeding is

P_k ∝ 1 + (α − 1) (N_pop − k) / (N_pop − 1) .

In particular, the probability P_1 for the top individual is equal to a multiple α of the probability P_{N_pop} of the least fit individual. Typically, α is chosen in the range 2 ≤ α ≤ 5. While the fittest individuals have a higher chance to produce offspring in the next generation, the scheme also allows less fit individuals to breed. This ensures a variety of "genes" is available throughout the evolution. Breeding is implemented as an M-point cross-over, by which the two binary sequences are cut at the same M random points and the cut sections are swapped in an alternating fashion. In our case a single-point cross-over performs well enough, and we will adopt this simple choice. Finally, once a new generation is formed, a small fraction of the binary digits, selected at random, are flipped to ensure that the population does not stagnate. A typical value for the mutation rate is one percent.
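One generation of the selection-breeding-mutation cycle described above might look as follows in Python. The linear ranking weights, helper names and default settings here are our own illustrative choices, not the authors' code.

```python
import random

def selection_probs(n_pop, alpha):
    # Linear ranking: individual k = 1 (the fittest) is alpha times as
    # likely to be picked for breeding as individual k = n_pop (least fit).
    w = [1 + (alpha - 1) * (n_pop - k) / (n_pop - 1)
         for k in range(1, n_pop + 1)]
    total = sum(w)
    return [x / total for x in w]

def crossover(a, b):
    # single-point cross-over of two equal-length binary strings
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(g, rate=0.01):
    # flip each bit independently with probability `rate`
    return "".join(c if random.random() > rate else "10"[int(c)] for c in g)

def next_generation(ranked, alpha=3, rate=0.01):
    """One evolutionary step; `ranked` is the population sorted by
    decreasing fitness."""
    probs = selection_probs(len(ranked), alpha)
    children = []
    while len(children) < len(ranked):
        p1, p2 = random.choices(ranked, weights=probs, k=2)
        c1, c2 = crossover(p1, p2)
        children += [mutate(c1, rate), mutate(c2, rate)]
    return children[:len(ranked)]
```

Note that the genotype is treated as one undivided string, in line with the discussion of cross-over below.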
The population is evolved for multiple generations, typically a few hundred. If successful, the algorithm produces a final population in which a large fraction of the individuals correspond to viable solutions. The search is then restarted with a new random population. As we will see, this form of genetic algorithm is capable of quickly identifying many high-quality solutions within each evolutionary cycle, even though the fraction of viable solutions is typically very small compared to the size of the state space. This remarkable property of GAs leads to a significant reduction of computing time, as compared to systematic scanning procedures.
In the present context it is worth mentioning a possible but inadvisable modification, because the reason for its failure is instructive. When dealing with a set of discretised parameters, it is tempting to divide the genotype into sections ("chromosomes", perhaps), with one section corresponding to each parameter, and then to implement breeding as a one-point cross-over between each pair of corresponding chromosomes of the two individuals. However, this is exactly the wrong thing to do in a GA. When the parameters are arranged into one long string with only a single cross-over, breeding individuals swap several parameters wholesale, while mixing only the single parameter that contains the cross-over point. Such swapping of large chunks of genotype is an important aspect of GAs. If one attempts to split the genome into sections that represent parameters, one is in a sense allowing the phenotype to "constrict" the genotype, and this greatly diminishes the efficacy of the GA.

C. The fitness value for inflationary models
There are several ways in which a potential can fail to be a viable model of inflation. By analysing these failures it is possible to come up with a sensible definition of the fitness function required for the GA implementation. We will adopt the view that the potential V(φ) is an effective field theory description of the dynamics of φ and that its expression can only be trusted in a relatively small range I_φ of inflaton values, which we choose to be the interval [−30, 30] in Planck units. (We will be working in units where the reduced Planck mass 1/√(8πG_N) has been set to one.) The potential must have a minimum within this range. Moreover, as the entire inflationary trajectory must fall within I_φ, we require that the position of the lowest minimum φ_m within I_φ (which from now on will be referred to as the "global" minimum) falls inside a smaller interval I_φm, which we choose as I_φm = [−5, 5]. When generating potentials, we do not require the vacuum value at the global minimum to vanish but simply adjust the additive constant in V (which is not considered part of the state space) such that V(φ_m) = 0. If V has no minima inside I_φm, the fitness value of the model is set to a fixed negative value ("no-vacuum penalty").
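A minimal sketch of this vacuum search is given below. The naive grid scan, the double-well toy potential and the fixed penalty value are illustrative assumptions, not the authors' implementation.

```python
def find_global_minimum(V, phi_lo=-30.0, phi_hi=30.0, steps=60000):
    """Grid-scan V on the trust interval I_phi = [phi_lo, phi_hi] and
    return (phi_m, V(phi_m)) for the lowest interior minimum, or None."""
    h = (phi_hi - phi_lo) / steps
    best = None
    for i in range(1, steps):
        phi = phi_lo + i * h
        v = V(phi)
        if v < V(phi - h) and v < V(phi + h):   # interior local minimum
            if best is None or v < best[1]:
                best = (phi, v)
    return best

# toy double-well potential with minima near phi = +/-2
V = lambda p: 1e-12 * (p * p - 4.0) ** 2
res = find_global_minimum(V)
if res is None:
    fitness = -100.0                     # fixed "no-vacuum penalty"
else:
    phi_m, v_m = res
    V_shifted = lambda p: V(p) - v_m     # enforce V(phi_m) = 0
    in_window = abs(phi_m) <= 5.0        # phi_m must lie in I_phi_m = [-5, 5]
```

In a production run, a proper root-finder on V′ would replace the grid scan, but the logic (find φ_m, shift V, check the I_φm window) is the same.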
Once the global minimum φ m has been found, we identify the closest maxima to the right and to the left of φ m , since the inflaton can in principle roll down from either side. The regions between the global minimum and the nearest maxima are of interest for inflation. If no maximum is found within I φ (to the right or to the left), the region of interest is extended up to the corresponding boundary of I φ . The fitness function is computed for both the left and the right trajectories, and the trajectory with the larger fitness is adopted as the relevant one for the given potential.
To discuss our algorithm further we need to introduce the notion of the slow-roll approximation, which is characterised by the smallness, ε ≪ 1 and |η| ≪ 1, of the slow-roll parameters

ε = (1/2) (V′/V)^2 ,   (II.1)
η = V″/V .   (II.2)

In general, when the slow-roll conditions are satisfied, inflation occurs. We begin our algorithmic exploration of the potential close to the global minimum φ_m, where V is small and hence the slow-roll conditions are violated. Moving away from the global minimum in sufficiently small discrete steps Δφ, we seek the value of φ closest to φ_m where the slow-roll parameters drop below a certain threshold, for concreteness taken to be

ε < 1 ,   |η| < 1 .   (II.3)

The corresponding value φ_e provides the approximate field value which marks the end of inflation. To find the starting point of inflation, we continue to move up the potential in steps Δφ and continue to check the slow-roll conditions. At the same time, in order to avoid the production of new inflationary domains which expand at a faster rate (eternal inflation), we require that the quantum fluctuations of φ during a typical time interval H^-1 are much smaller than the change in the field due to its classical motion, that is

δφ_q ≪ δφ_c .   (II.4)

In practice, we check that δφ_q < 0.5 δφ_c. As long as the conditions in Eqs. (II.3) and (II.4) are satisfied, the standard slow-roll inflation approximations can be trusted. Having obtained the values of the field corresponding to the start and end of inflation, we proceed to the computation of the cosmological observables. First, we compute the total number of e-folds,

N_total = ∫_{φ_e}^{φ_start} (V/V′) dφ .   (II.5)

This number should be large enough to solve the standard cosmological problems of the Big Bang model. More importantly, it should be larger than the number of e-folds separating the end of inflation φ_e from the reference value φ_* at which the cosmological scales of interest (that is, the scales probed by the Planck satellite) leave the horizon.
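The scan for the end of inflation and the e-fold integral can be sketched as follows. This is a minimal Python stand-in assuming a simple uphill grid walk and a trapezoidal integral; the toy quadratic potential, mass scale and step sizes are illustrative.

```python
def slow_roll(V, dV, d2V, phi):
    """Slow-roll parameters epsilon and eta in reduced Planck units."""
    eps = 0.5 * (dV(phi) / V(phi)) ** 2
    eta = d2V(phi) / V(phi)
    return eps, eta

def end_of_inflation(V, dV, d2V, phi_m, dphi=1e-3, phi_max=30.0):
    """Walk uphill from the global minimum and return the first field value
    where both slow-roll parameters drop below 1 (approximates phi_e)."""
    phi = phi_m + dphi
    while phi < phi_max:
        eps, eta = slow_roll(V, dV, d2V, phi)
        if eps < 1.0 and abs(eta) < 1.0:
            return phi
        phi += dphi
    return None

def n_efolds(V, dV, phi_e, phi_s, steps=10000):
    """Number of e-folds, N = integral of V/V' dphi (trapezoid rule)."""
    f = lambda p: V(p) / dV(p)
    h = (phi_s - phi_e) / steps
    s = 0.5 * (f(phi_e) + f(phi_s))
    for i in range(1, steps):
        s += f(phi_e + i * h)
    return s * h

# toy quadratic potential V = m^2 phi^2 / 2 (mass scale illustrative);
# here eps = eta = 2/phi^2, so slow roll ends at phi_e = sqrt(2)
m2 = 1e-10
V, dV, d2V = (lambda p: 0.5 * m2 * p * p), (lambda p: m2 * p), (lambda p: m2)
phi_e = end_of_inflation(V, dV, d2V, 0.0)
```

For the quadratic toy model the integral is analytic, N = (φ_s² − φ_e²)/4, which provides a useful cross-check of the numerics.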
The useful formula to compute this number of e-folds, N_*, is given in Ref. [37] (assuming the reheating scale is not too far away from ∼ 10^10 GeV) and can be evaluated as soon as the field value φ_e at the end of inflation is known. On the other hand,

N_* = ∫_{φ_e}^{φ_*} (V/V′) dφ ,

from which the value of the field φ_* at the pivot scale can be obtained. For an acceptable model of inflation we require N_total ≥ N_*. In order to penalise a model which violates this condition, the fitness function receives a negative contribution proportional to θ(N_* − N_total), where θ(x) is the Heaviside function. Having obtained the value of the field φ_* at the pivot scale, we can compute the cosmological observables

n_{s,*} = 1 − 6 ε_* + 2 η_* ,   r_* = 16 ε_* .

To enforce the correct spectral index, the fitness function includes a term constructed from

f̃_ns(s) = − log_10 ( 1 + |n_{s,*} − n_s| / Δn_s ) ,   (II.11)

and the measured value is n_s ± Δn_s = 0.9649 ± 0.0042.
Here, f_term is a fixed value (see below) such that models with f̃_ns(s) > f_term attract a vanishing fitness contribution and are considered to have a viable spectral index.
If the tensor-to-scalar ratio r_* is greater than the maximal value r_max = 0.061 consistent with the 2018 Planck release, the fitness receives a further negative penalty. Finally, we compute m_* = (V_*/ε_*)^{1/4} and, to ensure that the model matches the energy scale of inflation, m ≈ 0.027 in Planck units, we add a corresponding term to the fitness function, where we have set Δm = 0.1 m to account for theoretical uncertainties.
The fitness of a state s is then given by the sum of the above contributions. This fitness function is implemented in the GA Mathematica package Genetic, which was developed by the authors and is available for download [38]. The results discussed in the following section have been obtained within this setup.
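Putting the pieces together, a schematic fitness function might look like the sketch below. Only the logarithmic spectral-index term follows Eq. (II.11); the other penalty shapes and the constants `f_term`, `weight` are illustrative placeholders rather than the exact terms used in the paper.

```python
import math

def f_ns_tilde(ns_star, ns=0.9649, dns=0.0042):
    # Eq. (II.11): logarithmic penalty for a wrong spectral index
    return -math.log10(1.0 + abs(ns_star - ns) / dns)

def fitness(n_total, n_star, ns_star, r_star, m_star,
            f_term=-0.3, r_max=0.061, m0=0.027, weight=10.0):
    """Sum of schematic fitness contributions: e-folds, spectral index,
    tensor-to-scalar ratio, and energy scale of inflation."""
    f = 0.0
    if n_total < n_star:                      # not enough e-folds
        f -= weight * (n_star - n_total)
    fn = f_ns_tilde(ns_star)
    if fn < f_term:                           # spectral index outside window
        f += fn
    if r_star > r_max:                        # violates the Planck bound on r
        f -= weight * (r_star - r_max)
    f -= abs(m_star - m0) / (0.1 * m0)        # energy scale, with dm = 0.1 m
    return f
```

A model matching all observables receives the maximal fitness of zero, so the GA drives populations towards zero from below.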

A. Polynomial potentials
One of the simplest classes of functions that can be considered are polynomials. These can arise in inflationary models in their own right or as truncated Taylor expansions around the global minimum of more general perturbative potentials. For concreteness, we restrict to polynomials of degree six,

V(φ) = c_0 + c_1 φ + c_2 φ^2 + c_3 φ^3 + c_4 φ^4 + c_5 φ^5 + c_6 φ^6 .   (III.1)

The constant term c_0 is fixed by requiring that V(φ) vanishes at the global minimum, as discussed in Section II C. The GA search is then set up to identify optimal values for the remaining six constants c_1, c_2, ..., c_6, each of which runs over a prescribed range discretised into 64 logarithmically spaced values, giving a binary string of length 36 for each state. Hence, the state space consists of a total of 2^36 ≈ 7 · 10^10 models. The GA is run for a population of N_pop = 100 individuals, a mutation rate of 1% and a coefficient α = 3. A typical evolution over 50 generations leads to O(10) different viable models. This is extremely efficient, given that only about 5000 models are visited in the course of such an evolution and that, by comparison, a typical random selection of 5000 models would contain no viable potentials at all. Figure 1 shows the evolution of the fraction of terminal models in the course of a typical run, which takes about 200 s to complete on a standard laptop.
To obtain a more comprehensive dataset of viable polynomial potentials of degree six, we have performed 70,000 runs, each starting from a random initial population. The total number of visited states is ∼ 10^8, which amounts to ∼ 1% of the search space. The search produced 309,097 different viable models. Figure 2 shows a plot of the number of distinct viable models (that is, the number of viable models after eliminating repetitions from previous runs) as a function of the number of visited states. The curve indicates a progression towards saturation, suggesting that a sizeable fraction of the viable models have been found, although even for this simple model it was not possible to reach complete saturation.
This particular model appears with an effective seven-fold degeneracy within our search, corresponding to a cluster of "nearby" models which only differ by tiny changes in the parameters. In principle the search can be refined to remove such effective degeneracies in parameter space. In terms of the GA, a modification to remove such degeneracy could also be implemented directly using a crowding penalty (where the crowding in question is in the phenotype). Unfortunately such a modification would be very costly in search time: for each new solution one would have to compute its distance to all of the O(10^5) previous solutions. Therefore we present the raw counts, with the understanding that the number of non-degenerate solutions may be an order of magnitude fewer than the raw number. In addition, we do not divide out solutions that are related by a displacement in φ. A potentially much more efficient method for separating the results into independent solutions would be post-processing, first by separating them into bins with different numbers of e-folds and r_*, and then using a clustering algorithm on the bins individually. We will not attempt this in the present study.
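The binning-plus-clustering post-processing suggested above could be sketched as follows. The bin widths, the max-norm tolerance and the greedy leader clustering are our own illustrative choices.

```python
import math

def bin_key(n_efolds, r_star, n_width=5.0):
    # coarse bin: an e-fold band together with the decade of r_*
    return (int(n_efolds // n_width), math.floor(math.log10(r_star)))

def greedy_clusters(points, tol):
    """Leader clustering: a point joins the first representative within
    `tol` (max-norm over parameters), otherwise it starts a new cluster."""
    reps = []
    for p in points:
        if not any(max(abs(a - b) for a, b in zip(p, r)) < tol for r in reps):
            reps.append(p)
    return reps

def count_independent(models, tol=0.05):
    """models: iterable of (n_efolds, r_star, parameter tuple); returns an
    estimate of the number of non-degenerate solutions."""
    bins = {}
    for n, r, params in models:
        bins.setdefault(bin_key(n, r), []).append(params)
    return sum(len(greedy_clusters(ps, tol)) for ps in bins.values())
```

Because clustering is run only within each (e-fold, r_*) bin, the quadratic cost of the distance computations applies to small bins rather than the full O(10^5) solution set.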

B. Statistics on the class of polynomial potentials
The large number of viable inflationary potentials found during the GA search (∼ 3 · 10^5) allows us to generate some meaningful statistics. We can first look at the field range over which the universe expands by N_* e-folds before the end of inflation. The histogram for the minimum field excursion |φ_* − φ_m| during inflation is displayed in Figure 5 and shows two peaks, one corresponding to small-field inflation (an O(1) field excursion in Planck units) and another corresponding to large-field inflation (an O(10) field excursion). It is interesting to note that only a very small fraction of the models exhibit mid-range field excursions or excursions larger than 15 Planck units. The distribution for the spectral index n_s, shown in Figure 6, is approximately uniform, indicating no particular preference shown by n_s within the allowed range. On the other hand, the distribution for the tensor-to-scalar ratio r_*, shown in Figure 7, indicates a strong preference towards models with a small value r_* < 0.002. Zooming in on the r-histogram (see Figure 8), we note that the most favoured value of r_* lies in the range 0.0001 ≤ r_* ≤ 0.0004, which can be regarded as a prediction for the value of the tensor-to-scalar ratio and a hint as to why primordial gravitational waves have not yet been detected.

C. Polynomial + Cosine
Potentials involving a cosine function appear in what is known as 'natural inflation', in which the inflaton field is a pseudo-Nambu-Goldstone boson appearing after the breaking of a shift symmetry (see for example Ref. [39] for a review of the main types of inflation proposed over the years and their theoretical justification). Here we use a generalisation of this type of potential, of the form

V(φ) = c_0 + c_1 φ + c_2 φ^2 + c_3 φ^3 + c_4 φ^4 + c_5 cos(c_6 φ) .   (III.5)

As in the previous case, the constant term c_0 is fixed by requiring that V(φ) vanishes at the global minimum. The intervals for c_1, c_2, c_3, c_4 and c_5 are again divided into 64 values, equally spaced logarithmically. For c_6, which enters the argument of the cosine, the interval is divided into 64 equally spaced values. As before, an inflaton potential is therefore described by a binary list of length 36, and the state space consists of 2^36 ≈ 7 · 10^10 models.
With the same GA settings as above, the computational time for a single run is comparable to that of the sextic polynomial case. A typical run results in O(10) different viable models. A typical example in this class is shown in Figure 9 and corresponds to the potential V(φ) = 2.80939 · 10^-11 + 5.92241 · 10^-14 φ + …   (III.7)

D. Polynomial + Exp × polynomial
Finally, we consider inflationary potentials motivated by Kähler moduli inflation in string theory. The functional form considered, given in Eq. (III.8), is of the polynomial + e^{cφ} × polynomial type introduced above. As before, the constant term c_0 is fixed by requiring that V(φ) vanishes at the global minimum. The search ranges for the remaining ten parameters are chosen as follows: for all the coefficients except c_5 the intervals are divided into 64 values, equally spaced logarithmically, while 64 equally spaced values are used for c_5. Hence, each state is described by a binary string of length 60, and the total size of the state space is 2^60 ≈ 10^18. This increase in the size of the search space, compared to the previous cases, does not pose a problem for the GA. With the same GA settings as before, N_pop = 100 individuals, a mutation rate of 1% and a coefficient α = 3, a typical evolution over 50 generations leads to O(1000) different viable models and completes within a few minutes on a standard laptop. Compared to the pure sextic polynomial case, viable potentials are easier to find in this setting. A modest number of 1000 runs leads to over a million distinct viable potentials.

A. Reinforcement learning
It is possible to interpret the space {s} of inflationary models as an environment for the purpose of reinforcement learning. To this end we consider actions a : s → s′, which map a state s to a new state s′ by changing the value of one of the coefficients in s to the next smaller or larger one while leaving all other coefficients unchanged. The reward R(s, a) of such an action can, for example, be defined as

R(s, a) = f(s′) − f(s) if f(s′) > f(s) , and R(s, a) = R_offset otherwise,   (IV.1)

where f is the fitness function for states defined previously and R_offset is a fixed (negative) value which penalises a decrease in fitness. This environment can then be coupled to one of the standard policy-based RL algorithms, such as REINFORCE or actor-critic (for a review see Refs. [13,40]).
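A minimal sketch of a single environment step, with a reward rule of the kind described above, is shown below. The function names, the 64-value grid bound and the toy fitness in the usage note are our own illustrative assumptions.

```python
def step(state, action, fitness, r_offset=-1.0, n_values=64):
    """One environment step: `state` is a tuple of discretised coefficient
    indices, `action` = (which coefficient, +1 or -1). The reward is the
    fitness gain when positive, otherwise a fixed negative offset that
    penalises any decrease in fitness."""
    i, d = action
    new = list(state)
    new[i] = max(0, min(n_values - 1, new[i] + d))   # stay on the grid
    new = tuple(new)
    gain = fitness(new) - fitness(state)
    reward = gain if gain > 0 else r_offset
    return new, reward
```

An episode then consists of repeatedly choosing actions with the policy network and calling `step`, with the return fed to REINFORCE or an actor-critic update.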
Unfortunately, we find it difficult to train a neural network for such an inflationary environment, using either algorithm. After careful adjustment of the fitness function we have managed to train successfully for a two-dimensional toy environment, consisting of potentials with just a quadratic and quartic term. We have not been able to achieve a successful training on any of the larger environments which were so efficiently handled by GAs. Intuitively, we attribute this failure to the "choppiness" of the fitness function which can arise, for example, due to new global minima developing elsewhere in the potential, even for small parameter changes. Such discontinuities, clearly easily handled by GAs, can lead to unintended incentives for RL. The successful run for the two-dimensional toy environment suggests it is still conceivable that RL could be made to work on larger inflationary environments (see also the approach of Ref. [41] where gradient ascent in the number of e-folds was used to optimise the coefficients). This would probably require careful adjustment of the fitness function and systematic hyper-parameter optimisation but, given the efficiency of GAs, we have not attempted this.

B. Genetic programming
So far we have considered potentials of certain fixed functional forms, focusing on the optimisation of the numerical constants that can generate an inflationary potential. However in practice one may not know the functional form of the potential, but only the kinds of terms that can enter, for example polynomials for perturbative contributions, trigonometric functions for axionic potentials, exponentials for nonperturbative processes and so forth. The way in which these components appear may vary depending on the physical set-up. Thus it is interesting to consider a more generic approach that can simultaneously optimise for both the functional form and the numerical factors.
For this purpose the paradigm of genetic programming (GP) is the most promising setting. GP operates with trees of symbolic expressions and numerical factors.
For example, the tree form of the simple cosine potential V(φ) = 2 + cos(φ/3) is shown in Figure 12, and the tree for the inflationary cosine potential in Eq. (III.7) is shown in Figure 13. GP is an evolutionary algorithm that operates on populations of such trees, and it shares with GAs the essential features of selection, cross-over and mutation [42][43][44][45]. It has been used successfully for many different tasks, including symbolic regression (for a recent review that includes benchmark problems see Ref. [46]). Ideally, one would produce a successful tree such as that in Figure 13 as follows. Exactly as in a GA, one first generates a random population, but in this case a population of trees, with the vertices selected from a set of allowed operands, for example { Times, Plus, Power, Cos }, and the leaves being either constants or the variable φ. Every tree can be ranked and selected for breeding as described already for the GA. The process of breeding, however, follows various well-defined rules, with cross-over implemented, for example, as a swap of subtrees (see Ref. [45] for a review). These can take many more forms than a simple one-point exchange of subtrees (so-called homologous cross-over), with subtrees possibly being moved to entirely different positions in the tree. After breeding, there is traditionally also a mutation stage. The effect of mutation is less clear-cut in GP, and indeed it was even omitted from Koza's original work [43]; typically, however, it is implemented and can again take many different forms. For instance, mutation can affect individual leaves or entire subtrees (that is, a subtree could be replaced by a random subtree of comparable complexity). With these rules in place, the evolutionary process adheres to the same flow-chart as for the GA.
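To make the tree operations concrete, here is a minimal Python sketch of expression trees built from the { Times, Plus, Power, Cos } operands, with evaluation and a subtree cross-over. The representation and helper names are our own illustrative choices.

```python
import math, random

# an expression tree is the symbol "phi", a number, or a tuple
# (operand, child, ...) with the operand in { Times, Plus, Power, Cos }
def evaluate(tree, phi):
    if tree == "phi":
        return phi
    if isinstance(tree, (int, float)):
        return tree
    op, *kids = tree
    v = [evaluate(k, phi) for k in kids]
    if op == "Plus":
        return v[0] + v[1]
    if op == "Times":
        return v[0] * v[1]
    if op == "Power":
        return v[0] ** v[1]
    if op == "Cos":
        return math.cos(v[0])
    raise ValueError(f"unknown operand {op}")

def paths(tree, p=()):
    """Enumerate the positions of all subtrees."""
    yield p
    if isinstance(tree, tuple):
        for i, k in enumerate(tree[1:], 1):
            yield from paths(k, p + (i,))

def get(tree, p):
    for i in p:
        tree = tree[i]
    return tree

def replace(tree, p, new):
    if not p:
        return new
    i = p[0]
    return tree[:i] + (replace(tree[i], p[1:], new),) + tree[i + 1:]

def subtree_crossover(a, b, rng):
    """Swap one randomly chosen subtree between two parent trees."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

# V(phi) = 2 + cos(phi/3) as a tree (cf. Figure 12):
toy = ("Plus", 2, ("Cos", ("Times", "phi", 1 / 3)))
```

Mutation can be implemented with the same machinery, by calling `replace` at a random path with a random leaf or a random subtree of comparable depth.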
Considering the case at hand, our experience with GAs indicates that it is crucial to first be clear about the question one is asking in this expanded framework. In particular, if the functional form is incorporated into the search, the search space becomes virtually infinite. As we have seen, a suitable choice of coefficients can already yield an inflationary potential for many different functional forms, so without specifying very carefully within the GP what we actually wish to achieve, the form of most potentials will be very inelegant. The potential problems can be seen in the more controlled example of regressing a dataset whose answer is already known: for example, the symbolic regression of data produced by our toy function V(φ) = 2 + cos(φ/3) (more precisely, a set of mock data produced from, say, 100 random choices of φ).
We can ask the GP to fit these data by, for example, defining the fitness function to be a monotonically decreasing function of the quadratic loss. However, if we allow coefficients to be defined as we have been doing, with 64 logarithmically spaced possible values, we quickly find that the algorithm runs into a severe version of the "bloating" phenomenon, in which expressions grow to ever larger complexity. This suggests that, when it comes to finding inflationary potentials, the precise question we should be asking is: given a particular set of operators that we know can be motivated by underlying physics, for example the aforementioned { Times, Plus, Power, Cos } operands, what is the simplest form of potential that can produce inflation?
In principle imposing simplicity on our expressions can reduce bloat. One can for example insert a fitness penalty that increases with the tree-depth. This is typically done in symbolic regression in order to avoid the GP equivalent of over-fitting. In the case of inflationary model building this trade-off would ideally render potentials of relatively low complexity, in line with the general preference towards finding simple underlying theoretical justifications, which can in addition correctly recover the allowed values for the cosmological observables.
Unfortunately, the results for fitting the toy potential from its data are not encouraging. It is not possible to reduce bloat simply by adding a complexity penalty if one wishes to allow almost free-floating constants within the GP. This is a well-known issue when several constants must be determined, because the functional form of the GP converges faster than the constants (see for example Ref. [47] for a related discussion). In symbolic regression this problem can be circumvented by optimising the constants independently, for example by nonlinear regression. In the present example, one would in practice leave the numerical coefficients as undefined parameters in the tree, with the corresponding leaves filled with 'placeholders'. To evaluate the fitness of any particular tree in the population, one then determines the set of coefficients that separately maximises the fitness function, in a similar fashion to that described in Refs. [48][49][50]. Indeed, we find that this method works well in our toy symbolic regression problem.
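The placeholder scheme can be sketched as follows, with a brute-force grid search standing in for the independent nonlinear-regression step (the tree encoding, helper names, and the restriction to two placeholders are our own illustrative assumptions):

```python
import math

def eval_tree(node, phi, consts):
    """Evaluate an expression tree; leaves 'c0', 'c1', ... are constant
    placeholders filled from the tuple consts."""
    if node == 'phi':
        return phi
    if isinstance(node, str) and node.startswith('c'):
        return consts[int(node[1:])]
    if isinstance(node, (int, float)):
        return float(node)
    op, *args = node
    vals = [eval_tree(a, phi, consts) for a in args]
    if op == 'Plus':
        return sum(vals)
    if op == 'Times':
        out = 1.0
        for v in vals:
            out *= v
        return out
    if op == 'Cos':
        return math.cos(vals[0])
    raise ValueError(op)

def fit_constants(tree, data, grid):
    """Brute-force stand-in for the nonlinear-regression step: pick the
    constant assignment that minimises the squared loss. Assumes exactly
    two placeholders c0, c1 for brevity."""
    best = (float('inf'), None)
    for c0 in grid:
        for c1 in grid:
            loss = sum((eval_tree(tree, phi, (c0, c1)) - v) ** 2
                       for phi, v in data)
            if loss < best[0]:
                best = (loss, (c0, c1))
    return best
```

For the toy target, the tree ('Plus', 'c0', ('Cos', ('Times', 'c1', 'phi'))) recovers c0 = 2 and c1 = 1/3 whenever those values lie on the grid.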
The present problem is, as we have stressed, more general than symbolic regression, but the examples above suggest ways forward. A GP searching for optimal functions should be organised in a nested fashion, similar to a technique that has been called speciation in the literature [51][52][53], and the related technique of niching. Applied to the inflationary potential problem, the idea is as follows. The evolutionary algorithm is arranged in two stages. The first is the GP, organised as described above, with any constants in the tree represented by the aforementioned placeholders. Then, in order to calculate the fitness of every tree in a generation, one runs a GA on each individual function in the population (which effectively defines a sub-population within a niche) to optimise its fitness, following the procedure described in the main body of this paper. This simultaneously determines the values of all the constants for each function.
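The two-stage structure can be sketched as follows, with a deliberately simple truncation-selection evolutionary loop standing in for the full GA of the main text (all parameter values, bounds, and names are illustrative assumptions):

```python
import random

def evolve_constants(loss, n_consts, pop_size=20, generations=40,
                     bounds=(0.01, 10.0), seed=0):
    """Inner GA for one fixed tree: evolve a population of constant
    vectors and return (best_loss, best_consts). Truncation selection
    plus Gaussian mutation; a toy stand-in for the full GA."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_consts)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted((loss(c), c) for c in pop)
        parents = [c for _, c in scored[:pop_size // 2]]  # keep best half
        pop = parents + [
            [min(hi, max(lo, g + rng.gauss(0.0, 0.3)))
             for g in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
    return min((loss(c), c) for c in pop)

def speciated_fitness(population_of_trees, loss_for_tree, n_consts):
    """Outer GP step: each tree is a niche; run the inner GA per tree to
    assign it a loss before GP selection and crossover proceed."""
    return [evolve_constants(loss_for_tree(t), n_consts)[0]
            for t in population_of_trees]
```

Within each GP generation, speciated_fitness supplies the per-tree score; the GP then selects and crosses trees as usual, with the constants re-optimised from scratch in every niche.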
The above procedure is guaranteed to work (modulo the need to choose good ranges for the constants), because we know each half of the procedure works separately; in that sense it constitutes a "no-lose theorem". However, it could clearly become computationally expensive: if the GP part of the problem requires a population of N_GP niches, one might expect a cost at least a factor of N_GP larger than for the GA alone. The penalty could possibly be reduced by keeping the GP "niched" but with each niche containing fewer members than the individual GA, the rationale being that niching merely needs to speed up the convergence of the constants enough to overcome the tendency to bloat. We shall return to these more ambitious approaches in a future publication.

V. CONCLUSIONS
This note has expanded the range of computational tools available for inflationary model building to include genetic algorithms. We have shown that GAs can be used efficiently to identify models of inflation consistent with the required number of e-folds, the current bounds on the spectral index of scalar perturbations, the tensor-to-scalar ratio, and the scale of inflation. Using moderate computational resources we have been able to construct millions of viable models, and many more can be obtained in this way.
In particular, we have focused on three functional classes of potentials: (1) sextic polynomials, (2) quartic polynomials + cosine, and (3) quartic + exp × quartic. In the case of pure polynomial potentials we attempted a semi-comprehensive search, which led to the interesting observation that low-r models are favoured.
Reinforcement learning, which in other contexts has been shown to perform comparably well [33][34][35], was difficult to use for the present problem. On the other hand, methods involving genetic programming and symbolic regression seem promising approaches to explore. They may lead to viable potentials with new functional forms and could provide fresh inspiration for model building.