FAQ SAGA

FAQ SAGA

My doctoral research was based around what I called SAGA for Species Adaptation Genetic Algorithms. Currently the best introductory paper is here

Origins
The initial motivation was working out how to do incremental artificial evolution. It seemed (still seems) clear that artificial evolution of seriously complex technical devices will be a longterm business, where one simply cannot afford to start from scratch -- from 'origin of life', or a random population -- for each new development or new project. Therefore one will have to evolve incrementally, taking a population evolved for task N and further evolving it for task N+1. This lead to the realisation that one would be always operating with a genetically converged population, or 'species'. And this had consequences for how a GA might work.

However I soon realised that such genetically converged populations were not restricted to incremental evolution, but were in practice universal across all GAs, including standard non-incremental scenarios starting from scratch -- the genetic convergence typically happens within the first 10 generations. This important fact is still almost universally unrecognised in the GA community; partly because the term 'convergence' has a second different meaning (i.e. convergence of fitness to a final value), with this second meaning convergence only comes much later -- and there has been confusion between the two senses of the word. Genetic convergence is different from, typically happens at a different time from, fitness convergence. So the lessons of SAGA are in fact universal.

Implications of SAGA
All GAs have genetically converged populations, after a very short intial transient. The degree of genetic convergence depends primarily on the selection-mutation balance: selective pressure converges, mutation diverges. Recombination and other factors, such as eg assortative mating, will modify this balance without altering the essential picture.

There is an optimal balance that maximises evolutionary search; if the selection pressure is fixed -- linear rank selection being the easiest way -- then one can adjust the mutation rate to get somewhere near the balance (it is not too crucial, being out by a factor of 2 or 3 makes little difference, being out by a factor of 10 or 100 or -- I see it often! -- a factor of a million is bad news). The balance is related to the error catastrophe (Eigen), and allows the population to search along neutral ridges in the fitness landscape -- nowadays called Neutral Networks.

Rule of Thumb
If you are using binary genotypes, and you know or can estimate that a proportion x (<=100%) of the genotype is non-redundant, the rest is (potentially useful in the medium term) junk; and if your selection pressure is at a standard level, equivalent to tournament selection with tournaments of size 2; then the optimum mutation rate is around 1/x mutations per whole genotype (i.e. 1 mutation in the non-junk part). Eg for genotypes of length 100, estimating about 67 loci non-junk and 33 junk, x=67%=0.67, rule of thumb proposes mutation rate of 1.5 per genotype (i.e. mutation rate of 0.015 per locus), implying ~ 1 mutation in the non-junk part. Note carefully the difference between per-genotype and per-locus mutation rates and do not get these confused!

Coincidentally -- or not -- this is not far off what happens in the natural world. Humans have of the order of 1 mutation in the non-junk part of their genotype.

That's it, that's all there really is to SAGA. Use your own GA, apply this rule of thumb. Notice I specified binary genotypes, if you have real values at each locus (eg in computational terms doubles or floats) the analysis is very different.

Is there a SAGA software package?
Good heavens no!!! Use your own GA, apply this rule of thumb. I always use my Microbial GA (discussed in here), because it is simple, effective, maintains an appropriate selective pressure, and I write it in 5 lines of code. If I want to show off, I can write it in one line of code.

Further SAGA Research?
SAGA is heavily based on Neutral Network ideas -- although that term had not been invented at the time. So now I generally call it Neutral Network research rather than SAGA research.

Return to Inman Harvey FAQs
Return to Inman Harvey Home Page