Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

License

NotificationsYou must be signed in to change notification settings

nl253/GeneticAlgo

Repository files navigation

  • use when search space is too large to use brute-force search
    • e.g. solving equations, automating the process of design and solvingcombinatorial problems (timetable scheduling)
    • manyproblems can be reformulated as exploring an n-dimensional search space
  • adaptive parameters that change linearly as time approaches end and in response to quality of candidate solution
  • elitism (preserves top candidates)
  • allows for profiling and debugging (seeEventEmitter API)
  • efficient (built on typed arrays)

For analternative heuristic search that may work better when yourproblem uses continuous (real) values see myparticle swarm optimization algorithmthat follows a similar API.

Installation

npm install genetic-algo

NPM link.

API

Example:

const{GeneticAlgorithm:GA}=require('genetic-algo')// silly fitness function, maximises values of all genes (see below for a better example)constfitnessFunction=arr=>arr.reduce((x,y)=>x+y,0)// you may also supply an object with options  see below DEFAULT OPTS)constga=newGA(fitnessFunction,1000/* nGenes */,'u32'/* dtype */)// Array<TypedArray>constbestCandidates=Array.from(ga.search()/* generator */)

In a nutshell:

  1. SpecifynGenes (Int > 0, seeNGENES section below)
  2. Declare a fitness function that accepts a candidate (typed array) andreturns a number. Each candidate is of lengthnGenes. The candidatesthat score the highest will be favoured in theselection and will make it to the next gene pool. (seeFITNESS FUNCTION section below)For multiple objectives (e.g. you want the solution to have more than one property: price, time, quality ...) supply an array of fitness functions.
  3. Choosedtype, one of:"f64" | "f32" | "i32" | "i16" | "i8" | "u32" | "u16" | "u8" (seeDTYPE section below)
  4. [EXTRA] You might want adecode function as well (seeDECODE FUNCTION section below).

Fitness Funct

Signature

function(TypedArray): Number

The number it returns may be positive or negative. It may be an integeror a real number.

Example

The previous example maximised the value of every gene. This examplecomputes the negative of the distance from roots of an equation:

// find root of exprconstexpr=(x1,x2,x3,x4,x5,x6)=>(Math.log2(x1)*x2**x3/x4)+x5**(Math.log2(x6))constfitness=xs=>{constval=-(Math.abs(expr(...xs)))if(Object.is(NaN,val)||Object.is(Infinity,val)){return-Infinity}else{returnval}}

Fittest candidates score 0 (distance from the root is 0 meaning root hasbeen found), least fit candidates have a negative value.

Output fromthis example which uses this fitness function:

log2(98)*0^61/209+0^log2(76)=0log2(39)*0^228/209+0^log2(160)=0log2(100)*0^89/202+0^log2(151)=0log2(124)*0^163/247+0^log2(76)=0log2(31)*0^166/9+0^log2(166)=0log2(221)*0^100/132+0^log2(130)=0log2(2)*0^157/211+0^log2(150)=0log2(2)*0^100/132+0^log2(130)=0........................

It's crucial that you reward candidate solutions forapproximating i.e.getting close to the solution. If they are a bit right -- add some fitness.

Multiple Objectives

For multiple objectives (e.g. you want the solution to have more than one property: price, time, quality ...) supply an array of fitness functions:

constfitness=[(cand)=>getSpeed(cand),(cand)=>getPrice(cand),(cand)=>getDurability(cand),(cand)=>getUserFriendliness(cand),]

The functions may return numbers in any scale. E.g.getSpeed can return a number in [0, 100],getPrice can return a number in [100, 1000000] etc.You might want to provide weights for each objective (by default each is 1.0 i.e. equally important):

constopts={weights:[0.2,0.4,0.1,0.01]// default [1.0, 1.0, 1.0, 1.0]}

[OPTIONAL] Decode Function

It sometimes makes sense to have adecode(cand) function.

functiondecode(cand){return{price:cand[0],category:Math.floor(cand[1]),area:Math.floor(cand[2]),// etc.}}

And then it'smuch easier in the fitness function:

functionfitnessFunction(cand){const{ price, category, area, ...}=decode(cand)letfitnessScore=0fitnessScore+=1000-pricefitnessScore+=getQualOfCat(category)fitnessScore-=getCostOfArea(area)// other vars ...returnfitnessScore}

More exampleshere.

NGenes

This is how many numbers each array will have. Each gene (number)corresponds to a dimension in the search space you are exploring. For example:

#meaningdomain
gene #1time00:00 - 24:00
gene #2day0 - 365
gene #3room number1 - 128
.........
gene #1000buildingA - Z

For combinatorial problems, it makes sense to store an array of choicesand let genes be indices.

constdeparatment=["biology","mathematics","electical-engineering",  ...]constroom=["k21","k12","w4",  ...]// etc.

then each candidate can be a Uint array[depIdx, roomIdx, ...].

A different approach you could take is devote 2 genes toroom and letthe first be the ASCII code of the room (a..z) and the second roomnumber (1..100 or something).

Dtype

You need to setdtype yourself depending on the problem domain.

data typetyped arraymin valuemax value
"u8"UInt8Array028
"u16"UInt16Array0216
"u32"UInt32Array0232
"i8"Int8Array-28 - 128 - 1
"i16"Int16Array-216 - 1216 - 1
"i32"Int32Array-232 - 1232 - 1
"f32"Float32Array (32-bit IEEE float)1.2 * 10-383.4 * 1038
"f64"Float64Array (64-bit IEEE float)5.0 * 10-3241.8 * 10308

You benefita lot from restricting the search space by choosing e.g.i8 as opposed toi16.

[OPTIONAL] Defaultopts

In addition to required parameters (fitnessFunction,nGenes,dtype), you can also supply an object with configuration. I encourageto begin with defaults and then tweak if necessary. Here are thedefaults:

const{ Duration, PopSize, NElite, NRounds, LogLvl}=require('genetic-algo');constopts={logLvl:LogLvl.SILENT,// stop condition//// if you find that the algorithm gets stuck too quickly, increase ittimeOutMS:Duration.seconds(30),// stop conditionnRounds:NRounds.LARGE,/* 1000000 */// how many candidate solutions to keep track of//// it makes sense for it to be 100 - 1500 ish//// if you find that the algorithm gets stuck too quickly, increase itpopSize:PopSize.MEDIUM/* 300 */,// number of elite candidates (guaranteed to make it to next gene pool unaltered)//// if you find that the algorithm gets stuck too quickly, decrease it//// e.g. nElite: NElite.SMALL,// e.g. nElite: NElite.MEDIUM,// e.g. nElite: NElite.LARGE,// e.g. nElite: 0.1// e.g. nElite: [0.01, 0.1]// e.g. nElite: { start:  0.1, end:  0.5, whenFit: 'constant' }// e.g. nElite: { start: 0.01, end: 0.25, whenFit: 'increases' }// e.g. nElite: { start:  0.1, end:  0.5, whenFit: 'decreases' }nElite:NElite.ADAPTIVE,/* { start:  0.05, end: 0.15 } */// probability of mutation//// e.g. pMutate: PMutate.SMALL,// e.g. pMutate: PMutate.MEDIUM,// e.g. pMutate: PMutate.LARGE,// e.g. pMutate: 0.1// e.g. pMutate: [0.01, 0.1]// e.g. pMutate: { start:  0.1, end:  0.5, whenFit: 'constant' }// e.g. pMutate: { start: 0.01, end: 0.25, whenFit: 'increases' }// e.g. pMutate: { start:  0.1, end:  0.5, whenFit: 'decreases' }pMutate:PMutate.ADAPTIVE,/* { start:  0.1, end: 0.01, whenFit: 'increases' } */// when mutating, target at least ? genes//// e.g. nMutations: NMutations.SMALL,// e.g. nMutations: NMutations.MEDIUM,// e.g. nMutations: NMutations.LARGE,// e.g. nMutations: 0.1// e.g. nMutations: [0.01, 0.1]// e.g. nMutations: { start:  0.1, end:  0.5, whenFit: 'constant' }// e.g. nMutations: { start: 0.01, end: 0.25, whenFit: 'increases' }// e.g. nMutations: { start:  0.1, end:  0.5, whenFit: 'decreases' }nMutations:NMutations.ADAPTIVE,/* { start: 10, end: 1, whenFit: 'decreases' } */// when mutating, the value of a gene is replaced with a random value// this specifies the range of the random value//// when not specified, it's set intelligently based on dtype so not necessary to tweak//// set it manually if you have an idea of where in the search space the solution might be// this might cause it to converge faster//// e.g. randGeneVal: [-100, 2000] /* lower and upper bounds */// e.g. randGeneVal: () => -200 + Math.random() * 1E4 /* custom function */randGeneVal:undefined,// when using multi-objective optimisation, you can specify relative weights for every objective// (measured by each of fitness function from the array)// see **FITNESS FUNCTION**// e.g. weights: [0.2, 0.4, 1] /* needs to have the same length as the fitenss function array */weights:undefined,}

For example:

constopts={timeOutMS:Duration.seconds(30),nElite:0.1,}constnGenes=1000constdtype='u32'constga=newGA(someFitnessFunct,nGenes,dtype,opts)

Theory Behind Genetic Algorithms

Genetic algorithmssimulate the process of evolution. You are theone specifying what each candidate should be good at to survive (fitnessfunction).

This algorithm uses a nature-inspiredheuristic and has thepotential to achieve excellent results but itmight not find theoptimal (ideal) solution. That said, for many applications the bestsolution is not needed. By sacrificing a bit of quality you drasticallyreduce the time needed to find a solution. Without such heuristics someproblems cannot be solved at all. These would NP complete problems towhich we do not have an algorithm which would run in polynomial time.

Candidate

This is your "DNA" which represents acomplete solution to theproblem you are trying to solve. The algorithm keeps track of apopulation of those DNA strings. Candidates are modified in such a waythat the population approaches a solution. In this implementationcandidate solutions (chromosomes) are typed arrays. Depending on whattype of problem you are trying to solve you will use a differentdtype. Each candidate corresponds to a point in the search space thatyou are exploring.

Fitness Function

Measures the value of a candidate solution. The algorithm will perform wellif your fitness function is good.

Crossover

One of the two ways candidate solutions are modified. Crossover is allaboutrecombination. It is abinary operation that takes twocandidates and selects a portion of genes from one parent and the restfrom the other.

In an ideal scenario, you would inherit genes from fit individuals.However, if you do that too often, you will loose novelty and you thealgorithm will get stuck very quickly. You can change how often fittestcandidates (elite) are chosen by changingminPElite andmaxPElite.

NOTEnElite needs to be non-zero for this to work.

Mutations

One of the two ways candidate solutions are modified. This is aunaryoperation. It takes a single candidate andrandomly alters a singlegene. Mutations introducenovelty. If your algorithm gets stuck tooquickly it's because there was not enough novelty. In an ideal scenario,fittest candidates would undergo mutation whereas the least fit would usecrossover. Furthermore, ideally, the algorithm would explore the fitnesslandscape more at the beginning and then exploit the discovered peaks at theend of running the algorithm. This implementation does both for you automatically.

Population

Population is a collection of candidate solutions. An initial population withpopSize = 5,nGenes = 2,dtype = 'u8' might look something like this:

// gene1 gene2[23,0]// candidate 1[1,41]// candidate 2[10,1]// candidate 3[1,100]// candidate 4[0,222]// candidate 5

Profiling with EventEmitter API

TheGeneticAlgorithm emits signals along with some informationwhich can be used for profiling.

NOTE data emitted is in sub-bullets.

Emitted Once

  1. "start" after.search() and all initialisation is complete, before the 1st round

Emitted on Stop Condition Met

  1. "timeout" whentimeOutMS limit is reached.
  2. "stuck" when stuck in a local minimum.
  3. "rounds" whennRounds limit reached.
  4. "end" when finished.

Emitted Every Round

  1. "round" on every round start (not the same as"rounds").
  2. "op" on every selection and mutation / crossover operation application

Example of extracting data from signals:

ga.on('start',()=>console.log('[START] with opts',opts))ga.on('stuck',()=>console.log(`[END] stuck`))ga.on('timeout',()=>console.log(`[END] timeout`))ga.on('end',()=>console.log(`[END] after round #${ga.rIdx} (took${ga.timeTakenMS/SEC}sec)`))

More exampleshere.

Performance

  • The bottleneck is the fitness function.
  • Log less for better performance

Downsides

  • single-threaded (but seeparallel example that uses the cluster module from node stdlib).

License

MIT

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp