Open source by Santander AI Lab. A dependency-free Python genetic-algorithm library / engine with pluggable fitness criteria — the reusable search core for building an LLM / AI autoresearcher (generate → evaluate → select → repeat).
Part of Santander AI Open Source — open source AI projects from Banco Santander (santander.com).
genetic-algorithm is a tiny evolutionary engine — population, selection,
crossover, mutation — whose fitness criterion is a swappable plugin. The
engine never hard-codes what "better" means; it only ever asks a plugin for a
single number. That one design choice is what turns a textbook GA into the
search core of an autoresearcher.
A simple autoresearcher — the kind of loop Andrej Karpathy describes: generate a hypothesis → test it → measure it → keep the best → repeat — is, structurally, an evolutionary loop. A genetic algorithm with pluggable criteria gives you exactly that machinery, ready-made, so you don't reimplement it every time.
| Autoresearcher | Genetic algorithm |
|---|---|
| The candidates you explore (prompts, configs, hypotheses, strategies, code snippets) | Population |
| "Is this one better?" | Fitness — and this is where your plugins live |
| How the next batch of candidates is produced from the good ones | Mutation / crossover |
| Keep what works, drop what doesn't | Selection |
A naïve autoresearcher does a greedy or random search. The GA adds structured selective pressure plus diversity — precisely what stops a self-reinforcing LLM loop from collapsing into a single chain of reasoning and getting stuck in a local optimum.
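The mapping above can be sketched as a few lines of standard-library Python. This is a toy illustration only, not the library's implementation: the `evolve` function, its parameters, and the truncation-style selection are all assumptions made for the example.

```python
import random
from typing import Callable, Sequence

Fitness = Callable[[Sequence[float]], float]


def evolve(fitness: Fitness, n_genes: int = 8, pop_size: int = 20,
           generations: int = 30, seed: int = 0) -> list[float]:
    """Toy generate -> evaluate -> select -> repeat loop (hypothetical helper)."""
    rng = random.Random(seed)
    # Generate: random candidates, each a list of 0.0/1.0 genes.
    population = [[float(rng.randint(0, 1)) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate and select: rank by fitness and keep the better half.
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: pop_size // 2]
        # Recombine and mutate: refill the population from pairs of survivors.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_genes)   # single-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_genes)
            if rng.random() < 0.2:            # occasional bit-flip mutation
                child[i] = 1.0 - child[i]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


best = evolve(lambda genes: float(sum(genes)))
print(best, sum(best))
```

The only problem-specific piece is the `fitness` callable passed in; everything else is generic search machinery.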
The bottleneck of any autoresearcher is not the loop — it's defining "better" well. Making fitness a plugin is what makes that tractable:
- Separate the engine from the judgment. The engine doesn't need to know what you optimise; it needs a number. The plugin encapsulates the domain — summary quality, pipeline latency, test coverage, an experiment's score.
- Compose multiple objectives. Different plugins = different criteria you can combine ("maximise quality and minimise token cost").
- Change problem without touching the core. The same engine tunes prompts today and pipeline configs tomorrow — you only swap the plugin.
- Put an LLM inside the criterion. A plugin can be an LLM-as-a-judge that scores qualitative candidates where there is no obvious numeric metric.
See examples/autoresearcher.py for the mapping
made concrete, including the exact seam where a real LLM judge plugs in.
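As a rough illustration of composing objectives, the sketch below combines two criteria into one number with a weighted sum. The `combine` helper and the `quality` / `token_cost` stand-ins are hypothetical names invented for this example; they are not the library's `weighted_sum` factory, whose signature is not shown here.

```python
from typing import Callable, Sequence

Fitness = Callable[[Sequence[float]], float]


def combine(weighted: Sequence[tuple[float, Fitness]]) -> Fitness:
    """Return one criterion that is a weighted sum of several (hypothetical helper)."""
    def combined(genes: Sequence[float]) -> float:
        return sum(weight * criterion(genes) for weight, criterion in weighted)
    return combined


def quality(genes: Sequence[float]) -> float:
    # Stand-in for a real quality score, for example an LLM-as-a-judge rating.
    return float(sum(genes))


def token_cost(genes: Sequence[float]) -> float:
    # Stand-in for a cost metric: here, the number of non-zero genes.
    return float(sum(1 for g in genes if g != 0))


# Maximise quality and minimise cost: cost enters with a negative weight.
score = combine([(1.0, quality), (-0.25, token_cost)])
print(score([1.0, 0.0, 2.0]))  # 3.0 - 0.25 * 2.0 = 2.5
```

Because the result is still a plain callable returning a single float, the engine would not need to know that several objectives sit behind it.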
This is engineering, not a silver bullet:
- GA is evaluation-hungry. Every individual needs a fitness value. If evaluating means an LLM call or a real experiment, a population of 50 over 100 generations is 5,000 calls — often prohibitive. For many problems a hill-climber + LLM mutator is cheaper and nearly as good.
- The GA is only as good as the fitness plugin. A badly designed criterion invites reward hacking: the loop finds candidates that score high and are junk. The plugin is the main point of failure, not the algorithm.
- For small or convex search spaces, a GA is over-engineering. Exhaustive or Bayesian search will win there.
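To make the evaluation-cost point concrete, here is a minimal hill-climber that counts its fitness calls, set against the 50 x 100 = 5,000 evaluations of the GA example above. It is a sketch under stated assumptions: `hill_climb` and `flip_one` are hypothetical names, and `flip_one` merely stands in for where an LLM mutator would go.

```python
import random
from typing import Callable, Sequence

Fitness = Callable[[Sequence[float]], float]
Mutator = Callable[[Sequence[float], random.Random], list[float]]


def hill_climb(fitness: Fitness, start: Sequence[float], mutate: Mutator,
               steps: int, seed: int = 0) -> tuple[list[float], float, int]:
    """Keep a single candidate; accept a mutation only if it scores higher."""
    rng = random.Random(seed)
    best = list(start)
    best_score = fitness(best)
    evaluations = 1
    for _ in range(steps):
        candidate = mutate(best, rng)
        score = fitness(candidate)
        evaluations += 1
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, evaluations


def flip_one(genes: Sequence[float], rng: random.Random) -> list[float]:
    # Stand-in for an LLM mutator: flip one randomly chosen 0.0/1.0 gene.
    out = list(genes)
    i = rng.randrange(len(out))
    out[i] = 1.0 - out[i]
    return out


ga_budget = 50 * 100  # population size x generations = 5,000 evaluations
best, score, used = hill_climb(lambda g: float(sum(g)), [0.0] * 8, flip_one, steps=200)
print(f"hill-climber used {used} evaluations; the GA example budgets {ga_budget}")
```

The trade-off is the one described above: far fewer evaluations, but only one line of search and no built-in diversity.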
Where the tool beats "just a bare loop":
- The bare loop is the proof of concept; GA + plugins is the productisable, reusable version. You turn an ad-hoc script into infrastructure: a stable evolutionary engine with interchangeable criteria.
- Diversity for free. The GA keeps several lines of investigation alive in parallel instead of one fragile chain of thought.
- Traceability. Every generation is an auditable record of what was tried and why it survived, which matters when results must be verifiable.
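One way to picture the traceability point is a per-generation record serialised as a JSON line. The record layout and the `record_generation` helper below are assumptions made for illustration; the library's own logging, if any, may look different.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class GenerationRecord:
    """Hypothetical audit record for one generation."""
    generation: int
    best_genes: list[float]
    best_fitness: float
    mean_fitness: float


def record_generation(generation: int,
                      scored: list[tuple[list[float], float]]) -> GenerationRecord:
    """Summarise a generation given (genes, fitness) pairs."""
    best_genes, best_fitness = max(scored, key=lambda pair: pair[1])
    mean_fitness = sum(fitness for _, fitness in scored) / len(scored)
    return GenerationRecord(generation, list(best_genes), best_fitness, mean_fitness)


scored = [([1.0, 0.0], 1.0), ([1.0, 1.0], 2.0), ([0.0, 0.0], 0.0)]
line = json.dumps(asdict(record_generation(0, scored)))
print(line)
# {"generation": 0, "best_genes": [1.0, 1.0], "best_fitness": 2.0, "mean_fitness": 1.0}
```

Appending one such line per generation to a file would give a replayable history of the search.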
pip install -e .

The engine itself has no third-party runtime dependencies; it runs on the Python standard library alone.
from genetic_algorithm import Population, register_fitness

# 1. Define the criterion: this is your plugin. Higher is better.
@register_fitness("max_ones")
def max_ones(genes):
    return float(sum(genes))

# 2. Hand the engine a population shape, bounds, and the plugin.
pop = Population(
    pop_size=30,
    chromosome_size=8,
    bounds=[(0, 1) for _ in range(8)],
    fitness_fn=max_ones,
    elitism=True,
    seed=36,
)

# 3. Run the loop: generate -> evaluate -> select -> recombine -> mutate.
for _ in range(25):
    pop.calculate_fitness()
    best = pop.best_in_generation(1)[0]
    pop.selection(method="roulette")
    pop.crossover(method="k_points", k=2)
    pop.mutation(method="probability_mutation")

print(best.data, best.fitness)

Any callable `Sequence[float] -> float` (higher is better) is a valid criterion,
so you can pass a function directly or register it by name for
configuration-driven runs (`get_fitness("max_ones")`).
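For readers curious how registering a criterion by name can work, here is a minimal, self-contained sketch of the pattern. It is not the library's code: `register`, `lookup`, and `_REGISTRY` are hypothetical names chosen so they are not mistaken for the real `register_fitness` / `get_fitness`.

```python
from typing import Callable, Sequence

Fitness = Callable[[Sequence[float]], float]

# Hypothetical module-level registry mapping names to criteria.
_REGISTRY: dict[str, Fitness] = {}


def register(name: str) -> Callable[[Fitness], Fitness]:
    """Decorator that stores a criterion under a name and returns it unchanged."""
    def decorator(fn: Fitness) -> Fitness:
        if name in _REGISTRY:
            raise ValueError(f"criterion {name!r} is already registered")
        _REGISTRY[name] = fn
        return fn
    return decorator


def lookup(name: str) -> Fitness:
    """Fetch a criterion by name, with a readable error for unknown names."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown criterion {name!r}; known: {sorted(_REGISTRY)}") from None


@register("max_ones")
def max_ones(genes: Sequence[float]) -> float:
    return float(sum(genes))


# A configuration-driven run only needs the name as a string.
config = {"fitness": "max_ones"}
criterion = lookup(config["fitness"])
print(criterion([1, 0, 1, 1]))  # 3.0
```

Keeping the decorator transparent (it returns the function unchanged) means the same callable can still be passed directly, as in the quick-start code above.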
# Numeric optimisation toward a target vector
python -m examples.optimize_sphere
# The GA framed as a Karpathy-style autoresearcher (offline LLM-as-judge stub)
python -m examples.autoresearcher

- `Population`: the evolutionary engine. Selection (`roulette`, `elitist`), crossover (`single_point`, `k_points`), mutation (`probability_mutation`, `twors`, `cim`, `thrors`), optional elitism, optional multi-threaded fitness evaluation (useful when the criterion is I/O-bound, e.g. an LLM judge), and a `seed` for reproducible runs.
- `Chromosome`: a single candidate, a list of float genes within per-gene bounds.
- `FitnessFunction`: the plugin contract (`Sequence[float] -> float`).
- `register_fitness` / `get_fitness` / `available_fitness`: a small registry for selecting criteria by name.
- `genetic_algorithm.plugins`: reference criteria `max_value` and `negative_sphere`, plus the `target_vector` / `weighted_sum` factories.
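The note about multi-threaded evaluation for I/O-bound criteria can be illustrated with the standard library alone. This is a sketch of the general technique, not the engine's internals: `evaluate_all` and `slow_judge` are hypothetical names, and the `time.sleep` call merely simulates the latency of a remote LLM judge.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

Fitness = Callable[[Sequence[float]], float]


def slow_judge(genes: Sequence[float]) -> float:
    # Simulated I/O-bound criterion: the sleep stands in for a network call.
    time.sleep(0.05)
    return float(sum(genes))


def evaluate_all(population: Sequence[Sequence[float]], fitness: Fitness,
                 max_workers: int = 8) -> list[float]:
    """Score every candidate concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, population))


population = [[float(i % 2), float(i % 3)] for i in range(16)]
start = time.perf_counter()
scores = evaluate_all(population, slow_judge)
elapsed = time.perf_counter() - start
print(scores[:4], f"{elapsed:.2f}s for {len(scores)} evaluations")
```

Threads help here because the work is waiting on I/O; a CPU-bound criterion would see little benefit from this approach under the standard CPython interpreter.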
- Python 3.10+
- No third-party runtime dependencies — standard library only.
- Optional, for development only: `ruff`, `black`, `mypy`, `pytest`, `pytest-cov` (see CONTRIBUTING.md).
Contributions are welcome! Please read our Contributing Guidelines and Code of Conduct before getting started.
- Report bugs and request features via GitHub Issues.
- External contributors sign the CLA (handled automatically by the CLA Assistant bot on your first PR).
- Run `ruff check .`, `black --check .`, `mypy genetic_algorithm`, and `pytest` before opening a PR.
- Keep the engine dependency-free (standard library only).
Please report security vulnerabilities responsibly. See our Security Policy for how to report (do not open a public issue for vulnerabilities). Contact: security-opensource@gruposantander.com or use GitHub Security Advisories.
This project is licensed under the Apache License 2.0 — see the LICENSE and NOTICE files for details.
Copyright (c) 2026 Santander Group
SPDX-License-Identifier: Apache-2.0
If you use genetic-algorithm in your research, please cite it:
@software{geneticalgorithm2026,
  author  = {{Santander AI Lab}},
  title   = {genetic-algorithm: a pluggable-fitness evolutionary engine},
  year    = {2026},
  url     = {https://github.com/SantanderAI/genetic-algorithm},
  license = {Apache-2.0}
}