Modular EvoMD implementation in Python

EvoMD is an evolutionary optimization framework for peptide sequences. It evolves a population of peptides according to a user-defined fitness function, which is evaluated by simulation modules that the user writes and plugs in. The configuration is stored in a single YAML file, and the state is serialized to evolver.pkl after every step.

The general process begins with a random population that is continuously evaluated and sorted. The worst performers are discarded, and the population is renewed through crossover of the best sequences. The result is an optimized population of sequences.
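The select-and-recombine step described above can be sketched in a few lines. This is a generic illustration, not EvoMD's own code: the one-point crossover and the helper names (one_point_crossover, point_mutation, next_generation) are assumptions made for the example, and EvoMD's swap and hybrids methods may work differently. Keeping the parents in the new population is also an assumption.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_point_crossover(a: str, b: str, rng: random.Random) -> str:
    """Join a prefix of parent a with the matching suffix of parent b."""
    cut = rng.randrange(1, len(a))  # never cut at the very ends
    return a[:cut] + b[cut:]


def point_mutation(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen position with a random residue."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def next_generation(scored, parents_ratio=0.25, mutate_probability=0.1,
                    maximize=True, rng=None):
    """Build a new population from (sequence, fitness) pairs.

    Assumes at least two sequences of equal length (two residues or more).
    """
    rng = rng or random.Random()
    # Best first: highest fitness when maximizing, lowest when minimizing.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=maximize)
    n_parents = max(2, int(len(ranked) * parents_ratio))
    parents = [seq for seq, _ in ranked[:n_parents]]

    # Refill the discarded slots with children of two distinct parents.
    children = []
    while len(children) < len(scored) - n_parents:
        a, b = rng.sample(parents, 2)
        child = one_point_crossover(a, b, rng)
        if rng.random() < mutate_probability:
            child = point_mutation(child, rng)
        children.append(child)
    return parents + children
```

The default values mirror parents_ratio and also_mutate_probability from the example input.yaml in the Quickstart.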

Dependencies

Python 3.10+ required.

pip install numpy pyyaml matplotlib

Quickstart

This example uses test_hm.py (included), which maximizes the hydrophobic moment of 20-residue peptides with no real simulation.

1. Create an input.yaml:

optimize: maximize  # maximize or minimize
population: 32  # how many sequences in the population?
peptide_len: 20  # peptide length
populate_method: swap  # hybrids and swap work better
extra_mutation: True  # Random point mutation
also_mutate_probability: 0.1  # 10% of the new sequences are mutated
parents_ratio: 0.25  # Parents are chosen from the top 25% of the population

# Names of the Python scripts that provide the external methods
# (e.g. test_hm for test_hm.py).
constructor: test_hm  # script for constructing the simulation box.
calculator:  test_hm  # running and checking simulations.
analyzer:    test_hm  # analysis and computation of fitness values.

sleep_time: 0
max_check_cycle: 50
max_generations: 50

hydrophobic_restriction: False  # change to True for 
hydrophobic_min: 5.5   # ignored while hydrophobic_restriction is False
hydrophobic_max: None  # no max limit

2. Create the Evolver:

python evo-md.py --file input.yaml --create-evolver

3. Run the evolution:

python evo-md.py --file input.yaml --start

4. Inspect the results:

python evo-md.py --show-evolver
python evo-md.py --report-sequences            # writes sequences_report.csv
python evo-md.py --plot-evolution --show-kids  # plots and saves as evolution.png
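Before creating the Evolver, it can help to sanity-check the values in input.yaml from step 1. The function below is a hypothetical helper, not part of EvoMD; the rules it applies are assumptions inferred from the comments in the example file, and EvoMD's own validation may be stricter or different.

```python
def check_config(cfg: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    if cfg.get("optimize") not in ("maximize", "minimize"):
        problems.append("optimize must be 'maximize' or 'minimize'")

    for key in ("population", "peptide_len", "max_generations"):
        value = cfg.get(key)
        # bool is a subclass of int, so it is excluded explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append(f"{key} must be a positive integer")

    for key in ("also_mutate_probability", "parents_ratio"):
        value = cfg.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0 <= value <= 1:
            problems.append(f"{key} must be a number between 0 and 1")

    return problems
```

With PyYAML installed, the dictionary can be obtained with yaml.safe_load(open("input.yaml")). Note that PyYAML reads True as a boolean but reads None as the string "None" (YAML spells null as null or ~); how EvoMD interprets hydrophobic_max: None is not covered here.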

How it works

Each generation EvoMD:

1. Constructs the simulation box (constructor_method).
2. Runs the simulation (calculator_method).
3. Checks if simulations finished (calculator_check).
4. Analyzes results and computes fitness (analyzer_method).
5. Sorts sequences, keeps the best as parents, and generates the next generation.

Users control steps 1–4 by writing a Python plug-in with four functions and naming it in the YAML (see next section). EvoMD handles the rest.
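The per-generation cycle can be sketched as follows. This is a simplified illustration rather than EvoMD's actual loop: evaluate_generation is a hypothetical name, and treating sleep_time as the pause in seconds between polls and max_check_cycle as the maximum number of polls is an assumption based on the key names in input.yaml.

```python
import time


def evaluate_generation(sequences, plugin, sleep_time=0, max_check_cycle=50):
    """Run the four plug-in functions for every sequence.

    Returns a dict mapping each finished sequence to its fitness.
    """
    # Steps 1 and 2: build and launch every job up front.
    for seq in sequences:
        plugin.constructor_method(seq)
        plugin.calculator_method(seq)

    # Step 3: poll until every job reports done or the budget runs out.
    pending = list(sequences)
    for _ in range(max_check_cycle):
        pending = [seq for seq in pending if not plugin.calculator_check(seq)]
        if not pending:
            break
        time.sleep(sleep_time)

    # Step 4: analyze only the jobs that finished in time.
    fitness = {}
    for seq in sequences:
        if seq in pending:
            continue  # never finished; this sketch simply skips it
        fitness[seq] = plugin.analyzer_method(seq)
    return fitness
```

Here plugin is any object exposing the four functions, for example a module imported by name. Launching all jobs before polling lets slow simulations run side by side; whether EvoMD schedules them this way is not stated in this README.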

The general idea of the modular architecture is shown in the next figure:

Writing your own methods

Create one or more Python modules that define these four functions and name them in the YAML (constructor, calculator, analyzer). All three keys can point to the same file. calculator_method and calculator_check must be in the same Python file.

def constructor_method(sequence) -> None:
    # Write input files, build the system, etc.
    pass

def calculator_method(sequence) -> None:
    # Launch the job (submit, start a process, etc.)
    pass

def calculator_check(sequence) -> bool:
    # Return True when the job is done, False otherwise.
    # Polled repeatedly by EvoMD until True or the cycle budget runs out.
    return True

def analyzer_method(sequence) -> float:
    # Parse results and return the fitness value.
    return 0.0

Each function runs inside the sequence's iteration directory. The sequence object gives the residue string (str(sequence)), physicochemical properties (sequence.charge, sequence.residues, …), and state flags (sequence.is_failed to mark a failure). You can attach arbitrary attributes to carry information between functions (e.g. a job ID set in calculator_method and read in calculator_check).

See test_hm.py for a short example.
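As a further illustration, an analyzer_method that returns a hydrophobic moment could be written as below. This is a sketch under stated assumptions and not a copy of test_hm.py: it uses the Eisenberg consensus hydrophobicity scale as commonly tabulated (verify the values against your own reference), an ideal α-helix angle of 100° per residue, and the unnormalized moment (some definitions divide by the sequence length).

```python
import math

# Eisenberg consensus hydrophobicity scale, as commonly tabulated.
EISENBERG = {
    "A": 0.62, "R": -2.53, "N": -0.78, "D": -0.90, "C": 0.29,
    "Q": -0.85, "E": -0.74, "G": 0.48, "H": -0.40, "I": 1.38,
    "L": 1.06, "K": -1.50, "M": 0.64, "F": 1.19, "P": 0.12,
    "S": -0.18, "T": -0.05, "W": 0.81, "Y": 0.26, "V": 1.08,
}


def hydrophobic_moment(residues: str, angle_deg: float = 100.0) -> float:
    """Magnitude of the vector sum of per-residue hydrophobicities.

    Residue i points in direction i * angle_deg around the helix axis.
    """
    angle = math.radians(angle_deg)
    sin_sum = sum(EISENBERG[aa] * math.sin(i * angle)
                  for i, aa in enumerate(residues))
    cos_sum = sum(EISENBERG[aa] * math.cos(i * angle)
                  for i, aa in enumerate(residues))
    return math.hypot(sin_sum, cos_sum)


def analyzer_method(sequence) -> float:
    # str(sequence) gives the residue string, as described above.
    return hydrophobic_moment(str(sequence))
```

A uniform sequence whose residues are spread evenly around the helix (for example 18 identical residues, which cover exactly five turns) has a moment of zero, while an amphipathic pattern with hydrophobic residues on one face scores high.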

Common commands

| What you want to do | Command |
| --- | --- |
| Create a new Evolver and exit | `python evo-md.py --file input.yaml --create-evolver` |
| Start the evolution loop | `python evo-md.py --file input.yaml --start` |
| Resume an interrupted iteration | `python evo-md.py --restart` |
| Stop after the current iteration | `python evo-md.py --stop-evolver` |
| Show the current state | `python evo-md.py --show-evolver` |
| Export all sequences to CSV | `python evo-md.py --report-sequences` |
| Plot fitness over generations | `python evo-md.py --plot-evolution` |
| Roll back to the last complete generation | `python evo-md.py --last-generation` |
| Show the loaded configuration | `python evo-md.py --show-current` |

Run python evo-md.py --help for the full flag reference.

Output files

| File | Description |
| --- | --- |
| `evolver.pkl` | Full Evolver state, updated after each step. |
| `sequences_report.csv` | All sequences with generation and fitness. |
| `evolution.png` | Fitness plot (written by `--plot-evolution`). |
| `simulation_data/<SEQUENCE>/iter_N/` | Per-sequence, per-attempt directories where your methods run. |
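The CSV report can be post-processed with the standard library. The sketch below picks the best sequence of each generation; the column names sequence, generation and fitness are assumptions, so check the header of your own sequences_report.csv and adjust them.

```python
import csv
import io


def best_per_generation(csv_text: str, maximize: bool = True) -> dict:
    """Map each generation number to its best (sequence, fitness) pair.

    Assumes the columns 'sequence', 'generation' and 'fitness' exist.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        generation = int(row["generation"])
        fitness = float(row["fitness"])
        current = best.get(generation)
        if current is None:
            improved = True
        elif maximize:
            improved = fitness > current[1]
        else:
            improved = fitness < current[1]
        if improved:
            best[generation] = (row["sequence"], fitness)
    return best
```

To use it on a real report, pass the file contents, for example best_per_generation(open("sequences_report.csv").read()).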

Recovery

If a run is interrupted:

python evo-md.py --restart

If the last generation is incomplete, or you just want to go back to the last completed generation:

python evo-md.py --last-generation

Then continue with --start.

Update version

Use the old version to create a sequence report in CSV format.

python scripts_old/evo-md.py --report-sequences

Update input.yaml with the new variable names (some variables were renamed for clarity).

Create evolver.pkl and read report with the new version.

python scripts_new/evo-md.py --create-evolver --file input.yaml --read-report sequence_report.csv

Start the evolution with the new version.

python scripts_new/evo-md.py --start

Running on LUMI

You can run evo-md using a LUMI container wrapper. The following instructions are adapted from the LUMI documentation: https://docs.lumi-supercomputer.eu/software/installing/container-wrapper/

Load the LUMI container module.

module load LUMI
module load lumi-container-wrapper

Create a conda environment file (e.g. env.yml).

# --- env.yml ---
channels:
  - conda-forge

dependencies:
  # Dependencies for evo-md
  - python>=3.10
  - numpy
  - pyyaml
  - matplotlib

  # Include dependencies for your
  # external methods (constructor, calculator, analyzer)
  # e.g. scipy, pymol, vermouth, biopython, etc.
  - pymol-open-source
  - vermouth

Create the directory for the container (e.g. env/) and create the container.

mkdir env
conda-containerize new --prefix env env.yml

Execute python from the container.

/users/USER/env/bin/python /users/USER/modular_evomd/evo-md.py --help

Populating from backup

You can populate the Evolver from the JSON files created after each generation. The first step is to create a new Evolver using an input file with a suitable configuration (make sure that peptide_len equals the length of the sequences in the backup).

Then populate from backup:

python evo-md.py --from-backup

This creates sequences from the JSON files in the simulation directory and sorts them. The result is an Evolver with chosen parents, ready to start.

python evo-md.py --start
