Pipeline architecture

LaREST chains four computational stages to produce thermodynamic parameters (H, S, G) for lactone ring-opening polymerization reactions. Each stage feeds into the next, progressively increasing the level of theory.

Stages

1. RDKit + xTB

Purpose: Generate a set of initial molecular conformers and rank them by free energy.

  1. RDKit generates MMFF conformers (default: 50 per molecule).

  2. Each conformer is optimised at the xTB GFN2 level and a Hessian is computed to obtain H, S, and G.

  3. The lowest-free-energy conformer is passed to the next stage.

Controlled by [rdkit] and [xtb] in config.toml. Set steps.rdkit = false to skip.

2. CREST conformer ensemble

Purpose: Explore the conformational space more thoroughly using metadynamics.

  1. CREST’s iMTD-GC algorithm generates a conformer/rotamer ensemble from the best RDKit conformer.

  2. The ensemble is deduplicated and sorted by energy (CREGEN).

  3. xTB re-ranks the ensemble by free energy; the lowest-energy conformer is passed forward.

Controlled by [crest.confgen] in config.toml. Set steps.crest_confgen = false to skip.

3. CENSO + ORCA (DFT refinement)

Purpose: Refine the CREST ensemble with density functional theory using four sub-stages of increasing accuracy.

Sub-stage

Label

Default functional

Default basis

Prescreening

censo_prescreening

PBE-D4

def2-SV(P)

Screening

censo_screening

r2SCAN-3c

def2-TZVP

Optimisation

censo_optimization

r2SCAN-3c

def2-TZVP

Refinement

censo_refinement

wB97X-V

def2-TZVP

Each sub-stage applies an energy window threshold to prune the ensemble before passing it to the next stage. ORCA is used as the QM backend throughout.

Controlled by [censo.*] in config.toml. Set steps.censo = false to skip.

4. CREST entropy

Purpose: Compute the conformational entropy correction using CREST’s entropy mode.

  1. CREST re-explores the conformational space using GFN-FF (fast) to obtain a well-converged entropy estimate.

  2. The resulting S_conf is added to the CENSO refinement results to produce the censo_corrected section.

Controlled by [crest.entropy] in config.toml. Set steps.crest_entropy = false to skip.

Checkpointing

At the start of each molecule’s run, LaREST walks through the output directory and identifies the first missing result file. All stages up to that point are skipped; execution resumes from there. This means interrupted runs can be restarted without any manual intervention.

Molecules processed

For each monomer SMILES in the config, LaREST runs the full pipeline for:

  1. The monomer itself.

  2. The initiator (ROR only).

  3. Each polymer at every requested chain length.

Polymer SMILES are constructed automatically from the monomer and chain length. Final reaction thermodynamics (ΔH, ΔS, ΔG) are computed in compile_results and written to summary/ CSVs.