Understanding the two alchemical legs and primary methods (FEP, TI) required for rigorous Relative Binding Free Energy (RBFE) calculations.
Relative Binding Free Energy (RBFE) calculations are a cornerstone of modern computational drug discovery. They aim to predict how a small chemical modification—for example, changing one substituent on a ligand—affects binding affinity to a protein target. Rather than computing absolute binding free energies independently for two ligands, RBFE focuses on the difference:
\[\Delta\Delta G_{\text{binding}} = \Delta G_{\text{bind}}^B - \Delta G_{\text{bind}}^A\]This relative quantity is typically much more accurate and computationally efficient, because the two ligands are chemically similar and share much of their phase space.
To compute this quantity rigorously, RBFE relies on alchemical transformations that close a thermodynamic cycle.
Alchemical free energy methods connect two physical states—such as Ligand A and Ligand B—through a sequence of non-physical intermediate states. These intermediate states do not correspond to realizable chemical systems; instead, they are constructed by smoothly modifying the Hamiltonian.
From a statistical mechanics perspective, the key idea is phase-space overlap. By introducing intermediate alchemical states, we ensure that neighboring ensembles share sufficient overlap, allowing free energy estimators (e.g., BAR and MBAR) to operate efficiently and with low bias.
In short:
Alchemy works because it replaces a difficult, discontinuous transformation with a smooth, overlap-preserving path.
RBFE calculations rely on a thermodynamic cycle that decomposes binding into two alchemical transformations performed in different environments.
The cycle consists of two alchemical legs, both transforming Ligand A into Ligand B:
- Complex Leg $\Delta G_{\text{complex}}$
- Solvation Leg $\Delta G_{\text{solvent}}$
The final Relative Binding Free Energy is the difference between these two alchemical transformations:
\[\Delta\Delta G_{\text{binding}} = \Delta G_{\text{complex}} - \Delta G_{\text{solvation}}\]Because free energy is a state function, the result is independent of the non-physical alchemical path used.
In practice, we often compute the alchemical change in both directions:
If the simulations are well-converged and there is sufficient phase-space overlap between neighboring $\lambda$ windows, the two estimates should agree:
\[\Delta G_{A\to B} \approx -\Delta G_{B\to A}\]The difference between them is a useful diagnostic:
A large $\Delta G_{\text{hyst}}$ typically indicates insufficient sampling or poor overlap (often near endpoints, especially for VDW decoupling without enough soft-core / $\lambda$ windows).
For two adjacent $\lambda$ states $i$ and $j$, BAR combines samples from both ensembles:
\[\Delta G_{i\to j} = \text{BAR}(U_i, U_j)\]Intuitively:
When overlap is good, BAR/MBAR are very stable; when overlap is poor, they will show large uncertainty or inconsistent forward/reverse behavior—telling you exactly where to add more sampling or more $\lambda$ windows.
Each leg of the thermodynamic cycle requires a statistically rigorous method to compute the free energy change along the alchemical path.
FEP is based on the Zwanzig relation, which estimates the free energy difference between two states from an exponential average of energy differences:
\[\Delta G = -kTln(e^{-\beta\Delta U})\]In modern workflows, FEP is rarely applied as a single step. Instead, the transformation is broken into discrete λ windows, and the resulting data are analyzed using BAR or MBAR. This approach is the default in contemporary packages such as OpenFE, Perses, and Schrödinger FEP+.
TI computes free energy differences by integrating the ensemble-averaged derivative of the Hamiltonian:
\[\Delta G = \int_{0}^{1} \left\langle \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda} \, d\lambda\]TI requires smooth behavior of ⟨∂H/∂λ⟩ and sufficiently dense λ sampling. While statistically rigorous, it is often less efficient than BAR/MBAR-based FEP for high-throughput RBFE studies.
Alchemical transformations are implemented by scaling interaction terms in the Hamiltonian using a coupling parameter λ. Importantly, λ is not a physical reaction coordinate, but a bookkeeping device that defines a family of Hamiltonians.
Although RBFE uses relative transformations, it is helpful to understand decoupling via the absolute binding free energy (ABFE) framework.
The double-decoupling method expresses binding as:
\[\Delta G_{\text{bind}} = \Delta G_{\text{decouple}}^{\text{protein}} - \Delta G_{\text{decouple}}^{\text{solvent}}\]Decoupling proceeds by gradually eliminating ligand–environment interactions until the ligand becomes a non-interacting “ghost” particle.
To avoid numerical instabilities, interactions are typically removed in stages:
| State | Physical Process | Alchemical Process (Decoupling) |
|---|---|---|
| Bound | Ligand + Protein ↔ Complex | $\Delta G_{\text{decouple}}^{\text{protein}}$:Turn off the ligand’s interactions with its environment (protein and solvent) in the binding pocket. |
| Unbound (Solvent) | Ligand (gas) ↔ Ligand (solvent) | $\Delta G_{\text{decouple}}^{\text{solvent}}$: Turn off the ligand’s interactions with the bulk solvent. |
In alchemical simulations, the coupling parameter λ defines a continuum of Hamiltonians between two endstates. Rather than being a physical reaction coordinate, λ scales interaction terms in the potential energy function so that ligand A interactions are gradually reduced while ligand B interactions are gradually increased. Because it alters how strongly atoms interact with their environment (e.g., electrostatics and van der Waals), λ creates a smooth path through a series of overlapping ensembles. This is why MBAR and BAR can reliably estimate free energy differences — they exploit the overlap across these discretized λ windows.
\[\lambda = \{ \lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_N \}\]The total free energy change $\Delta G$ is the sum of the changes between adjacent windows:
\[\Delta G = \sum_{i=0}^{N-1} \Delta G(\lambda_i \to \lambda_{i+1})\]Accurate calculation requires that the phase-space overlap between consecutive windows is sufficient. If the $\lambda$ values are too far apart, the overlap is poor, leading to large statistical errors and unconverged results.
Choosing an effective λ schedule
| Area of Transformation | Optimal Schedule | Rationale |
|---|---|---|
| Electrostatics | Generally linear or evenly spaced $\lambda$ values. | The free energy profile with respect to $\lambda$ is relatively smooth. |
| Van der Waals (VDW) | Non-linear or unevenly spaced $\lambda$ values. | The VDW coupling term $\langle \partial U / \partial \lambda \rangle$ peaks sharply near the endpoints ($\lambda \approx 0$ and $\lambda \approx 1$). More $\lambda$ points are needed near the ends to capture these non-linear changes. |
| Binding Pocket | More $\lambda$ points are needed for the $\Delta G_{\text{decouple}}^{\text{protein}}$ arm. | The ligand’s movement in the restrictive, heterogeneous environment of the protein pocket requires more sampling to converge compared to the bulk solvent. |
Compute RBFE: The difference between complex and solvent free energies yields the relative binding free energy:
\[\Delta\Delta G_{\text{binding}} = \Delta G_{\text{complex}} - \Delta G_{\text{solvent}}\]Example
λ: 0.0 0.3 0.7 1.0
|----------|----------|----------|
Phase I Phase II Phase III
Phase I : Turn OFF electrostatics of A
Phase II : Swap VDW of A ↔ B (soft-core)
Phase III : Turn ON electrostatics of B
flowchart TB
L0["λ = 0.0\nLigand A fully interacting"]
E1["Electrostatics decoupling\nCharges of A → 0"]
V1["VDW transformation\nSoft-core A → B"]
E2["Electrostatics recoupling\nCharges of B → full"]
L1["λ = 1.0\nLigand B fully interacting"]
L0 --> E1 --> V1 --> E2 --> L1
λ = 0 λ = 1
|-- elec off --|====== VDW soft-core ======|-- elec on --|
sparse DENSE DENSE DENSE sparse
Here are some more articles you might like to read next: