Training Fixed-Point SCF Models
The main training entry point is scripts/run_train.py. Datasets, losses, and
training stages are specified in config.yaml; see
data and config files.
Fixed-point training has two layers of configuration:
architecture hyperparameters, which define the density representation, field features, fixed-point update rule, and energy readouts;
training hyperparameters, which define how the model is trained in each schedule stage.
Before setting up a training schedule, read using fixed-point SCF models. This page does not repeat those SCF definitions.
Architecture Hyperparameters
Architecture hyperparameters are command-line arguments to
scripts/run_train.py. They define the model and should usually be fixed
across all stages of a training schedule.
Density Representation
The fixed-point model represents charge density with atom-centred Gaussian multipoles. The main density and electrostatics arguments are:
--atomic_multipoles_max_l
--atomic_multipoles_smearing_width
--kspace_cutoff_factor
--electrostatic_pbc_method
--include_electrostatic_self_interaction
See atomic multipoles and boundary conditions.
Field Features
The fixed-point update uses atom-centred electrostatic field features to describe \(\mathbf{v}_{\text{eff}}(\mathbf{r})\) rather than the raw potential. The main field-feature arguments are:
--field_feature_max_l: maximum angular order;
--field_feature_widths: Gaussian widths used to sample the potential, for example "[1.5, 3.0]";
--field_feature_norms: "None", "average", or an explicit list with length len(field_feature_widths) * (field_feature_max_l + 1);
--include_field_si: include self-interaction-like terms in the field-feature path.
Example:
--field_feature_max_l=1 \
--field_feature_widths="[1.5, 3.0]" \
--field_feature_norms="[25.0,25.0,1.0,1.0]" \
For field_feature_max_l=1 and two widths, the explicit normalization list has
four entries: two for l=0 and two for l=1.
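The length rule above can be checked before launching a run. The following is a minimal sketch, not part of the repository: the helper names expected_norm_count and label_norms are hypothetical, and the ordering of entries (grouped by angular order first, then by width) is an assumption based on the "two for l=0 and two for l=1" description.

```python
import ast

def expected_norm_count(widths, max_l):
    """Length an explicit field-feature normalization list must have."""
    return len(widths) * (max_l + 1)

def label_norms(norms, widths, max_l):
    """Pair each normalizer with its (angular order, width) label.

    ASSUMPTION: entries are grouped by angular order first (all widths for
    l=0, then all widths for l=1, ...), with widths in the order given.
    """
    expected = expected_norm_count(widths, max_l)
    if len(norms) != expected:
        raise ValueError(f"expected {expected} normalizers, got {len(norms)}")
    labels = [(l, w) for l in range(max_l + 1) for w in widths]
    return dict(zip(labels, norms))

# Values taken from the example arguments above.
widths = ast.literal_eval("[1.5, 3.0]")
norms = ast.literal_eval("[25.0,25.0,1.0,1.0]")
labelled = label_norms(norms, widths, max_l=1)
print(labelled)
```

A list of the wrong length raises ValueError here, which is cheaper to catch than a failure inside the training script.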
With:
--field_feature_norms=average
the training setup estimates one RMS field-feature scale for each
(angular order, width) pair from the training set. For example, with
field_feature_max_l=1 and two widths, it returns four normalizers in the same
order expected by the explicit list.
Average norm estimation uses the same electrostatic feature machinery as
FixedPointCore. It therefore requires reference atomic_multipoles in the training data and will not work without them.
Fixed-Point Update Rule
The update rule maps local geometry features and electrostatic field features
to a field-dependent density correction. It is configured with
--fixedpoint_update_config, which can be set in the config.yaml.
Default:
fixedpoint_update_config:
  type: OneBodyVariableUpdate
  potential_embedding_cls: BiasedLinearPotentialEmbedding
  nonlinearity_cls: NoNonLinearity
You should choose type from:
OneBodyVariableUpdate
ManyBodyUpdate
The many-body update is considerably more expensive, since it must be iterated as the SCF converges. We generally have not seen substantial improvements with the many-body update, so it is mostly useful for experimentation.
The nonlinearity_cls can currently take the values:
NoNonLinearity
MLPNonLinearity
One-body linear updates are the simplest option. Nonlinear and many-body-style updates add flexibility at higher complexity.
The optional --use_linear_local_charges flag switches the field-independent
local density readout to a scaled linear local-source block.
--atom_density_scaling provides the species-dependent scaling for that path.
--atom_density_scaling accepts:
"None": use a scale of 1.0 for every element;
an explicit dictionary, for example "{1: 0.5, 8: 2.0}";
"average": compute one species scale from the training-set reference atomic multipoles.
The "average" mode computes the RMS over all selected multipole components
up to atomic_multipoles_max_l, separately for each element, and returns the
scales in the model’s atomic-number table order. Configs without reference
atomic_multipoles are excluded from the average using the stored property
weights, so missing multipoles do not lower the scale by entering as zeros.
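The weighted RMS described above can be illustrated with a small stand-alone sketch. This is not the repository's implementation: the function name average_density_scaling, the flat per-atom input layout, and the use of sorted atomic numbers as a stand-in for the model's atomic-number table order are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def average_density_scaling(atomic_numbers, multipoles, weights):
    """Per-element RMS of multipole components, skipping zero-weight atoms.

    atomic_numbers: one int per atom
    multipoles:     one list of components (up to the chosen max l) per atom
    weights:        one weight per atom, inherited from its configuration;
                    0.0 marks a configuration without reference multipoles
    """
    sum_sq = defaultdict(float)
    count = defaultdict(float)
    for z, components, w in zip(atomic_numbers, multipoles, weights):
        for c in components:
            sum_sq[z] += w * c * c   # zero weight contributes nothing
            count[z] += w            # ... and is not counted either
    # ASSUMPTION: sorted atomic numbers stand in for the model's table order.
    return {z: math.sqrt(sum_sq[z] / count[z]) for z in sorted(count) if count[z] > 0}

atomic_numbers = [1, 1, 8, 1]
multipoles = [[0.3], [0.4], [-0.7], [0.0]]   # last atom: placeholder zeros
weights = [1.0, 1.0, 1.0, 0.0]               # last atom has no reference data
scales = average_density_scaling(atomic_numbers, multipoles, weights)
print(scales)
```

Because the placeholder atom carries zero weight, the hydrogen scale is computed from two atoms rather than three, which is the behaviour the paragraph above describes.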
Nonlocal Energy Term
The fixed-point density is not obtained by minimizing a variational energy
functional. The model therefore has a separate density-dependent energy
readout, called local_electron_energy in FixedPointCore.
This term is configured with --field_readout_config.
Default:
{"type": "StrictQuadraticFieldEnergyReadout"}
Example:
--field_readout_config "{'type': 'StrictQuadraticFieldEnergyReadout'}"
Registered readout names include:
NullFieldReadout
StrictQuadraticFieldEnergyReadout
OneBodyMLPFieldReadout
ManyBodyChargesReadout
ManyBodyChargesFieldReadout
Training Method
The training methods and fixed-point formulation are described in the fixed-point paper.
Current recommended starting schedule:
train with direct training, mode: direct, for about 100 epochs;
continue with mode: unroll_scf, using about 10 SCF steps and mixing_parameter: 0.3.
The first stage trains the response map at the reference density. The second stage trains the model closer to inference, where it must run its own SCF loop.
Stage Options
Each train_schedule stage can include a fixed_point_training_options block:
fixed_point_training_options:
  mode: unroll_scf
  scf:
    num_scf_steps: 10
    constant_charge: true
    mixing_parameter: 0.3
    initial_density: from_data
    initial_fermi_level: from_data
    use_autograd_forces: true
The training keys are:
mode. This can take the values direct, unroll_scf, implicit, and linearize_solve.
scf: nested SCF settings used by unroll_scf and implicit.
The nested scf keys are:
num_scf_steps
scf_tolerance
constant_charge
mixing_parameter
initial_density
initial_fermi_level
use_autograd_forces
mode: direct does not run an SCF loop and should not include an scf block.
For mode: unroll_scf and mode: implicit, the scf block must explicitly
set num_scf_steps and mixing_parameter.
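These rules are easy to check on a parsed config before training starts. The sketch below is illustrative only: check_fixed_point_options is a hypothetical helper, not a function in the repository, and it checks only the constraints stated above (no rule is stated for linearize_solve, so none is applied).

```python
def check_fixed_point_options(options):
    """Return a list of problems in one fixed_point_training_options dict."""
    problems = []
    mode = options.get("mode")
    scf = options.get("scf")
    known_modes = {"direct", "unroll_scf", "implicit", "linearize_solve"}

    if mode not in known_modes:
        problems.append(f"unknown mode: {mode!r}")
    elif mode == "direct":
        # Direct training does not run an SCF loop.
        if scf is not None:
            problems.append("mode: direct should not include an scf block")
    elif mode in {"unroll_scf", "implicit"}:
        # These two keys must be set explicitly for SCF-based modes.
        for key in ("num_scf_steps", "mixing_parameter"):
            if scf is None or key not in scf:
                problems.append(f"mode: {mode} requires scf.{key}")
    return problems

good = {"mode": "unroll_scf", "scf": {"num_scf_steps": 10, "mixing_parameter": 0.3}}
bad = {"mode": "unroll_scf", "scf": {"num_scf_steps": 10}}
print(check_fixed_point_options(good))
print(check_fixed_point_options(bad))
```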
See using fixed-point SCF models for the meaning of these SCF settings. Training-specific notes are given below.
Other training arguments, such as --field_block_weight_decay and
--local_charges_weight_decay, control optimizer parameter groups.
Example Schedule
train_schedule:
  0:
    name: direct
    start: 0
    end: 99
    loss:
      atomic_multipoles: 100.0
      total_charge_per_atom: 1000.0
      dipole_per_atom: 1000000.0
      energy_per_atom: 10.0
      forces: 100.0
    lr: 0.01
    fixed_point_training_options:
      mode: direct
  1:
    name: unroll
    start: 100
    end: 110
    loss:
      atomic_multipoles: 100.0
      total_charge_per_atom: 1000.0
      dipole_per_atom: 1000000.0
      energy_per_atom: 1000.0
      forces: 100.0
    lr: 0.001
    fixed_point_training_options:
      mode: unroll_scf
      scf:
        num_scf_steps: 10
        constant_charge: true
        mixing_parameter: 0.3
        initial_density: from_data
        initial_fermi_level: from_data
        use_autograd_forces: true
Use this as a starting point. Adjust losses, learning rates, and SCF settings for the dataset and model stability.
Training Modes
Direct Training
mode: direct
This is direct training in the terminology of the paper. The model is not
iterated to self-consistency. The wrapper uses reference atomic_multipoles
and fermi_level from the data, applies one update, and compares the result to
the reference density and other targets.
This mode is computationally inexpensive and stable, making it useful at the start of training. Its main limitation is the mismatch with inference: a model can fit the direct objective but still be unstable when run as an SCF model.
Unrolled SCF Training
Repo name:
mode: unroll_scf
This mode runs the SCF loop during training and differentiates through the unrolled iterations. The loss is applied to the model’s own SCF result.
The cost grows with num_scf_steps, and long unrolls can be memory-intensive.
The current practical continuation stage is about 10 steps with
mixing_parameter: 0.3.
initial_density: from_data and initial_fermi_level: from_data are useful
when moving from direct training to unrolled SCF. local_guess and zero are
more inference-like tests.
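To make "differentiating through the unrolled iterations" concrete, here is a toy scalar sketch. It is not the repository's SCF loop: the linear update rule, the parameter names theta and k, and the use of simple linear mixing for mixing_parameter are assumptions chosen so the result can be checked by hand.

```python
def unrolled_scf(theta, q0, num_scf_steps=10, mixing_parameter=0.3, k=-0.5):
    """Toy scalar SCF with update(q) = theta + k * q and linear mixing.

    Returns the final q and dq/dtheta accumulated through every iteration,
    which is what backpropagation through the unrolled loop would produce.
    """
    q, dq = q0, 0.0
    for _ in range(num_scf_steps):
        q_update = theta + k * q      # one application of the update rule
        dq_update = 1.0 + k * dq      # chain rule through that application
        # ASSUMPTION: linear mixing of the old and updated values.
        q = (1 - mixing_parameter) * q + mixing_parameter * q_update
        dq = (1 - mixing_parameter) * dq + mixing_parameter * dq_update
    return q, dq

q, dq = unrolled_scf(theta=1.5, q0=0.0)
exact_q = 1.5 / (1 - (-0.5))    # fixed point of q = theta + k * q
exact_dq = 1.0 / (1 - (-0.5))   # its derivative with respect to theta
print(q, exact_q)
print(dq, exact_dq)
```

In this toy the error shrinks by a constant factor per step, so 10 steps are close to converged. The derivative has to be carried through every step, which is why cost and memory grow with num_scf_steps.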
Implicit Differentiation
Repo name:
mode: implicit
Implicit differentiation treats the converged SCF density as the solution of a fixed-point equation, rather than as the output of a fixed number of unrolled iterations. Conceptually, gradients are computed while enforcing the condition that the fixed-point equation remains solved. Please see implicit_diff for details.
This method requires the torchopt package, which can be difficult to install and get working. Please ensure you run the tests (tests/test_fixed_point_wrapper_training_modes.py) after installing torchopt to check you are getting the right numbers.
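The idea behind the implicit gradient can be shown on a toy scalar problem without torchopt. This is a sketch under stated assumptions, not the repository's implementation: the tanh update rule and all function names are hypothetical. At a fixed point q* = F(q*, theta), the implicit function theorem gives dq*/dtheta = (dF/dtheta) / (1 - dF/dq).

```python
import math

def update(q, theta):
    """Toy scalar update rule standing in for the model's fixed-point map."""
    return math.tanh(theta - 0.5 * q)

def solve_fixed_point(theta, q=0.0, mixing=0.3, tol=1e-12, max_steps=1000):
    """Iterate with linear mixing until successive values agree within tol."""
    for _ in range(max_steps):
        q_next = (1 - mixing) * q + mixing * update(q, theta)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    raise RuntimeError("SCF did not converge")

def implicit_gradient(theta):
    """dq*/dtheta from the implicit function theorem at the fixed point."""
    q_star = solve_fixed_point(theta)
    sech2 = 1.0 - math.tanh(theta - 0.5 * q_star) ** 2
    dF_dq = -0.5 * sech2
    dF_dtheta = sech2
    return dF_dtheta / (1.0 - dF_dq)

theta = 0.8
h = 1e-5
finite_difference = (solve_fixed_point(theta + h) - solve_fixed_point(theta - h)) / (2 * h)
print(implicit_gradient(theta), finite_difference)
```

The gradient depends only on the converged solution, not on how many iterations were needed to reach it. In a multi-dimensional model the scalar division becomes a linear solve.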
Linearize and Solve
Repo name:
mode: linearize_solve
The linearize_solve method is equivalent to implicit differentiation, but we find it to be generally better behaved. It iterates to the SCF solution during training; then, at the solution, the update rule is linearized and the output is predicted by solving the resulting linear problem. This is equivalent to linearizing at the solution to create a quadratic QEq-type or linear polarizable force-field-type model, which can then be solved by matrix inversion. Please see implicit_diff for details.
This method does not require torchopt or any other additional dependencies.
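A toy scalar sketch of the linearize-then-solve step follows. It is illustrative only: the tanh update rule and the function names are hypothetical, and a real model would solve a matrix equation where this example uses a scalar division.

```python
import math

def update(q, theta):
    """Toy scalar update rule standing in for the model's fixed-point map."""
    return math.tanh(theta - 0.5 * q)

def d_update_dq(q, theta):
    """Derivative of the toy update rule with respect to q."""
    return -0.5 * (1.0 - math.tanh(theta - 0.5 * q) ** 2)

def iterate_scf(theta, q=0.0, mixing=0.3, num_steps=200):
    """Plain mixed iteration towards the fixed point."""
    for _ in range(num_steps):
        q = (1 - mixing) * q + mixing * update(q, theta)
    return q

def linearize_and_solve(theta, q0):
    """Solve q = F(q0) + J * (q - q0) for q, with J = dF/dq evaluated at q0."""
    J = d_update_dq(q0, theta)
    return (update(q0, theta) - J * q0) / (1.0 - J)

theta = 0.8
q0 = iterate_scf(theta)                  # converged density from the SCF loop
q_lin = linearize_and_solve(theta, q0)   # prediction of the linearized model
print(q0, q_lin)
```

At a converged q0 the linearized solve returns the same value as the SCF loop. The difference lies in how the output depends on the model parameters: it goes through a single linear solve rather than through the iteration history.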
Losses
Common fixed-point losses include:
atomic_multipoles
total_charge_per_atom
dipole_per_atom
fermi_level_per_atom
energy_per_atom
forces
esps, if electrostatic potentials are returned and labelled
field_features, for field-feature supervision workflows
fixedpoint_scf_stability, for stability-oriented training experiments
For direct fixed-point training, atomic_multipoles is usually central. For
SCF stages, energy, forces, charge, dipole, and density losses are applied to
the model’s own SCF result.