Command-line interface#
This section describes the hydra-based command-line interface (CLI) of DeepQMC. The tutorial exemplifies a basic training and evaluation through the command line. For more advanced functionality, such as multiruns or interaction with SLURM, see the hydra docs.
The CLI provides simple access to the functionalities of the deepqmc package. The main tasks comprise train, restart and evaluate, which are thin wrappers around the train function.
Available tasks:
- train: trains the ansatz with variational Monte Carlo.
- evaluate: evaluates observables (e.g. the energy) of an ansatz via Monte Carlo sampling.
- restart: restarts or continues the training from a stored training checkpoint.
The train function creates a directory which contains the logs as well as the hyperparameters of the training (.hydra). For restart and evaluate, the restdir of the former training run has to be provided. Arguments specified when executing the command overwrite the configuration stored in the restdir. This enables changing certain parameters, such as the number of training or evaluation steps, but can result in errors if the requested hyperparameters conflict with the recovered train state.
Basics#
A training can be run via:
$ deepqmc hydra.run.dir=workdir
This creates several files in the working directory, including:
- deepqmc.log - stores the console log of the run
- training/events.out.tfevents.* - Tensorboard event file
- training/result.h5 - HDF5 file with the training trajectory
- training/state-*.pt - checkpoint files with the saved state of the ansatz, optimizer and sampler at particular steps
- training/.hydra - folder containing the hydra config of the run
- training/pyscf_chkpts - folder containing the PySCF checkpoints for pretraining
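The Tensorboard event files can be inspected with a standard Tensorboard instance (assuming Tensorboard is installed in the environment):
$ tensorboard --logdir workdir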
The training can be continued or recovered from a training checkpoint:
$ deepqmc task=restart task.restdir=workdir/training
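Since command-line arguments override the stored configuration, the restart can for example be combined with a changed number of training steps (a sketch using the task.steps option discussed below):
$ deepqmc task=restart task.restdir=workdir/training task.steps=100000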
The evaluation of the energy of a trained wave function ansatz is obtained via:
$ deepqmc task=evaluate task.restdir=workdir/training
This again generates a Tensorboard event file (evaluation/events.out.tfevents.*) and an HDF5 file (evaluation/result.h5) holding the sampled local energies and other observables (see Tutorial/Logging).
Execution on multiple GPUs#
DeepQMC can utilize multiple GPUs of a single host for increased performance. The algorithm is parallelised over the electron position samples, therefore the number of such samples in a batch (electron_batch_size) must be divisible by the number of utilized GPUs. DeepQMC relies on JAX to automatically detect and use all available GPUs, without any configuration from the user. It respects the CUDA_VISIBLE_DEVICES environment variable if it's defined, and only uses the GPUs specified there. A short log message at the beginning of the run informs the user of the number of utilized GPUs.
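For example, to restrict a run to the first two GPUs of a node (note that the electron batch size must remain divisible by the number of GPUs):
$ CUDA_VISIBLE_DEVICES=0,1 deepqmc task.electron_batch_size=2048 hydra.run.dir=workdir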
Hyperparameters#
In the following, the most relevant settings for running experiments with DeepQMC are discussed.
Task#
DeepQMC provides the above-mentioned configurations for the train, evaluate and restart tasks. In order to override default hyperparameters of the experimental setup, such as the electron_batch_size, the number of training steps or the number of pretrain_steps, hydra provides a simple syntax:
$ deepqmc task=train task.electron_batch_size=2048 task.steps=50000 task.pretrain_steps=5000
The working directory for logging and checkpointing is defined through:
$ deepqmc hydra.run.dir=workdir
Note that the working directory of an evaluate or restart task cannot match the value of its restdir option (which specifies the working directory of the previous job that we want to evaluate or restart).
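For example, the evaluation of a previous training run can be directed to a separate working directory:
$ deepqmc task=evaluate task.restdir=workdir/training hydra.run.dir=workdir_evaluation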
Hamiltonian#
DeepQMC aims at solving the molecular Hamiltonian. Molecules can be selected from a range of predefined configurations located in .../deepqmc/src/deepqmc/conf/hamil/mol:
$ deepqmc hamil/mol=LiH
The predefined configurations can be extended with custom molecules. Alternatively, simple molecules can be specified directly on the command line, e.g. the hydrogen molecule:
$ deepqmc hamil.mol.coords=[[0,0,0],[0.742,0,0]] hamil.mol.charges=[1,1] hamil.mol.charge=0 hamil.mol.spin=0 hamil.mol.unit=angstrom
To work with larger molecules, one can create custom YAML files (for examples check the .../deepqmc/src/deepqmc/conf/hamil/mol folder) and load them with:
$ deepqmc hamil/mol=from_file hamil.mol.file=relative/path/to/molecule/file.yaml
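Such a molecule file might look as follows; a minimal sketch that mirrors the keys of the command-line example above (the geometry is again the hydrogen molecule):
coords:
- [0.0, 0.0, 0.0]
- [0.742, 0.0, 0.0]
charges: [1, 1]
charge: 0
spin: 0
unit: angstrom
For the exact format accepted by a given version, consult the example files in the folder mentioned above.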
DeepQMC also implements pseudopotentials, which can be used via:
$ deepqmc hamil.mol.coords=[[0,0,0]] hamil.mol.charges=[21] hamil.mol.charge=0 hamil.mol.spin=1 hamil.mol.unit=angstrom +hamil.ecp_type='ccECP'
Excited States#
DeepQMC implements penalty-based optimisation of electronic excited states. To simulate the two lowest-lying states of a molecule use:
$ deepqmc task.electronic_states=2
When simulating excited states it is recommended to pretrain with respect to orthogonal (excited) states. This is achieved by specifying a suitable CAS (complete active space):
$ deepqmc task.electronic_states=2 task.pretrain_kwargs.scf_kwargs.cas=[2,2]
To target states of a particular spin sector, a spin penalty can be applied:
$ deepqmc task.electronic_states=2 +task.fit_fn.loss_function_factory.spin_penalty=10
Setting the spin penalty penalises high-spin states, i.e. favours singlet (doublet) states over triplet (quartet) states, etc. When simulating states with higher total spin, the spin penalty is combined with setting the magnetic quantum number. For more details on the configuration of excited state calculations see [Szabo24]. Note that when combining CAS pretraining with the spin penalty, the spin has to be fixed in the calculation of the baseline to provide sensible pretraining targets.
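For example, to target the two lowest triplet states, the spin penalty could be combined with a magnetic quantum number of one (a sketch; as in the Hamiltonian examples above, hamil.mol.spin is assumed to specify the difference of up- and down-spin electrons, i.e. twice the magnetic quantum number):
$ deepqmc task.electronic_states=2 +task.fit_fn.loss_function_factory.spin_penalty=10 hamil.mol.spin=2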
Sampling#
Different sampler configurations can be found in .../deepqmc/src/deepqmc/conf/task/sampler_factory. A typical use case would be to pick a sampler from these configurations and, if required, change some arguments from the command line:
$ deepqmc task/sampler_factory=decorr_langevin task.sampler_factory.elec_sampler.samplers.0.length=30
Optimization#
For the optimization, either KFAC or optimizers from optax may be used. While the use of KFAC is highly recommended due to its significantly improved convergence, at times it can be useful to run with other optimizers, such as AdamW:
$ deepqmc task/opt=adamw
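Hyperparameters of the selected optimizer can then be overridden like any other setting; a sketch, assuming the adamw configuration exposes the learning_rate argument of the underlying optax optimizer:
$ deepqmc task/opt=adamw task.opt.learning_rate=1.e-4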
Ansatz#
The hyperparameters of the training and the wave function ansatz are specified through hydra config files. Predefined ansatzes can be found in .../deepqmc/src/deepqmc/conf/ansatz and selected via:
$ deepqmc ansatz=psiformer
The hyperparameters of such a predefined ansatz can also be overwritten at the command line:
$ deepqmc ansatz=psiformer ansatz.omni_factory.gnn_factory.n_interactions=2
For convenience, the configuration of the default ansatz is reproduced here:
_target_: deepqmc.wf.NeuralNetworkWaveFunction
_partial_: true
envelope:
  _target_: deepqmc.wf.env.ExponentialEnvelopes
  _partial_: true
  isotropic: true
  per_shell: false
  per_orbital_exponent: true
  spin_restricted: false
  init_to_ones: true
  softplus_zeta: false
backflow_op:
  _target_: deepqmc.wf.nn_wave_function.BackflowOp
  _partial_: true
  mult_act: '${eval:"lambda x: x"}'
n_determinants: 16
full_determinant: true
cusp_electrons:
  _target_: deepqmc.wf.cusp.ElectronicCuspAsymptotic
  _partial_: true
  same_scale: 0.25
  anti_scale: 0.5
  alpha: 10.0
  trainable_alpha: false
  cusp_function:
    _target_: deepqmc.wf.cusp.DeepQMCCusp
cusp_nuclei: false
backflow_transform: mult
conf_coeff:
  _target_: haiku.Linear
  _partial_: true
  with_bias: false
  w_init:
    _target_: jax.numpy.ones
    _partial_: true
omni_factory:
  _target_: deepqmc.wf.omni.OmniNet
  _partial_: true
  embedding_dim: 128
  jastrow_factory:
    _target_: deepqmc.wf.omni.Jastrow
    _partial_: true
    sum_first: true
    subnet_factory:
      _target_: deepqmc.hkext.MLP
      _partial_: true
      hidden_layers: ['log', 1]
      bias: false
      last_linear: true
      activation: null
      init: default
  backflow_factory:
    _target_: deepqmc.wf.omni.Backflow
    _partial_: true
    subnet_factory:
      _target_: deepqmc.hkext.MLP
      _partial_: true
      hidden_layers: ['log', 1]
      bias: false
      last_linear: true
      activation: null
      init: default
  gnn_factory:
    _target_: deepqmc.gnn.ElectronGNN
    _partial_: true
    n_interactions: 3
    nuclei_embedding: null
    electron_embedding:
      _target_: deepqmc.gnn.electron_gnn.ElectronEmbedding
      _partial_: true
      positional_embeddings:
        ne:
          _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
          features:
            - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
              powers: [1]
            - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
      use_spin: false
      project_to_embedding_dim: false
    two_particle_stream_dim: 32
    self_interaction: false
    edge_features:
      same:
        _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
        features:
          - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
            powers: [1]
          - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
      anti:
        _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
        features:
          - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
            powers: [1]
          - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
    layer_factory:
      _target_: deepqmc.gnn.electron_gnn.ElectronGNNLayer
      _partial_: true
      subnet_factory:
        _target_: deepqmc.hkext.MLP
        _partial_: true
        hidden_layers: ['log', 2]
        bias: true
        last_linear: false
        activation:
          _target_: jax.numpy.tanh
          _partial_: true
        init: default
      subnet_factory_by_lbl:
        g:
          _target_: deepqmc.hkext.MLP
          _partial_: true
          hidden_layers: ['log', 1]
          bias: false
          last_linear: false
          activation:
            _target_: jax.numpy.tanh
            _partial_: true
          init: default
      electron_residual:
        _target_: deepqmc.hkext.ResidualConnection
        normalize: true
      nucleus_residual: null
      two_particle_residual:
        _target_: deepqmc.hkext.ResidualConnection
        normalize: true
      deep_features: shared
      update_rule: concatenate
      update_features:
        - _target_: deepqmc.gnn.update_features.ResidualElectronUpdateFeature
          _partial_: true
        - _target_: deepqmc.gnn.update_features.NodeSumElectronUpdateFeature
          _partial_: true
          node_types: [up, down]
          normalize: true
        - _target_: deepqmc.gnn.update_features.ConvolutionElectronUpdateFeature
          _partial_: true
          edge_types: [same, anti]
          normalize: false
          w_factory:
            _target_: deepqmc.hkext.MLP
            _partial_: true
            hidden_layers: ['log', 2]
            bias: true
            last_linear: false
            activation:
              _target_: jax.numpy.tanh
              _partial_: true
            init: default
          h_factory:
            _target_: deepqmc.hkext.MLP
            _partial_: true
            hidden_layers: ['log', 2]
            bias: true
            last_linear: false
            activation:
              _target_: jax.numpy.tanh
              _partial_: true
            init: default
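Like all other settings, the entries of this configuration can be overridden on the command line; for example, to reduce the number of determinants and switch from full to spin-factorized determinants:
$ deepqmc ansatz.n_determinants=8 ansatz.full_determinant=false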