Command-line interface#

This section describes the Hydra-based command-line interface (CLI) to DeepQMC. The tutorial exemplifies a basic training and evaluation through the command line. For more advanced functionality, such as multiruns or interaction with SLURM, see the Hydra documentation.

The CLI provides simple access to the functionalities of the deepqmc package. The main tasks are train, evaluate and restart, which are thin wrappers around the train function.

Available tasks:

  • train: Trains the ansatz with variational Monte Carlo.

  • evaluate: Evaluates the total energy of an ansatz via Monte Carlo sampling.

  • restart: Restarts/continues the training from a stored training checkpoint.

The train function creates a directory which contains the logs as well as the hyperparameters of the training (.hydra). For restart and evaluate, the restdir of the previous training run has to be provided. Arguments specified when executing the command override the configuration stored in the restdir. This enables changing certain parameters, such as the number of training or evaluation steps, but can result in errors if the requested hyperparameters conflict with the recovered train state.

Example#

A training can be started via:

$ deepqmc hydra.run.dir=workdir

This creates several files in the working directory, including:

  • training/events.out.tfevents.* - Tensorboard event file

  • training/result.h5 - HDF5 file with the training trajectory

  • training/state-*.pt - Checkpoint files with the saved state of the ansatz at particular steps

The training can be continued or recovered from a training checkpoint:

$ deepqmc task=restart task.restdir=workdir/training

The evaluation of the energy of a trained wave function ansatz is obtained via:

$ deepqmc task=evaluate task.restdir=workdir/training

This again generates a Tensorboard event file evaluation/events.out.tfevents.* and an HDF5 file evaluation/result.h5 holding the sampled local energies.

Execution on multiple GPUs#

DeepQMC can utilize multiple GPUs of a single host for increased performance. The algorithm is parallelized over the electron position samples, therefore the number of such samples in a batch (electron_batch_size) must be divisible by the number of utilized GPUs. DeepQMC relies on JAX to automatically detect and use all available GPUs, without any configuration from the user. It respects the CUDA_VISIBLE_DEVICES environment variable if it is defined, and only uses the GPUs specified there. A short log message at the beginning of the run informs the user of the number of utilized GPUs.
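The divisibility constraint can be checked before launching a run. The sketch below is plain Python and not part of the deepqmc API; in practice the number of visible devices would come from JAX's `jax.local_device_count()`:

```python
def per_device_batch(electron_batch_size: int, n_devices: int) -> int:
    """Return the number of samples per GPU; fail early if the batch
    cannot be split evenly across the devices."""
    if electron_batch_size % n_devices != 0:
        raise ValueError(
            f'electron_batch_size={electron_batch_size} is not divisible '
            f'by the {n_devices} visible GPUs'
        )
    return electron_batch_size // n_devices


# e.g. a batch of 2048 samples on 4 GPUs gives 512 samples per device
print(per_device_batch(2048, 4))
```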

Hyperparameters#

In the following the most relevant settings for running experiments with DeepQMC are discussed.

Task#

DeepQMC provides the above-mentioned configurations for the train, evaluate and restart tasks. In order to override default hyperparameters of the experimental setup, such as the electron_batch_size or the number of training steps (steps) and pretraining steps (pretrain_steps), Hydra provides a simple syntax:

$ deepqmc task=train task.electron_batch_size=2048 task.steps=50_000 task.pretrain_steps=1000
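Each such override addresses a nested configuration key with dot notation. The toy function below illustrates the idea on a plain dict; it is a simplified sketch and not Hydra's actual implementation (Hydra additionally parses value types, validates keys, and supports appending with `+`):

```python
def apply_override(cfg: dict, override: str) -> None:
    """Apply a single 'a.b.c=value' override to a nested dict (toy version)."""
    key, _, value = override.partition('=')
    *parents, leaf = key.split('.')
    node = cfg
    for parent in parents:
        node = node.setdefault(parent, {})
    node[leaf] = value  # Hydra would also convert the value to its proper type


cfg = {'task': {'steps': 10_000, 'pretrain_steps': 500}}
apply_override(cfg, 'task.steps=50000')
print(cfg['task']['steps'])  # '50000' (kept as a string in this toy version)
```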

The working directory for logging and checkpointing is defined through:

$ deepqmc hydra.run.dir=workdir

Note that the working directory of an evaluate or restart task cannot match the value of its restdir option (which specifies the working directory of the previous job that is to be evaluated or restarted).

Hamiltonian#

DeepQMC aims at solving the electronic Schrödinger equation of the molecular Hamiltonian. Molecules can be selected from a range of predefined configurations located in .../deepqmc/src/deepqmc/conf/hamil/mol:

$ deepqmc hamil/mol=LiH

The predefined configurations can be extended with custom molecules. Alternatively, molecules can be specified directly on the command line:

$ deepqmc hamil.mol.coords=[[0,0,0],[0.742,0,0]] hamil.mol.charges=[1,1] hamil.mol.charge=0 hamil.mol.spin=0 hamil.mol.unit=angstrom
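With hamil.mol.unit=angstrom the coordinates are interpreted in angstrom rather than in atomic units (Bohr). As a quick sanity check, independent of how deepqmc performs the conversion internally, 1 Å corresponds to roughly 1.8897 a0:

```python
BOHR_PER_ANGSTROM = 1.8897261246  # 1 angstrom in Bohr radii (CODATA)

def to_bohr(coords_angstrom):
    """Convert a list of [x, y, z] coordinates from angstrom to Bohr."""
    return [[x * BOHR_PER_ANGSTROM for x in atom] for atom in coords_angstrom]


# The H2 geometry from the command above: bond length 0.742 angstrom is
# about 1.402 Bohr.
print(to_bohr([[0, 0, 0], [0.742, 0, 0]]))
```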

Furthermore, DeepQMC implements the option to use pseudopotentials, which can be used via:

$ deepqmc hamil.mol.coords=[[0,0,0]] hamil.mol.charges=[21] hamil.mol.charge=0 hamil.mol.spin=1 hamil.mol.unit=angstrom +hamil.pp_type='ccECP'

Sampling#

Different sampler configurations can be found in .../deepqmc/src/deepqmc/conf/task/sampler. A typical use case is to pick a sampler from these configurations and, if required, change some of its arguments from the command line:

$ deepqmc task/sampler=decorr_langevin task.sampler.0.length=30

Optimization#

For the optimization, either KFAC or optimizers from optax may be used. While the use of KFAC is highly recommended due to its significantly improved convergence, at times it can be useful to run with other optimizers, such as AdamW:

$ deepqmc task/opt=adamw
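AdamW differs from plain Adam in that weight decay is applied directly to the parameters instead of being folded into the gradient ("decoupled weight decay"). The following is a plain-Python illustration of a single AdamW step for a scalar parameter, not the optax implementation used by deepqmc:

```python
import math

def adamw_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update for a scalar parameter w with gradient g at step t."""
    m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g    # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)        # bias corrections for the zero init
    v_hat = v / (1 - b2 ** t)
    # Decoupled weight decay: wd * w is added outside the adaptive rescaling.
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + wd * w)
    return w, m, v


w, m, v = 1.0, 0.0, 0.0
w, m, v = adamw_step(w, g=0.5, m=m, v=v, t=1)
```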

Ansatz#

The hyperparameters of the training and the wave function ansatz are specified through Hydra config files. Predefined ansatzes can be found in .../deepqmc/src/deepqmc/conf/ansatz and selected via:

$ deepqmc ansatz=ferminet

The hyperparameters of such a predefined ansatz can also be overridden on the command line:

$ deepqmc ansatz=ferminet ansatz.omni_factory.gnn_factory.n_interactions=2

For convenience the configuration of the default ansatz is reproduced here:

_target_: deepqmc.wf.NeuralNetworkWaveFunction
_partial_: true
envelope:
  _target_: deepqmc.wf.nn_wave_function.env.ExponentialEnvelopes
  _partial_: true
  isotropic: true
  per_shell: false
  per_orbital_exponent: true
  spin_restricted: false
  init_to_ones: true
  softplus_zeta: false
backflow_op:
  _target_: deepqmc.wf.nn_wave_function.nn_wave_function.BackflowOp
  _partial_: true
  mult_act: '${eval:"lambda x: x"}'
n_determinants: 16
full_determinant: true
cusp_electrons:
  _target_: deepqmc.wf.nn_wave_function.cusp.ElectronicCuspAsymptotic
  _partial_: true
  same_scale: 0.25
  anti_scale: 0.5
  alpha: 10.0
  trainable_alpha: false
  cusp_function:
    _target_: deepqmc.wf.nn_wave_function.cusp.DeepQMCCusp
cusp_nuclei: false
backflow_transform: mult
conf_coeff:
  _target_: haiku.Linear
  _partial_: true
  with_bias: false
  w_init:
    _target_: jax.numpy.ones
    _partial_: true
omni_factory:
  _target_: deepqmc.wf.nn_wave_function.omni.OmniNet
  _partial_: true
  embedding_dim: 128
  jastrow_factory:
    _target_: deepqmc.wf.nn_wave_function.omni.Jastrow
    _partial_: true
    sum_first: true
    subnet_factory:
      _target_: deepqmc.hkext.MLP
      _partial_: true
      hidden_layers: ['log', 1]
      bias: false
      last_linear: true
      activation: null
      init: default
  backflow_factory:
    _target_: deepqmc.wf.nn_wave_function.omni.Backflow
    _partial_: true
    subnet_factory:
      _target_: deepqmc.hkext.MLP
      _partial_: true
      hidden_layers: ['log', 1]
      bias: false
      last_linear: true
      activation: null
      init: default
  gnn_factory:
    _target_: deepqmc.gnn.ElectronGNN
    _partial_: true
    n_interactions: 3
    nuclei_embedding: null
    electron_embedding:
      _target_: deepqmc.gnn.electron_gnn.ElectronEmbedding
      _partial_: true
      positional_embeddings:
        ne:
          _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
          features:
          - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
            powers: [1]
          - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
      use_spin: false
      project_to_embedding_dim: false
    two_particle_stream_dim: 32
    self_interaction: false
    edge_features:
      same:
        _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
        features:
        - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
          powers: [1]
        - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
      anti:
        _target_: deepqmc.gnn.edge_features.CombinedEdgeFeature
        features:
        - _target_: deepqmc.gnn.edge_features.DistancePowerEdgeFeature
          powers: [1]
        - _target_: deepqmc.gnn.edge_features.DifferenceEdgeFeature
    layer_factory:
      _target_: deepqmc.gnn.electron_gnn.ElectronGNNLayer
      _partial_: true
      subnet_factory:
        _target_: deepqmc.hkext.MLP
        _partial_: true
        hidden_layers: ['log', 2]
        bias: true
        last_linear: false
        activation:
          _target_: jax.numpy.tanh
          _partial_: true
        init: default
      subnet_factory_by_lbl:
        g:
          _target_: deepqmc.hkext.MLP
          _partial_: true
          hidden_layers: ['log', 1]
          bias: false
          last_linear: false
          activation:
            _target_: jax.numpy.tanh
            _partial_: true
          init: default
      one_particle_residual:
        _target_: deepqmc.hkext.ResidualConnection
        normalize: true
      two_particle_residual:
        _target_: deepqmc.hkext.ResidualConnection
        normalize: true
      deep_features: shared
      update_rule: concatenate
      update_features:
      - _target_: deepqmc.gnn.update_features.ResidualUpdateFeature
        _partial_: true
      - _target_: deepqmc.gnn.update_features.NodeSumUpdateFeature
        _partial_: true
        node_types: [up, down]
        normalize: true
      - _target_: deepqmc.gnn.update_features.ConvolutionUpdateFeature
        _partial_: true
        edge_types: [same, anti]
        normalize: false
        w_factory:
          _target_: deepqmc.hkext.MLP
          _partial_: true
          hidden_layers: ['log', 2]
          bias: true
          last_linear: false
          activation:
            _target_: jax.numpy.tanh
            _partial_: true
          init: default
        h_factory:
          _target_: deepqmc.hkext.MLP
          _partial_: true
          hidden_layers: ['log', 2]
          bias: true
          last_linear: false
          activation:
            _target_: jax.numpy.tanh
            _partial_: true
          init: default