API#

This is the documentation of the API of the deepqmc package.

This implementation uses the JAX library, the documentation for which can be found here:

class deepqmc.molecule.Molecule(*, coords, charges, charge, spin, unit='bohr', data=None)[source]#

Represents a molecule.

The array-like arguments accept anything that can be transformed to jax.Array.

Parameters:
  • coords (jax.Array | list[float]) – nuclear coordinates ((\(N_\text{nuc}\), 3), a.u.) as rows

  • charges (jax.Array | list[int | float]) – atom charges (\(N_\text{nuc}\))

  • charge (int) – total charge of a molecule

  • spin (int) – total spin multiplicity

  • unit (str) – units of the coordinates, either ‘bohr’ or ‘angstrom’

  • data (dict) – additional data stored with the molecule

classmethod from_file(file)[source]#

Create a molecule from a YAML file.

Parameters:

file (str) – path to the YAML file

classmethod from_name(name)[source]#

Create a molecule from a database of named molecules.

The available names are in Molecule.all_names.

Parameters:

name (str) – name of the molecule (one of Molecule.all_names)

class deepqmc.hamil.LaplacianFactory(*args, **kwargs)[source]#

Protocol class for Laplacian factories.

A Laplacian factory takes as input a function and returns a function that computes the laplacian and gradient of the input function

class deepqmc.hamil.MolecularHamiltonian(*, mol, ecp_type=None, ecp_mask=None, elec_std=1.0, laplacian_factory=<function laplacian>)[source]#

Hamiltonian of non-relativistic molecular systems.

The system consists of nuclei with fixed positions and electrons moving around them. The total energy is defined as the sum of the nuclear-nuclear and electron-electron repulsion, the nuclear-electron attraction, and the kinetic energy of the electrons: \(E=V_\text{nuc-nuc} + V_\text{el-el} + V_\text{nuc-el} + E_\text{kin}\).

Parameters:
  • mol (Molecule) – the molecule to consider

  • ecp_type (str) – If set, use the appropriate effective core potential (ECP). The string is passed to pyscf.gto.M() as 'ecp' argument. Supports ECPs that are implemented in the pyscf package, e.g. 'bfd' [Burkatzki et al. 2007] or 'ccECP' [Bennett et al. 2017].

  • ecp_mask (list[bool]) – list of True and False values (\(N_\text{nuc}\)) specifying whether to use an ECP for each nucleus.

  • elec_std (float) – optional, a default value of the scaling factor of the spread of electrons around the nuclei.

  • laplacian_factory (LaplacianFactory) – creates a function that returns a tuple containing the laplacian and gradient of the wave function.

as_pyscf(*, coords=None)[source]#

Return the hamiltonian parameters in format pyscf can parse.

Parameters:

coords (jax.Array) – optional, nuclear coordinates (\(N_\text{nuc}\), 3).

init_sample(rng, R, n, elec_std=None)[source]#

Guess some initial electron positions.

Tries to make an educated guess about plausible initial electron configurations. Places electrons according to normal distributions centered on the nuclei. If the molecule is not neutral, extra electrons are placed on or removed from random nuclei. The resulting configurations are usually very crude, a subsequent, thorough equilibration is needed.

Parameters:
  • rng (KeyArray) – key used for PRNG.

  • R (jax.Array) – nuclear coordinates of a single molecular geometry (\(N_\text{nuc}\), 3)

  • n (int) – the number of initial electron configurations to generate.

Returns:

initial electron and nuclei

configurations

Return type:

PhysicalConfiguration

Haiku#

The JAX implementation uses the haiku library to create, train and evaluate neural network models. Some additional neural network functionality is implemented in the package and documented here.

class deepqmc.hkext.GLU(*args, **kwargs)[source]#

Gated Linear Unit.

Parameters:
  • out_dim (int) – the output dimension.

  • name (str) – optional, the name of the network.

  • bias (bool) – optional, whether to include a bias term.

  • layer_norm_before (bool) – optional, whether to apply layer normalization before the GLU operation.

  • activation (Callable) – default is sigmoid, the activation function.

  • b_init (Callable) – default is zeros, the initialization function for the bias term.

class deepqmc.hkext.Identity(*args, **kwargs)[source]#

Represent the identity operation.

class deepqmc.hkext.MLP(*args, **kwargs)[source]#

Represent a multilayer perceptron.

Parameters:
  • out_dim (int) – the output dimension.

  • name (str) – optional, the name of the network.

  • hidden_layers (tuple) – optional, either (‘log’, \(N_\text{layers}\)), in which case the network will have \(N_\text{layers}\) layers with logarithmically changing widths, or a tuple of ints specifying the width of each layer.

  • bias (bool | str) –

    optional, specifies which layers should have a bias term. Possible values are

    • True: all layers will have a bias term

    • False: no layers will have a bias term

    • 'not_last': all but the last layer will have a bias term

  • last_linear (bool) – optional, if True the activation function is not applied to the activation of the last layer.

  • activation (Callable) – optional, the activation function.

  • init (str | Callable) –

    optional, specifies the initialization of the linear weights. Possible string values are:

    • 'default': the default haiku initialization method is used.

    • 'ferminet': the initialization method of the ferminet

      package is used.

    • 'deeperwin': the initialization method of the deeperwin

      package is used.

class deepqmc.hkext.ResidualConnection(*, normalize)[source]#

Represent a residual connection between pytrees.

The residual connection is only added if inp and update have the same shape.

Parameters:

normalize (-) – if True the sum of inp and update is normalized with sqrt(2).

class deepqmc.hkext.SumPool(out_dim, name=None)[source]#

Represent a global sum pooling operation.

Parameters:
  • out_dim (int) – the output dimension.

  • name (str) – optional, the name of the network.

deepqmc.hkext.ssp(x)[source]#

Compute the shifted softplus activation function.

Computes the elementwise function \(\text{softplus}(x)=\log(1+\text{e}^x)+\log\frac{1}{2}\)

Training and evaluation#

class deepqmc.types.TrainState(sampler, params, opt)[source]#

Represent the current state of the training.

opt#

Alias for field number 2

params#

Alias for field number 1

sampler#

Alias for field number 0

deepqmc.train.train(hamil, ansatz, opt, sampler_factory, steps, seed, electron_batch_size, molecule_batch_size=1, electronic_states=1, mols=None, workdir=None, train_state=None, init_step=0, max_restarts=3, max_eq_steps=1000, eq_allow_early_stopping=True, pretrain_steps=None, pretrain_kwargs=None, chkpt_constructor=None, metric_logger_constructor=None, h5_logger_constructor=None, merge_keys=None, loss_function_factory=None, observable_monitors=None)[source]#

Train or evaluate a JAX wave function model.

It initializes and equilibrates the MCMC sampling of the wave function ansatz, then optimizes or samples it using the variational principle. It optionally saves checkpoints and rewinds the training/evaluation if an error is encountered. If an optimizer is supplied, the Ansatz is optimized, otherwise the Ansatz is only sampled.

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.

  • ansatz (Ansatz) – the wave function Ansatz.

  • opt (kfac_jax or optax optimizers | None) –

    the optimizer. Possible values are:

    • kfac_jax.Optimizer:

      the partially initialized KFAC optimizer is used

    • an optax optimizer instance:

      the supplied optax optimizer is used.

    • None:

      no optimizer is used, e.g. the evaluation of the Ansatz is performed.

  • sampler_factory (Callable) – a function that returns a Sampler instance

  • steps (int) – number of optimization steps.

  • seed (int) – the seed used for PRNG.

  • electron_batch_size (int) – the number of electron samples considered in a batch

  • molecule_batch_size (int) – optional, the number of molecules considered in a batch. Only needed for transferable training.

  • electronic_states (int) – optional, the number of electronic states to consider.

  • mols (list[Molecule]) – optional, a sequence of molecules to consider for transferable training. If None the default molecule from hamil is used.

  • workdir (str) – optional, path, where results should be saved.

  • train_state (TrainState) – optional, training checkpoint to restore training or run evaluation.

  • init_step (int) – optional, initial step index, useful if calculation is restarted from checkpoint saved on disk.

  • max_restarts (int) – optional, the maximum number of times the training is retried before a NaNError is raised.

  • max_eq_steps (int) – optional, maximum number of equilibration steps if not detected earlier.

  • eq_allow_early_stopping (bool) – default True, whether to allow the equilibration to stop early when the equilibration criterion has been met.

  • pretrain_steps (int) – optional, the number of pretraining steps wrt. to the Baseline wave function obtained with pyscf.

  • pretrain_kwargs (dict) – optional, extra arguments for pretraining.

  • chkpt_constructor (Type[CheckpointStore]) – optional, an object that saves training checkpoints to the working directory.

  • metric_logger_constructor (Type[MetricLogger]) – optional, an object that consumes metric logging information. If not specified, the default TensorboardMetricLogger is used to create tensorboard logs.

  • h5_logger_constructor (Type[H5Logger]) – optional, an object that consumes metric logging information. If not specified, the default H5Logger is used to write comprehensive training statistics to an h5file.

  • merge_keys (list[str]) – optional, list of strings for selecting parameters to be shared across electronic states. Matching merge keys with (substrings of) parameter keys.

  • loss_function_factory (LossFunctionFactory) – optional, a callable returning a suitable loss function for the optimization.

  • observable_monitors (list[ObservableMonitor]) – optional, list of observable monitors to be evaluated during training or evaluation.

Sampling#

deepqmc.types.SamplerState#

alias of dict

class deepqmc.sampling.base.ElectronSampler(*args, **kwargs)[source]#

Protocol for ElectronSampler objects.

ElectronSampler objects implement Markov chain samplers for the electron positions. The samplers are assumed to implement a batch of walkers for a single electronic state on a single molecule and may be vmapped to fit the respective context they are used in. Electron samplers can be combined with chain().

init(rng, params, n, R)[source]#

Initializes the sampler state.

Parameters:
  • rng (KeyArray) – an rng key for the initialization of electron positions.

  • params (Params) – the parameters of the wave function that is being sampled.

  • n (int) – the number of walkers to propagate in parallel.

  • R (jax.Array) – the nuclei positions of the molecular configuration.

Returns:

the sampler state holding electron positions and data about the sampler trajectory.

Return type:

SamplerState

sample(rng, state, params, R)[source]#

Propagates the sampler state.

Parameters:
  • rng (KeyArray) – an rng key for the proposal of electron positions.

  • state (SamplerState) – the state of the sampler from the previous step.

  • params (Params) – the parameters of the wave function that is being sampled.

  • R (jax.Array) – the nuclei positions of the molecular configuration.

Returns:

tuple[SamplerState, PhysicalConfiguration, Stats]: the new sampler state, a physical configuration and statistics about the sampling trajectory.

update(state, params, R)[source]#

Updates the sampler state.

The sampler state is updated to account for changes in the wave function due to a parameter update.

Parameters:
  • state (SamplerState) – the state of the sampler before parameter update.

  • params (Params) – the new parameters of the wave function.

  • R (jax.Array) – the nuclei positions of the molecular configuration.

Returns:

the updated sampler state holding electron positions and data about the sampler trajectory.

Return type:

SamplerState

class deepqmc.sampling.DecorrSampler(*, length)[source]#

Insert decorrelating steps into chained samplers.

This sampler cannot be used as the last element of a sampler chain.

Parameters:

length (int) – the samples will be taken in every length MCMC step, that is, length \(-1\) decorrelating steps are inserted.

class deepqmc.sampling.LangevinSampler(hamil, wf, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#

Metropolis adjusted Langevin Monte Carlo sampler.

Derived from MetropolisSampler.

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.

  • wf – the apply method of the haiku transformed ansatz object.

  • tau (float) – optional, the proposal step size scaling factor. Adjusted during every step if target_acceptance is specified.

  • target_acceptance (float) – optional, if specified the proposal step size will be scaled such that the ratio of accepted proposal steps approaches target_acceptance.

  • max_age (int) – optional, if specified the next proposed step will always be accepted for a walker that hasn’t moved in the last max_age steps.

class deepqmc.sampling.MetropolisSampler(hamil, wf, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#

Metropolis–Hastings Monte Carlo sampler.

The sample() method of this class returns electron coordinate samples from the distribution defined by the square of the sampled wave function.

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.

  • wf – the apply method of the haiku transformed ansatz object.

  • tau (float) – optional, the proposal step size scaling factor. Adjusted during every step if target_acceptance is specified.

  • target_acceptance (float) – optional, if specified the proposal step size will be scaled such that the ratio of accepted proposal steps approaches target_acceptance.

  • max_age (int) – optional, if specified the next proposed step will always be accepted for a walker that hasn’t moved in the last max_age steps.

class deepqmc.sampling.MoleculeIdxSampler(rng, n_mols, batch_size, shuffle=False)[source]#

Sample molecule indexes for transferable training.

class deepqmc.sampling.MultiNuclearGeometrySampler(elec_sampler, nuc_sampler, warp_elec_fn, update_nuc_period, elec_equilibration_steps)[source]#

This sampler samples from multiple nuclear geometries in parallel.

Parameters:
  • elec_sampler (ElectronSampler) – the electronic sampler to use

  • nuc_sampler (NucleiSampler) – the nuclei sampler to use.

  • warp_elec_fn (ElectronWarp) – the function that warps the electrons to the new nuclear geometry.

  • update_nuc_period (int) – optional, the number of steps between nuclear updates.

  • elec_equilibration_steps (int) – optional, the number of steps to equilibrate the electronic state after a nuclear update.

class deepqmc.sampling.ResampledSampler(*, period=None, threshold=None)[source]#

Add resampling to chained samplers.

This sampler cannot be used as the last element of a sampler chain. The resampling is performed by accumulating weights on each MCMC walker in each step. Based on a fixed resampling period period and/or a threshold threshold on the normalized effective sample size the walker positions are sampled according to the multinomial distribution defined by these weights, and the weights are reset to one. Either period or threshold have to be specified.

Parameters:
  • period (int) – optional, if specified the walkers are resampled every period MCMC steps.

  • threshold (float) – optional, if specified the walkers are resampled if the effective sample size normalized with the batch size is below threshold.

deepqmc.sampling.chain(*samplers)[source]#

Combine multiple sampler types, to create advanced sampling schemes.

For example chain(DecorrSampler(10),MetropolisSampler(hamil, tau=1.)) will create a MetropolisSampler, where the samples are taken from every 10th MCMC step. The last element of the sampler chain has to be either a MetropolisSampler or a LangevinSampler.

Parameters:

samplers (ElectronSampler) – one or more sampler instances to combine.

Returns:

the combined sampler.

Return type:

ElectronSampler

deepqmc.sampling.combine_samplers(samplers, hamil, wf)[source]#

Combine samplers to create more advanced sampling schemes.

Parameters:
  • samplers (list[ElectronSampler]) – one or more sampler instances to combine.

  • hamil (MolecularHamiltonian) – the molecular Hamiltonian.

  • wf (ParametrizedWaveFunction) – the wave function to sample.

Optimizers#

class deepqmc.optimizer.Optimizer(loss_and_grad_fn, merge_keys=None)[source]#

Protocol for Optimizer objects.

init(rng, params, batch)[source]#

Initialize the optimizer state.

Parameters:
  • rng (KeyArray) – the RNG key used to initialize random components the of optimizer state.

  • params (Params) – the parameters of the wave function ansatz/ansatzes to be optimized during training.

  • batch (Batch) – a tuple containing a physical configuration, a set of sample weights and auxiliary data.

Returns:

the initial state of the optimizer

Return type:

OptState

step(rng, params, opt_state, batch)[source]#

Perform an optimization step.

Parameters:
  • rng (KeyArray) – the RNG key for the optimizer update.

  • params (Params) – the current parameters of the wave function ansatz/ansatzes.

  • opt_state (OptState) – the current state of the optimizer

  • batch (Batch) – a tuple containing a physical configuration, a set of sample weights and auxiliary data.

Returns:

tuple[~deepqmc.types.Params, ~deepqmc.types.OptState, ~deepqmc.types.Energy, jax.Array | None, ~deepqmc.types.Stats]: the new model parameters, an updated optimizer state, the energies obtained during the evaluation of the loss function, if applicable the wave function ratios obtained during the evaluation of the loss dunction and further statistics.

Wave functions#

class deepqmc.types.Psi(sign, log)[source]#

Represent wave function values.

The sign and log of the absolute value of the wave function are stored.

log#

Alias for field number 1

sign#

Alias for field number 0

deepqmc.types.WaveFunction#

alias of Callable[[pytree_dataclass], Psi]

deepqmc.types.ParametrizedWaveFunction#

alias of Callable[[MutableMapping, pytree_dataclass], Psi]

class deepqmc.types.Ansatz(*args, **kwargs)[source]#

Protocol for ansatz objects.

Ansatz objects represent a parametrized wave function Ansatz. New types of Ansatzes should implement this protocol to be compatible with the DeepQMC software suite. It is assumed that Ansatzes take as input a PhysicalConfiguration for a single sample of electron and nuclei configuration. To handle batches of samples, e.g. during training, the Ansatz is vmap-ed automatically by DeepQMC.

apply(params, phys_conf, return_mos=False)[source]#

Evaluate the Ansatz.

Parameters:
  • params (Params) – the current parameters with which to evaluate the Ansatz.

  • phys_conf (PhysicalConfiguration) – a single sample on which to evaluate the Ansatz.

  • return_mos (bool) – whether to return the many-body orbitals instead of the wave function.

Returns:

the value of the wave function.

Return type:

Psi

init(rng, phys_conf)[source]#

Initialize the parameters of the Ansatz.

Parameters:
  • rng (KeyArray) – the RNG key used to generate the initial parameters.

  • phys_conf (PhysicalConfiguration) – a dummy input to the network of a single electron and nuclei configuration. The value of this can be anything, only its shape information is read.

Returns:

the initial parameters of the Ansatz.

Return type:

Params

class deepqmc.wf.NeuralNetworkWaveFunction(hamil, *, omni_factory, envelope, backflow_op, n_determinants, full_determinant, cusp_electrons, cusp_nuclei, backflow_transform, conf_coeff)[source]#

Implements the neural network wave function.

Configuration files to obtain the PauliNet [HermannNC20], FermiNet [PfauPRR20], DeepErwin [Gerard22] and PsiFormer [Glehn22] architectures are provided. For a detailed description of the implemented architecture see [Schaetzle23].

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the system.

  • omni_factory (Callable) – creates the omni net.

  • envelope (ExponentialEnvelopes) – the orbital envelopes.

  • backflow_op (Callable) – specifies how the backflow is applied to the orbitals.

  • n_determinants (int) – specifies the number of determinants

  • full_determinant (bool) – if False, the determinants are factorized into spin-up and spin-down parts.

  • cusp_electrons (Callable) – constructor of the electronic cusp module.

  • cusp_nuclei (Callable) – constructor of the nuclear cusp module.

  • backflow_transform (str) –

    describes the backflow transformation. Possible values:

    • 'mult': the backflow is a multiplicative factor

    • 'add': the backflow is an additive term

    • 'both': the backflow consist of a multiplicative factor

      and an additive term

  • conf_coeff (Callable) – returns a function that combines the determinants to obtain the WF value

Graph neural networks#

A graph neural network is the most important component of the neural network wave function ansatz. This module implements a general gnn framework, that can be configured to obtain a variety of different ansatzes.

Graphs#

This submodule implements the basic functionality for working with graphs.

deepqmc.gnn.graph.GraphEdgeBuilder(mask_self)[source]#

Create a function that builds graph edges.

Parameters:

mask_self (bool) – whether to mask edges that begin and end in the same node.

deepqmc.gnn.graph.GraphUpdate(aggregate_edges_for_nodes_fn, update_nodes_fn=None, update_edges_fn=None)[source]#

Create a function that updates a graph.

The update function is tailored to be used in GNNs.

Parameters:
  • aggregate_edges_for_nodes_fn (bool) – whether to perform the aggregation of edges for nodes.

  • update_nodes_fn (Callable) – optional, function that updates the nodes.

  • update_edges_fn (Callable) – optional, function that updates the edges.

deepqmc.gnn.graph.MolecularGraphEdgeBuilder(n_nuc, n_up, n_down, edge_types, *, self_interaction)[source]#

Create a function that builds many types of molecular edges.

Parameters:
  • n_nuc (int) – number of nuclei.

  • n_up (int) – number of spin-up electrons.

  • n_down (int) – number of spin-down electrons.

  • edge_types (list[str]) –

    list of edge type names to build. Possible names are:

    • 'nn': nuclei->nuclei edges

    • 'ne': nuclei->electrons edges

    • 'en': electrons->nuclei edges

    • 'same': edges between same-spin electrons

    • 'anti': edges between opposite-spin electrons

    • 'up': edges going from spin-up electrons to all electrons

    • 'down': edges going from spin-down electrons to all electrons

  • self_interaction (bool) – whether edges between a particle and itself are considered

class deepqmc.gnn.edge_features.CombinedEdgeFeature(*, features)[source]#

Combine multiple edge features.

Parameters:

features (list[EdgeFeature]) – a list of edge feature objects to combine.

class deepqmc.gnn.edge_features.DifferenceEdgeFeature(*, log_rescale=False)[source]#

Return the difference vector as the edge features.

Parameters:

log_rescale (bool) – whether to rescale the features by \(\log(1 + d) / d\) where \(d\) is the length of the edge.

class deepqmc.gnn.edge_features.DistancePowerEdgeFeature(*, powers, eps=None, log_rescale=False)[source]#

Return powers of the distance as edge features.

Parameters:
  • powers (list[float]) – a list of powers to apply to the edge length.

  • eps (float | None) – a small value to add to the denominator when the power is negative.

  • log_rescale (bool) – whether to rescale the features by \(\log(1 + d) / d\) where \(d\) is the length of the edge.

class deepqmc.gnn.edge_features.EdgeFeature(*args, **kwargs)[source]#

Base class for all edge features.

class deepqmc.gnn.edge_features.GaussianEdgeFeature(*, n_gaussian, radius, offset)[source]#

Expand the distance in a Gaussian basis.

Parameters:
  • n_gaussian (int) – the number of gaussians to use, consequently the length of the feature vector

  • radius (float) – the radius within which to place gaussians

  • offset (bool) – whether to offset the position of the first Gaussian from zero.

Electron GNN#

This submodule provides the ElectronGNN architecture for defining neural network parametrized functions acting on graphs of electrons and nuclei.

class deepqmc.gnn.electron_gnn.ElectronEmbedding(n_nuc, n_up, n_down, embedding_dim, n_elec_types, elec_types, *, positional_embeddings, use_spin, project_to_embedding_dim)[source]#

Create initial embeddings for electrons.

Parameters:
  • n_nuc (int) – the number of nuclei.

  • n_up (int) – the number of spin up electrons.

  • n_down (int) – the number of spin down electrons.

  • embedding_dim (int) – the desired length of the embedding vectors.

  • n_elec_types (int) –

    the number of electron types to differentiate. Usual values are:

    • 1: treat all electrons as indistinguishable. Note that electrons

      with different spins can still become distinguishable during the later embedding update steps of the GNN.

    • 2: treat spin up and spin down electrons as distinguishable already

      in the initial embeddings.

  • elec_types (jax.Array) – an integer array with length equal to the number of electrons, with entries between 0 and n_elec_types. Specifies the type for each electron.

  • positional_embeddings (dict) – optional, if not None, a dict with edge types as keys, and edge features as values. Specifies the edge types and edge features to use when constructing the positional initial electron embeddings.

  • use_spin (bool) – only relevant if positional_embeddings is not False, if True, concatenate the spin of the given electron after the positional embedding features.

  • project_to_embedding_dim (bool) – only relevant if positional_embeddings is not False, if True, use a linear layer to project the initial embeddings to have length embedding_dim.

class deepqmc.gnn.electron_gnn.ElectronGNN(hamil, embedding_dim, *, n_interactions, edge_features, self_interaction, two_particle_stream_dim, nuclei_embedding, electron_embedding, layer_factory, ghost_coords=None)[source]#

A neural network acting on graphs defined by electrons and nuclei.

Derived from GraphNeuralNetwork.

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the system on which the graph is defined.

  • embedding_dim (int) – the length of the electron embedding vectors.

  • n_interactions (int) – number of message passing interactions.

  • edge_features (dict) –

    a dict of functions for each edge type, embedding the interparticle differences. Valid keys are:

    • 'ne': for nucleus-electron edges

    • 'nn': for nucleus-nucleus edges

    • 'same': for same spin electron-electron edges

    • 'anti': for opposite spin electron-electron edges

    • 'up': for edges going from spin up electrons to all electrons

    • 'down': for edges going from spin down electrons to all electrons

  • self_interaction (bool) – whether to consider edges where the sender and receiver electrons are the same.

  • two_particle_stream_dim (int) – the feature dimension of the two particle streams. Only active if deep_features are used.

  • nuclei_embedding (Type[NucleiEmbedding]) – optional, the instance responsible for creating the initial nuclear embeddings. Set to None if nuclear embeddings are not needed.

  • electron_embedding (Type[ElectronEmbedding]) – the instance that creates the initial electron embeddings.

  • layer_factory (Type[ElectronGNNLayer]) – a callable that generates a layer of the GNN.

  • ghost_coords (jax.Array) – optional, specifies the coordinates of one or more ghost atoms, useful for breaking spatial symmetries of the nuclear geometry.

edge_factory(phys_conf)[source]#

Compute all the graph edges used in the GNN.

class deepqmc.gnn.electron_gnn.ElectronGNNLayer(n_interactions, ilayer, n_nuc, n_up, n_down, embedding_dim, edge_types, self_interaction, node_data, two_particle_stream_dim, *, electron_residual, nucleus_residual, two_particle_residual, deep_features, update_features, update_rule, subnet_factory=None, subnet_factory_by_lbl=None)[source]#

The message passing layer of ElectronGNN.

Derived from MessagePassingLayer.

Parameters:
  • n_interactions (int) – the number of message passing interactions.

  • ilayer (int) – the index of this layer (0 <= ilayer < n_interactions).

  • n_nuc (int) – the number of nuclei.

  • n_up (int) – the number of spin up electrons.

  • n_down (int) – the number of spin down electrons.

  • embedding_dim (int) – the length of the electron embedding vectors.

  • edge_types (Tuple[str]) – the types of edges to consider.

  • self_interaction (bool) – whether to consider edges where the sender and receiver electrons are the same.

  • node_data (Dict[str, Any]) – a dictionary containing information about the nodes of the graph.

  • two_particle_stream_dim (int) – the feature dimension of the two particle streams.

  • electron_residual – whether a residual connection is used when updating the electron embeddings, either False, or an instance of ResidualConnection.

  • nucleus_residual – whether a residual connection is used when updating the nucleus embeddings, either False, or an instance of ResidualConnection.

  • two_particle_residual – whether a residual connection is used when updating the two particle embeddings, either False, or an instance of ResidualConnection.

  • deep_features – if False, the edge features are not updated throughout the GNN layers, if shared than in each layer a single MLP (u) is used to update all edge types, if separate then in each layer separate MLPs are used to update the different edge types.

  • update_features (list[UpdateFeature]) – a list of partially initialized update feature classes to use when computing the update features of the one particle embeddings. For more details see the documentation of deepqmc.gnn.update_features.

  • update_rule (str) –

    how to combine the update features for the update of the one particle embeddings. Possible values:

    • 'concatenate': run concatenated features through MLP

    • 'featurewise': apply different MLP to each feature channel and sum

    • 'featurewise_shared': apply the same MLP across feature channels

    • 'sum': sum features before sending through an MLP

    note that 'sum' and 'featurewise_shared' imply features of same size.

  • subnet_factory (Callable) – optional, a function that constructs the subnetworks of the GNN layer.

  • subnet_factory_by_lbl (dict) – optional, a dictionary of functions that construct subnetworks of the GNN layer. If both this and subnet_factory is specified, the specified values of subnet_factory_by_lbl will take precedence. If some keys are missing, the default value of subnet_factory will be used in their place. Possible keys are: (w, h, g or u).

class deepqmc.gnn.electron_gnn.NucleiEmbedding(n_up, n_down, charges, n_atom_types, *, embedding_dim, atom_type_embedding, subnet_type, edge_features)[source]#

Create initial embeddings for nuclei.

Parameters:
  • n_up (int) – the number of spin up electrons.

  • n_down (int) – the number of spin down electrons.

  • charges (jax.Array) – the nuclear charges of the molecule.

  • n_atom_types (int) – the number of different atom types in the molecule.

  • embedding_dim (int) – the length of the output embedding vector

  • atom_type_embedding (bool) – if True, initial embeddings are the same for atoms of the same type (nuclear charge), otherwise they are different for all nuclei.

  • subnet_type (str) – the type of subnetwork to use for the embedding generation: - 'mlp': an MLP is used - 'embed': a haiku.Embed block is used

  • edge_features (EdgeFeature) – optional, the edge features to use when constructing the initial nuclear embeddings.

Update Features#

This submodule implements some common ways to compute update features for the node embeddings from the current node and edge embeddings. Instances of the below classes are callable, they take as input the current node and edge representations, and output a list of update features to be used for updating the node representations.

class deepqmc.gnn.update_features.ConvolutionElectronUpdateFeature(*args, edge_types, normalize, w_factory, h_factory, w_for_ne=True)[source]#

The convolution of node and edge embeddings as an update feature.

Returns the convolution of the node and edge embeddings for various edge types as separate update features.

Parameters:
  • n_up (int) – number of spin up electrons

  • n_down (int) – number of spin down electrons

  • two_particle_stream_dim (int) – dimension of the two-particle stream

  • node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.

  • edge_types (list[str]) – list of edge types to sum over

  • normalize (bool) – whether to normalize the sum by the number of senders

  • w_factory (Callable) – factory function for the \(w\) matrix

  • h_factory (Callable) – factory function for the \(h\) matrix

  • w_for_ne (bool) – whether to use the \(w\) matrix for the \(ne\) edge type

class deepqmc.gnn.update_features.EdgeSumElectronUpdateFeature(*args, edge_types, normalize)[source]#

The (normalized) sum of the edge embeddings as an update feature.

Returns the (normalized) sum of the edge embeddings for various edge types as separate update features.

Parameters:
  • n_up (int) – number of spin up electrons

  • n_down (int) – number of spin down electrons

  • two_particle_stream_dim (int) – dimension of the two-particle stream

  • node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.

  • edge_types (list[str]) – list of edge types to sum over

  • normalize (bool) – whether to normalize the sum by the number of senders

class deepqmc.gnn.update_features.NodeAttentionElectronUpdateFeature(*args, num_heads, mlp_factory, attention_residual, mlp_residual)[source]#

Create a single update feature by attenting over the nodes.

Returns the Psiformer update feature based on attention over the nodes.

Parameters:
  • n_up (int) – number of spin up electrons

  • n_down (int) – number of spin down electrons

  • two_particle_stream_dim (int) – dimension of the two-particle stream

  • node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.

  • num_heads (int) – number of attention heads

  • mlp_factory (Type[MLP]) – factory function for the MLP

  • attention_residual (Optional[Residual]) – optional residual connection after the attention layer

  • mlp_residual (Optional[Residual]) – optional residual connection after the MLP layer

class deepqmc.gnn.update_features.NodeSumElectronUpdateFeature(*args, node_types, normalize)[source]#

The (normalized) sum of the node embeddings as an update feature.

Returns the (normalized) sum of the electron embeddings from the previous layer as a single update feature.

Parameters:
  • n_up (int) – number of spin up electrons

  • n_down (int) – number of spin down electrons

  • two_particle_stream_dim (int) – dimension of the two-particle stream

  • node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.

  • node_types (list[str]) – list of node types to update

  • normalize (bool) – whether to normalize the sum by the number of nodes

class deepqmc.gnn.update_features.ResidualElectronUpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#

Residual update feature.

Returns the unchanged electron embeddings from the previous layer as a single update feature.

class deepqmc.gnn.update_features.UpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#

Base class for all update features.

Parameters:
  • n_up (int) – number of spin up electrons

  • n_down (int) – number of spin down electrons

  • two_particle_stream_dim (int) – dimension of the two-particle stream

  • node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.

class deepqmc.gnn.utils.NodeEdgeMapping(edges, node_data=None)[source]#

A utility mapping between the various node and edge types.

For example it is often useful determine the sender/receiver node type of a given edge, or generate all the edge types with a given sender/receiver node.

Parameters:
  • edges (Sequence[str]) – all the edge types present in the graph

  • node_data (dict) – optional, data to store for the node types

Excited electronic states#

This section documents the API used to treat electronic excited states.

class deepqmc.sampling.combined_samplers.MultiElectronicStateSampler(sampler, n_state)[source]#

Sample from multiple electronic states in parallel.

This sampler applies vmap to an underlying ElectronSampler to sample from multiple electronic states in parallel.

Parameters:
  • sampler (ElectronSampler) – the electron sampler to use.

  • n_state (int) – the number of electronic states to sample from.

class deepqmc.loss.overlap.OverlapGradientScaleFactory(*args, **kwargs)[source]#

Callable that computes the scaling factor of the overlap gradient.

deepqmc.loss.overlap.compute_mean_overlap(psi_ratio, weight)[source]#

Compute an estimate of the overlap matrix from WF ratios.

Parameters:
  • psi_ratio (jax.Array) – the WF ratios \(\text{ratio}[i,\,j,\,:]=\frac{\Psi_i(r\sim\Psi^2_j)}{\Psi_j(r\sim\Psi^2_j)}\), shape: [n_wfs, n_wfs, elec_batch_size].

  • weight (Weight) – the sample weights, shape [n_wfs, elec_batch_size].

Returns:

tuple of the symmetric overlap matrix estimate, shaped [mol_batch_size, n_wfs, n_wfs], and overlap statistics.

Return type:

tuple[jax.Array, Stats]

deepqmc.loss.overlap.compute_mean_overlap_tangent(psi_ratio, weight, log_psi_tangent, ratio_gradient_mask, overlap, scale_factory, data)[source]#

Compute the tangent of the overlap matrix with respect to the Ansatz parameters.

Parameters:
  • psi_ratio (jax.Array) – the ratio of WF values.

  • weight (Weight) – the weight of each sample.

  • log_psi_tangent (jax.Array) – the jvp of the WF values with respect to the parameters of the Ansatz.

  • ratio_gradient_mask (jax.Array) – a samplewise boolean mask to apply to the gradients.

  • overlap (jax.Array) – the overlap matrix estimate.

  • scale_factory (OverlapGradientScaleFactory) – function that computes the scaling factor of the overlap gradient.

  • data (DataDict) – input data passed to the scale_factory function.

Returns:

the jvp of the sum of the upper triangle of the overlap matrix with

respect to the Ansatz parameters.

Return type:

jax.Array

deepqmc.loss.overlap.compute_psi_ratio(ansatz, params, phys_conf)[source]#

Compute the ratio of all wave function for a batch of samples.

Parameters:
  • ansatz (Ansatz) – the ansatz object.

  • params (Params) – the current parameters of the Ansatz.

  • phys_conf (PhysicalConfiguration) – the input to the Ansatz, shape: [mol_batch_size, electronic_states, electron_batch_size, ...].

Returns:

the tuple of the WF ratios and overlap

statistics.

Return type:

tuple[jax.Array, Stats]

deepqmc.loss.overlap.compute_single_sample_psi_ratios(psi, mean_log_psi)[source]#

Compute all possible ratios between the WFs for a single sample.

The mean of each WF value is subtracted before computing exponentials to avoid over/underflow.

Parameters:
  • psi (Psi) – all WF values for a single molecule and electron sample, shape: [electronic_states, electronic_states]. mean_log_psi (jax.Array): mean log magnitude of the WFs, [electronic_states]

  • mean_log_psi (jax.Array) – the mean log psi values for each state, shape [electronic_states]

Returns:

the WF ratios

\(R[i,\,j]=\frac{\Psi_i(r\sim\Psi^2_j)}{\Psi_j(r\sim\Psi^2_j)}\), shape [electronic_states, electronic_states].

Return type:

jax.Array

deepqmc.loss.overlap.compute_wave_function_values(ansatz, params, phys_conf)[source]#

Compute the value of all WFs at samples drawn from all WFs.

Parameters:
  • params (dict) – PyTree of WF parameters, with leading axis over the different WFs. Shape: [n_wfs, ...].

  • phys_conf (PhysicalConfiguration) – input physical configuration samples, with leading axis over the different WFs the samples were drawn from. Shape: [n_wfs, elec_batch_size, ...]

Returns:

the WF values

\(\Psi[i, \, j, \, :] = \Psi_i({\bf r} \sim \Psi^2_j)\), shape: [n_wfs, n_wfs, elec_batch_size].

Return type:

Psi

deepqmc.loss.overlap.no_scaling(data)[source]#

Return unit scaling factor.

deepqmc.loss.overlap.scale_by_energy_gap(data, min_gap_scale_factor=0.1)[source]#

Scale the overlap gradient by the energy gap between the two states.

deepqmc.loss.overlap.scale_by_energy_std(data, min_gap_scale_factor=0.01)[source]#

Scale the overlap gradient by the std. dev. of the states’ energies.

deepqmc.loss.overlap.scale_by_max_gap_std(data, min_gap_scale_factor=0.1)[source]#

Scale the overlap gradient by the max of the energy gap and std. dev.

deepqmc.loss.overlap.symmetrize_overlap_with_clipped_geometric_mean(x)[source]#

Symmetrize the overlap using the clipped geometric mean of it and its transpose.

Useful for computing an estimate of the overlap matrix using Monte Carlo samples from all WFs. Given input \(x_{ij}\) this function computes:

\[y_{ij}=\text{sign}(x_{ij}) \sqrt{\text{clamp}(0, \, x_{ij} \cdot x_{ji}, \, 1)} \]

Note that if the signs of \(x_{ij}\) and \(x_{ji}\) differ, the clamping makes sure that \(y_{ij}\) will be zero. Otherwise the two signs will agree, and we can use either one of them to compute the sign of \(y_{ij}\).

Parameters:

x – the non-symmetric overlap values: \(x_{ij}=\frac1N\sum_{{\bf r}_j\sim\Psi^2_j}\frac{\Psi_i({\bf r}_j)} {\Psi_j({\bf r}_j)}\)

Custom data types and type aliases#

In order to facilitate the use of the API and enable type checking DeepQMC implements a range of custom types and type aliases.

A combination of electron and nuclei positions is assigned the PhysicalConfiguration type.

class deepqmc.types.PhysicalConfiguration#

Represent physical configurations of electrons and nuclei.

It currently contains the nuclear and electronic coordinates, along with mol_idx, which specifies which nuclear configuration a given sample was obtained from.

Wave function parameters are denoted with the Params type.

deepqmc.types.Params#

alias of MutableMapping

Auxiliary data need for the evaluation of the loss comes as a DataDict type.

deepqmc.types.DataDict#

alias of dict

Statistics obtained during training or evaluation use the Stats type.

deepqmc.types.Stats#

alias of dict

Data for the evaluation of the training loss is bundeld as a Batch type.

deepqmc.types.Batch#

alias of tuple[pytree_dataclass, Array, dict | None]