API#

This page documents the API of the JAX implementation of the deepqmc package.

The implementation builds on the JAX library; see the JAX documentation for details on JAX itself.

class deepqmc.Molecule(*, coords, charges, charge, spin, unit='bohr', data=None)[source]#

Represents a molecule.

The array-like arguments accept anything that can be transformed to jax.numpy.DeviceArray.

Parameters:
  • coords (float, (\(N_\text{nuc}\), 3), a.u.) – nuclear coordinates as rows

  • charges (int, (\(N_\text{nuc}\))) – atom charges

  • charge (int) – total charge of a molecule

  • spin (int) – total spin multiplicity

classmethod from_name(name, **kwargs)[source]#

Create a molecule from a database of named molecules.

The available names are in Molecule.all_names.
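
A Molecule can be created either from the database of named molecules or constructed explicitly. A minimal sketch follows; it assumes that 'LiH' is among Molecule.all_names, and the explicit coordinates and charges are only illustrative.

    import jax.numpy as jnp
    from deepqmc import Molecule

    # from the database of named molecules
    mol = Molecule.from_name('LiH')

    # or constructed explicitly (coordinates in bohr, values only illustrative)
    mol = Molecule(
        coords=jnp.array([[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]),
        charges=jnp.array([3, 1]),
        charge=0,
        spin=0,
    )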

class deepqmc.hamil.Hamiltonian[source]#

Base class for all Hamiltonian operators.

local_energy(wf)[source]#

Return a function that calculates the local energy of the wave function.

Parameters:

wf (WaveFunction) – the wave function ansatz.

Returns:

a function that evaluates the local energy of wf at r.

Return type:

Callable[r, ...]

class deepqmc.hamil.MolecularHamiltonian(*, mol, pp_type=None, pp_mask=None, elec_std=1.0)[source]#

Hamiltonian of non-relativistic molecular systems.

The system consists of nuclei with fixed positions and electrons moving around them. The total energy is defined as the sum of the nuclear-nuclear and electron-electron repulsion, the nuclear-electron attraction, and the kinetic energy of the electrons: \(E=V_\text{nuc-nuc} + V_\text{el-el} + V_\text{nuc-el} + E_\text{kin}\).

Parameters:
  • mol (Molecule) – the molecule to consider

  • pp_type (str) – If set, use the appropriate pseudopotential. The string is passed to pyscf.gto.M() as 'ecp' argument. Supports pseudopotentials that are implemented in the pyscf package, e.g. 'bfd' [Burkatzki et al. 2007] or 'ccECP' [Bennett et al. 2017].

  • pp_mask (list, (\(N_\text{nuc}\))) – list of True and False values specifying whether to use a pseudopotential for each nucleus

  • elec_std (float) – optional, a default value of the scaling factor of the spread of electrons around the nuclei.
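
For instance, a Hamiltonian for an all-electron calculation can be constructed from a Molecule as in the following sketch (pseudopotential options are omitted here):

    from deepqmc import Molecule
    from deepqmc.hamil import MolecularHamiltonian

    mol = Molecule.from_name('LiH')
    hamil = MolecularHamiltonian(mol=mol)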

as_pyscf(coords)[source]#

Return nuclear charges and coordinates in a format pyscf can parse.

Parameters:

coords (jax.Array) – nuclear coordinates, shape [n_nuc, 3].

init_sample(rng, R, n, elec_std=None)[source]#

Guess some initial electron positions.

Tries to make an educated guess about plausible initial electron configurations. Places electrons according to normal distributions centered on the nuclei. If the molecule is not neutral, extra electrons are placed on or removed from random nuclei. The resulting configurations are usually very crude; a subsequent, thorough equilibration is needed.

Parameters:
  • rng (jax.random.PRNGKey) – key used for PRNG.

  • R (float, (\(N_\text{nuc}\), 3)) – nuclear coordinates of a single molecular geometry

  • n (int) – the number of configurations to generate.

  • elec_std (float) – optional, overrides the default scaling factor of the spread of electrons around the nuclei.
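
For example, a batch of initial electron configurations could be drawn as in this sketch, assuming hamil and mol are constructed as above and the nuclear coordinates are accessible as mol.coords:

    import jax

    rng = jax.random.PRNGKey(0)
    r_init = hamil.init_sample(rng, mol.coords, 512)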

Haiku#

The JAX implementation uses the haiku library to create, train and evaluate neural network models. Some additional neural network functionality is implemented in the package and documented here.

class deepqmc.hkext.Identity(*args, **kwargs)[source]#

Represent the identity operation.

class deepqmc.hkext.MLP(*args: Any, **kwargs: Any)[source]#

Represent a multilayer perceptron.

Parameters:
  • in_dim (int) – the input dimension.

  • out_dim (int) – the output dimension.

  • residual (bool) – whether to include a residual connection

  • name (str) – optional, the name of the network.

  • hidden_layers (tuple) – optional, either ('log', \(N_\text{layers}\)), in which case the network will have \(N_\text{layers}\) layers with logarithmically changing widths, or a tuple of ints specifying the width of each layer.

  • bias (str) –

    optional, specifies which layers should have a bias term. Possible values are

    • True: all layers will have a bias term

    • False: no layers will have a bias term

    • 'not_last': all but the last layer will have a bias term

  • last_linear (bool) – optional, if True the activation function is not applied to the output of the last layer.

  • activation (Callable) – optional, the activation function.

  • w_init (str or Callable) –

    optional, specifies the initialization of the linear weights. Possible string values are:

    • 'default': the default haiku initialization method is used.

    • 'deeperwin': the initialization method of the deeperwin package is used.
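
Since MLP is a haiku module, it has to be used inside a haiku transform. A minimal sketch, assuming the constructor accepts the keyword arguments documented above and provides defaults for any that are not passed:

    import haiku as hk
    import jax
    import jax.numpy as jnp

    from deepqmc.hkext import MLP

    def forward(x):
        # three hidden layers with logarithmically changing widths
        mlp = MLP(
            in_dim=x.shape[-1],
            out_dim=4,
            hidden_layers=('log', 3),
            bias=True,
            last_linear=True,
        )
        return mlp(x)

    model = hk.transform(forward)
    x = jnp.ones((8, 16))
    params = model.init(jax.random.PRNGKey(0), x)
    y = model.apply(params, None, x)  # shape (8, 4)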

class deepqmc.hkext.ResidualConnection(*, normalize)[source]#

Represent a residual connection between pytrees.

The residual connection is only added if inp and update have the same shape.

Parameters:

normalize (bool) – if True, the sum of inp and update is normalized by \(\sqrt 2\).
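
A conceptual sketch of the rule described above (not the class implementation itself): the update is added to the input, and the sum is optionally divided by \(\sqrt 2\).

    import jax.numpy as jnp

    def residual(inp, update, normalize=True):
        out = inp + update
        return out / jnp.sqrt(2.0) if normalize else out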

class deepqmc.hkext.SumPool(out_dim, name=None)[source]#

Represent a global sum pooling operation.

deepqmc.hkext.ssp(x)[source]#

Compute the shifted softplus activation function.

Computes the elementwise function \(\text{ssp}(x)=\log(1+\text{e}^x)+\log\frac{1}{2}\)
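
An equivalent way to express this function in JAX (a sketch, not the package's own implementation):

    import jax.numpy as jnp
    from jax.nn import softplus

    def ssp(x):
        # softplus shifted down by log(2), so that ssp(0) = 0
        return softplus(x) - jnp.log(2.0)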

Training and evaluation#

deepqmc.train.train(hamil, ansatz, opt, sampler, steps, seed, electron_batch_size, molecule_batch_size=1, mols=None, workdir=None, train_state=None, init_step=0, max_restarts=3, max_eq_steps=1000, pretrain_steps=None, pretrain_kwargs=None, pretrain_sampler=None, opt_kwargs=None, fit_kwargs=None, chkptdir=None, chkpts_kwargs=None, metric_logger=None)[source]#

Train or evaluate a JAX wave function model.

It initializes and equilibrates the MCMC sampling of the wave function ansatz, then optimizes or samples it using the variational principle. It optionally saves checkpoints and rewinds the training/evaluation if an error is encountered. If an optimizer is supplied, the Ansatz is optimized; otherwise the Ansatz is only sampled.

Parameters:
  • hamil (Hamiltonian) – the Hamiltonian of the physical system.

  • ansatz (WaveFunction) – the wave function Ansatz.

  • opt (kfac_jax or optax optimizers, str or None) –

    the optimizer. Possible values are:

    • kfac_jax.Optimizer: the partially initialized KFAC optimizer is used

    • an optax optimizer instance: the supplied optax optimizer is used

    • str: the name of the optimizer to use ('kfac' or an optax optimizer name). Arguments to the optimizer can be passed in opt_kwargs.

    • None: no optimizer is used, i.e. only evaluation of the Ansatz is performed.

  • sampler (Sampler) – a sampler instance

  • steps (int) – number of optimization steps.

  • seed (int) – the seed used for PRNG.

  • electron_batch_size (int) – the number of electron samples considered in a batch

  • molecule_batch_size (int) – optional, the number of molecules considered in a batch. Only needed for transferable training.

  • mols (Sequence(Molecule)) – optional, a sequence of molecules to consider for transferable training. If None the default molecule from hamil is used.

  • workdir (str) – optional, path where results should be saved.

  • train_state (TrainState) – optional, training checkpoint to restore training or run evaluation.

  • init_step (int) – optional, initial step index, useful if the calculation is restarted from a checkpoint saved on disk.

  • max_restarts (int) – optional, the maximum number of times the training is retried before a NaNError is raised.

  • max_eq_steps (int) – optional, maximum number of equilibration steps if not detected earlier.

  • pretrain_steps (int) – optional, the number of pretraining steps with respect to the baseline wave function obtained with pyscf.

  • pretrain_kwargs (dict) – optional, extra arguments for pretraining.

  • opt_kwargs (dict) – optional, extra arguments passed to the optimizer.

  • fit_kwargs (dict) – optional, extra arguments passed to the fit_wf() function.

  • chkptdir (str) – optional, path where checkpoints should be saved. Checkpoints are only saved if workdir is not None. Default: workdir.

  • chkpts_kwargs (dict) – optional, extra arguments for checkpointing.

  • metric_logger – optional, an object that consumes metric logging information. If not specified, the default TensorboardMetricLogger is used to create tensorboard logs.
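
A minimal training sketch, assuming a wave function Ansatz has already been constructed (its construction depends on the chosen configuration and is omitted here); the hyperparameter values are only illustrative:

    from deepqmc import Molecule
    from deepqmc.hamil import MolecularHamiltonian
    from deepqmc.sampling import DecorrSampler, MetropolisSampler, chain
    from deepqmc.train import train

    mol = Molecule.from_name('LiH')
    hamil = MolecularHamiltonian(mol=mol)
    sampler = chain(DecorrSampler(length=20), MetropolisSampler(hamil))

    train(
        hamil,
        ansatz,  # a NeuralNetworkWaveFunction instance, constructed elsewhere
        'kfac',  # optimizer name; extra arguments can be passed via opt_kwargs
        sampler,
        steps=10_000,
        seed=42,
        electron_batch_size=2048,
        workdir='runs/lih',
    )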

Sampling#

class deepqmc.sampling.Sampler[source]#

Base class for all QMC samplers.

class deepqmc.sampling.DecorrSampler(*, length)[source]#

Insert decorrelating steps into chained samplers.

This sampler cannot be used as the last element of a sampler chain.

Parameters:

length (int) – a sample is taken every length MCMC steps, that is, \(\text{length}-1\) decorrelating steps are inserted between samples.

class deepqmc.sampling.LangevinSampler(hamil, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#

Langevin Monte Carlo sampler.

Derived from MetropolisSampler.

class deepqmc.sampling.MetropolisSampler(hamil, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#

Metropolis–Hastings Monte Carlo sampler.

The sample() method of this class returns electron coordinate samples from the distribution defined by the square of the sampled wave function.

Parameters:
  • hamil (Hamiltonian) – the Hamiltonian of the physical system

  • tau (float) – optional, the proposal step size scaling factor. Adjusted during every step if target_acceptance is specified.

  • target_acceptance (float) – optional, if specified the proposal step size will be scaled such that the ratio of accepted proposal steps approaches target_acceptance.

  • max_age (int) – optional, if specified the next proposed step will always be accepted for a walker that hasn’t moved in the last max_age steps.

class deepqmc.sampling.ResampledSampler(*, period=None, threshold=None)[source]#

Add resampling to chained samplers.

This sampler cannot be used as the last element of a sampler chain. The resampling is performed by accumulating weights on each MCMC walker in each step. Based on a fixed resampling period period and/or a threshold threshold on the normalized effective sample size, the walker positions are sampled according to the multinomial distribution defined by these weights, and the weights are reset to one. Either period or threshold has to be specified.

Parameters:
  • period (int) – optional, if specified the walkers are resampled every period MCMC steps.

  • threshold (float) – optional, if specified the walkers are resampled if the effective sample size normalized with the batch size is below threshold.

deepqmc.sampling.chain(*samplers)[source]#

Combine multiple sampler types to create advanced sampling schemes.

For example, chain(DecorrSampler(length=10), MetropolisSampler(hamil, tau=1.)) will create a MetropolisSampler where samples are taken from every 10th MCMC step. The last element of the sampler chain has to be either a MetropolisSampler or a LangevinSampler.

Parameters:

samplers (Sampler) – one or more sampler instances to combine.

Returns:

the combined sampler.

Return type:

Sampler
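
A slightly longer chain, combining decorrelation and periodic resampling with a Metropolis sampler (a sketch; hamil is a MolecularHamiltonian as documented above):

    from deepqmc import Molecule
    from deepqmc.hamil import MolecularHamiltonian
    from deepqmc.sampling import (
        DecorrSampler, MetropolisSampler, ResampledSampler, chain,
    )

    hamil = MolecularHamiltonian(mol=Molecule.from_name('LiH'))
    sampler = chain(
        DecorrSampler(length=20),
        ResampledSampler(period=500),
        MetropolisSampler(hamil, tau=1.0, target_acceptance=0.57),
    )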

Wave functions#

class deepqmc.wf.WaveFunction(*args: Any, **kwargs: Any)[source]#

Base class for all trial wave functions.

Shape:
  • Input, \(\mathbf r\), (float, \((N,3)\), a.u.): particle coordinates

  • Output1, \(\ln|\psi(\mathbf r)|\) (float)

  • Output2, \(\operatorname{sgn}\psi(\mathbf r)\) (float)

class deepqmc.wf.NeuralNetworkWaveFunction(*args: Any, **kwargs: Any)[source]#

Implements the neural network wave function.

Configuration files to obtain the PauliNet [HermannNC20], FermiNet [PfauPRR20], DeepErwin [Gerard22] and PsiFormer [Glehn22] architectures are provided. For a detailed description of the implemented architecture see [Schaetzle23].

Parameters:
  • hamil (~MolecularHamiltonian) – the Hamiltonian of the system.

  • omni_factory (Callable) – creates the omni net.

  • envelope (ExponentialEnvelopes) – the orbital envelopes.

  • backflow_op (Callable) – specifies how the backflow is applied to the orbitals.

  • n_determinants (int) – specifies the number of determinants

  • full_determinant (bool) – if False, the determinants are factorized into spin-up and spin-down parts.

  • cusp_electrons (bool) – whether to apply the electronic cusp correction.

  • cusp_alpha (float) – the \(\alpha\) factor of the electronic cusp correction.

  • backflow_transform (str) –

    describes the backflow transformation. Possible values:

    • 'mult': the backflow is a multiplicative factor

    • 'add': the backflow is an additive term

    • 'both': the backflow consists of a multiplicative factor and an additive term

  • conf_coeff (Callable) – returns a function that combines the determinants to obtain the WF value

Graph neural networks#

A graph neural network is the most important component of the neural network wave function ansatz. This module implements a general GNN framework that can be configured to obtain a variety of different ansatzes.

Graphs#

This submodule implements the basic functionality for working with graphs.

deepqmc.gnn.graph.GraphEdgeBuilder(mask_self, offsets, mask_vals)[source]#

Create a function that builds graph edges.

Parameters:
  • mask_self (bool) – whether to mask out edges between nodes of the same index.

  • offsets ((int, int)) – node index offset to be added to the returned sender and receiver node indices respectively.

  • mask_vals ((int, int)) – if occupancy_limit is larger than the number of valid edges, the remaining node indices will be filled with these values for the sender and receiver nodes respectively (i.e. the value to pad the node index arrays with).

  • feature_callback (Callable) – a function that takes the sender positions, receiver positions, sender node indices and receiver node indices and returns some data (features) computed for the edges.

deepqmc.gnn.graph.GraphUpdate(aggregate_edges_for_nodes_fn, update_nodes_fn=None, update_edges_fn=None)[source]#

Create a function that updates a graph.

The update function is tailored to be used in GNNs.

Parameters:
  • aggregate_edges_for_nodes_fn (Callable) – the function used to aggregate the edges for the nodes.

  • update_nodes_fn (Callable) – optional, function that updates the nodes.

  • update_edges_fn (Callable) – optional, function that updates the edges.

deepqmc.gnn.graph.MolecularGraphEdgeBuilder(n_nuc, n_up, n_down, edge_types, *, self_interaction)[source]#

Create a function that builds many types of molecular edges.

Parameters:
  • n_nuc (int) – number of nuclei.

  • n_up (int) – number of spin-up electrons.

  • n_down (int) – number of spin-down electrons.

  • edge_types (List[str]) –

    list of edge type names to build. Possible names are:

    • 'nn': nuclei->nuclei edges

    • 'ne': nuclei->electrons edges

    • 'en': electrons->nuclei edges

    • 'same': edges between same-spin electrons

    • 'anti': edges between opposite-spin electrons

    • 'up': edges going from spin-up electrons to all electrons

    • 'down': edges going from spin-down electrons to all electrons

  • self_interaction (bool) – whether edges between a particle and itself are considered
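
For example, an edge builder for nucleus-electron and electron-electron edges of a system with two nuclei and two electrons of each spin could be constructed as follows (a sketch; calling the returned builder is not shown):

    from deepqmc.gnn.graph import MolecularGraphEdgeBuilder

    edge_builder = MolecularGraphEdgeBuilder(
        n_nuc=2,
        n_up=2,
        n_down=2,
        edge_types=['ne', 'same', 'anti'],
        self_interaction=False,
    )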

class deepqmc.gnn.edge_features.CombinedEdgeFeature(*, features)[source]#

Combine multiple edge features.

Parameters:

features (Sequence) – a Sequence of edge feature objects to combine.

class deepqmc.gnn.edge_features.DifferenceEdgeFeature(*, log_rescale=False)[source]#

Return the difference vector as the edge features.

class deepqmc.gnn.edge_features.DistancePowerEdgeFeature(*, powers, eps=None, log_rescale=False)[source]#

Return powers of the distance as edge features.

class deepqmc.gnn.edge_features.EdgeFeature[source]#

Base class for all edge features.

class deepqmc.gnn.edge_features.GaussianEdgeFeature(*, n_gaussian, radius, offset)[source]#

Expand the distance in a Gaussian basis.

Parameters:
  • n_gaussian (int) – the number of Gaussians to use, and consequently the length of the feature vector

  • radius (float) – the radius within which the Gaussians are placed

  • offset (bool) – whether to offset the position of the first Gaussian from zero.
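
The edge feature classes above can be composed; for instance, a difference-vector feature can be combined with a Gaussian expansion of the distance (a sketch using the keyword arguments documented above):

    from deepqmc.gnn.edge_features import (
        CombinedEdgeFeature, DifferenceEdgeFeature, GaussianEdgeFeature,
    )

    edge_feature = CombinedEdgeFeature(
        features=[
            DifferenceEdgeFeature(log_rescale=True),
            GaussianEdgeFeature(n_gaussian=16, radius=10.0, offset=False),
        ],
    )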

Electron GNN#

This submodule provides the ElectronGNN architecture for defining neural network parametrized functions acting on graphs of electrons and nuclei.

class deepqmc.gnn.electron_gnn.ElectronEmbedding(n_nuc, n_up, n_down, embedding_dim, n_elec_types, elec_types, *, positional_embeddings, use_spin, project_to_embedding_dim)[source]#

Create initial embeddings for electrons.

Parameters:
  • n_nuc (int) – the number of nuclei.

  • n_up (int) – the number of spin up electrons.

  • n_down (int) – the number of spin down electrons.

  • embedding_dim (int) – the desired length of the embedding vectors.

  • n_elec_types (int) –

    the number of electron types to differentiate. Usual values are:

    • 1: treat all electrons as indistinguishable. Note that electrons with different spins can still become distinguishable during the later embedding update steps of the GNN.

    • 2: treat spin up and spin down electrons as distinguishable already in the initial embeddings.

  • elec_types (jax.Array) – an integer array with length equal to the number of electrons, with entries between 0 and n_elec_types. Specifies the type for each electron.

  • positional_embeddings (Union[Literal[False],dict]) – if not False, a dict with edge types as keys, and edge features as values. Specifies the edge types and edge features to use when constructing the positional initial electron embeddings.

  • use_spin (bool) – only relevant if positional_embeddings is not False, if True, concatenate the spin of the given electron after the positional embedding features.

  • project_to_embedding_dim (bool) – only relevant if positional_embeddings is not False, if True, use a linear layer to project the initial embeddings to have length embedding_dim.

class deepqmc.gnn.electron_gnn.ElectronGNN(hamil, embedding_dim, *, n_interactions, edge_features, self_interaction, two_particle_stream_dim, nuclei_embedding, electron_embedding, layer_factory, ghost_coords=None)[source]#

A neural network acting on graphs defined by electrons and nuclei.

Derived from GraphNeuralNetwork.

Parameters:
  • hamil (MolecularHamiltonian) – the Hamiltonian of the system on which the graph is defined.

  • embedding_dim (int) – the length of the electron embedding vectors.

  • n_interactions (int) – number of message passing interactions.

  • edge_features (dict) –

    a dict of functions for each edge type, embedding the interparticle differences. Valid keys are:

    • 'ne': for nucleus-electron edges

    • 'nn': for nucleus-nucleus edges

    • 'same': for same spin electron-electron edges

    • 'anti': for opposite spin electron-electron edges

    • 'up': for edges going from spin up electrons to all electrons

    • 'down': for edges going from spin down electrons to all electrons

  • self_interaction (bool) – whether to consider edges where the sender and receiver electrons are the same.

  • two_particle_stream_dim (int) – the feature dimension of the two particle streams. Only active if deep_features are used.

  • nuclei_embedding (Union[None,NucleiEmbedding]) – optional, the instance responsible for creating the initial nuclear embeddings. Set to None if nuclear embeddings are not needed.

  • electron_embedding (ElectronEmbedding) – the instance that creates the initial electron embeddings.

  • layer_factory (Callable) – a callable that generates a layer of the GNN.

  • ghost_coords – optional, specifies the coordinates of one or more ghost atoms, useful for breaking spatial symmetries of the nuclear geometry.

edge_factory(phys_conf)[source]#

Compute all the graph edges used in the GNN.

class deepqmc.gnn.electron_gnn.ElectronGNNLayer(n_interactions, ilayer, n_nuc, n_up, n_down, embedding_dim, edge_types, self_interaction, node_data, two_particle_stream_dim, *, one_particle_residual, two_particle_residual, deep_features, update_features, update_rule, subnet_factory=None, subnet_factory_by_lbl=None)[source]#

The message passing layer of ElectronGNN.

Derived from MessagePassingLayer.

Parameters:
  • one_particle_residual – whether a residual connection is used when updating the one particle embeddings, either False, or an instance of ResidualConnection.

  • two_particle_residual – whether a residual connection is used when updating the two particle embeddings, either False, or an instance of ResidualConnection.

  • deep_features – if False, the edge features are not updated throughout the GNN layers; if 'shared', a single MLP (u) is used in each layer to update all edge types; if 'separate', separate MLPs are used in each layer to update the different edge types.

  • update_features (List[UpdateFeature]) – a list of partially initialized update feature classes to use when computing the update features of the one particle embeddings. For more details see the documentation of deepqmc.gnn.update_features.

  • update_rule (str) –

    how to combine the update features for the update of the one particle embeddings. Possible values:

    • 'concatenate': run concatenated features through MLP

    • 'featurewise': apply different MLP to each feature channel and sum

    • 'featurewise_shared': apply the same MLP across feature channels

    • 'sum': sum features before sending through an MLP

    note that 'sum' and 'featurewise_shared' require update features of the same size.

  • subnet_factory (Callable) – A function that constructs the subnetworks of the GNN layer.

  • subnet_factory_by_lbl (dict) – optional, a dictionary of functions that construct subnetworks of the GNN layer. If both this and subnet_factory are specified, the values given in subnet_factory_by_lbl take precedence. If some keys are missing, the value of subnet_factory is used in their place. Possible keys are 'w', 'h', 'g' and 'u'.

class deepqmc.gnn.electron_gnn.NucleiEmbedding(charges, n_atom_types, *, embedding_dim, atom_type_embedding, subnet_type)[source]#

Create initial embeddings for nuclei.

Parameters:
  • charges (jnp.ndarray) – the nuclear charges of the molecule.

  • n_atom_types (int) – the number of different atom types in the molecule.

  • embedding_dim (int) – the length of the output embedding vector

  • atom_type_embedding (bool) – if True, initial embeddings are the same for atoms of the same type (nuclear charge), otherwise they are different for all nuclei.

  • subnet_type (str) –

    the type of subnetwork to use for the embedding generation. Possible values:

    • 'mlp': an MLP is used

    • 'embed': a haiku.Embed block is used

Update Features#

This submodule implements some common ways to compute update features for the node embeddings from the current node and edge embeddings. Instances of the below classes are callable; they take the current node and edge representations as input, and output a list of update features to be used for updating the node representations.

class deepqmc.gnn.update_features.ConvolutionUpdateFeature(*args, edge_types, normalize, w_factory, h_factory, w_for_ne=True)[source]#

The convolution of node and edge embeddings as an update feature.

Returns the convolution of the node and edge embeddings for various edge types as separate update features.

class deepqmc.gnn.update_features.EdgeSumUpdateFeature(*args, edge_types, normalize)[source]#

The (normalized) sum of the edge embeddings as an update feature.

Returns the (normalized) sum of the edge embeddings for various edge types as separate update features.

class deepqmc.gnn.update_features.NodeAttentionUpdateFeature(*args, num_heads, mlp_factory, attention_residual, mlp_residual)[source]#

Create a single update feature by attending over the nodes.

Returns the Psiformer update feature based on attention over the nodes.

class deepqmc.gnn.update_features.NodeSumUpdateFeature(*args, node_types, normalize)[source]#

The (normalized) sum of the node embeddings as an update feature.

Returns the (normalized) sum of the electron embeddings from the previous layer as a single update feature.

class deepqmc.gnn.update_features.ResidualUpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#

Residual update feature.

Returns the unchanged electron embeddings from the previous layer as a single update feature.

class deepqmc.gnn.update_features.UpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#

Base class for all update features.