API#
This is the documentation of the API of the deepqmc package.
The implementation uses the JAX library; its documentation can be found at https://jax.readthedocs.io.
- class deepqmc.molecule.Molecule(*, coords, charges, charge, spin, unit='bohr', data=None)[source]#
Represents a molecule.
The array-like arguments accept anything that can be transformed to jax.Array.
- Parameters:
coords (jax.Array | list[float]) – nuclear coordinates ((\(N_\text{nuc}\), 3), a.u.) as rows
charges (jax.Array | list[int | float]) – atom charges (\(N_\text{nuc}\))
charge (int) – total charge of a molecule
spin (int) – total spin multiplicity
unit (str) – units of the coordinates, either ‘bohr’ or ‘angstrom’
data (dict) – additional data stored with the molecule
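A minimal construction sketch based on the parameter list above; the LiH geometry and bond length are illustrative only:

```python
from deepqmc.molecule import Molecule

# LiH with an illustrative bond length, coordinates given in angstrom
mol = Molecule(
    coords=[[0.0, 0.0, 0.0], [1.6, 0.0, 0.0]],
    charges=[3, 1],
    charge=0,
    spin=0,
    unit='angstrom',
)
```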
- class deepqmc.hamil.LaplacianFactory(*args, **kwargs)[source]#
Protocol class for Laplacian factories.
A Laplacian factory takes a function as input and returns a function that computes the Laplacian and gradient of the input function.
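A minimal sketch of a function satisfying this protocol, written with plain JAX transforms; it illustrates the protocol only and is not the Laplacian implementation shipped with deepqmc:

```python
import jax
import jax.numpy as jnp

def naive_laplacian(f):
    """Return a function computing (laplacian, gradient) of the scalar function f."""
    def lap(x):
        grad = jax.grad(f)(x)
        # The Laplacian is the trace of the Hessian; fine for small inputs,
        # production implementations use more efficient forward-over-reverse schemes.
        laplacian = jnp.trace(jax.hessian(f)(x))
        return laplacian, grad
    return lap

# Example: f(x) = sum(x**2) has Laplacian 2 * dim and gradient 2 * x
f = lambda x: jnp.sum(x ** 2)
lap_f = naive_laplacian(f)
print(lap_f(jnp.ones(3)))  # laplacian 6.0, gradient [2. 2. 2.]
```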
- class deepqmc.hamil.MolecularHamiltonian(*, mol, ecp_type=None, ecp_mask=None, elec_std=1.0, laplacian_factory=<function laplacian>)[source]#
Hamiltonian of non-relativistic molecular systems.
The system consists of nuclei with fixed positions and electrons moving around them. The total energy is defined as the sum of the nuclear-nuclear and electron-electron repulsion, the nuclear-electron attraction, and the kinetic energy of the electrons: \(E=V_\text{nuc-nuc} + V_\text{el-el} + V_\text{nuc-el} + E_\text{kin}\).
- Parameters:
mol (Molecule) – the molecule to consider
ecp_type (str) – If set, use the appropriate effective core potential (ECP). The string is passed to pyscf.gto.M() as the 'ecp' argument. Supports ECPs that are implemented in the pyscf package, e.g. 'bfd' [Burkatzki et al. 2007] or 'ccECP' [Bennett et al. 2017].
ecp_mask (list[bool]) – list of True and False values (\(N_\text{nuc}\)) specifying whether to use an ECP for each nucleus.
elec_std (float) – optional, a default value for the scaling factor of the spread of electrons around the nuclei.
laplacian_factory (LaplacianFactory) – creates a function that returns a tuple containing the laplacian and gradient of the wave function.
- as_pyscf(*, coords=None)[source]#
Return the Hamiltonian parameters in a format that pyscf can parse.
- Parameters:
coords (jax.Array) – optional, nuclear coordinates (\(N_\text{nuc}\), 3).
- init_sample(rng, R, n, elec_std=None)[source]#
Guess some initial electron positions.
Tries to make an educated guess about plausible initial electron configurations. Places electrons according to normal distributions centered on the nuclei. If the molecule is not neutral, extra electrons are placed on or removed from random nuclei. The resulting configurations are usually very crude; a subsequent, thorough equilibration is needed.
- Parameters:
- Returns:
initial electron and nuclei configurations
- Return type:
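A short usage sketch of the class above. The helium system is illustrative; passing mol.coords-like coordinates directly as R and interpreting n as the number of walkers are assumptions not spelled out in the signature:

```python
import jax
import jax.numpy as jnp
from deepqmc.molecule import Molecule
from deepqmc.hamil import MolecularHamiltonian

# All-electron Hamiltonian for a single helium atom (coordinates in bohr)
mol = Molecule(coords=[[0.0, 0.0, 0.0]], charges=[2], charge=0, spin=0)
hamil = MolecularHamiltonian(mol=mol)

# Export the system in a form that pyscf can parse
pyscf_args = hamil.as_pyscf()

# Draw a crude set of initial electron positions; a thorough equilibration
# of the sampler is still needed afterwards.
R = jnp.array([[0.0, 0.0, 0.0]])  # nuclear coordinates, shape (N_nuc, 3)
phys_conf = hamil.init_sample(jax.random.PRNGKey(0), R, n=128)
```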
Haiku#
The JAX implementation uses the haiku library to create, train and evaluate neural network models. Some additional neural network functionality is implemented in the package and documented here.
- class deepqmc.hkext.GLU(*args, **kwargs)[source]#
Gated Linear Unit.
- Parameters:
out_dim (int) – the output dimension.
name (str) – optional, the name of the network.
bias (bool) – optional, whether to include a bias term.
layer_norm_before (bool) – optional, whether to apply layer normalization before the GLU operation.
activation (Callable) – the activation function, defaults to sigmoid.
b_init (Callable) – the initialization function for the bias term, defaults to zeros.
- class deepqmc.hkext.MLP(*args, **kwargs)[source]#
Represent a multilayer perceptron.
- Parameters:
out_dim (int) – the output dimension.
name (str) – optional, the name of the network.
hidden_layers (tuple) – optional, either (‘log’, \(N_\text{layers}\)), in which case the network will have \(N_\text{layers}\) layers with logarithmically changing widths, or a tuple of ints specifying the width of each layer.
bias – optional, specifies which layers should have a bias term.
last_linear (bool) – optional, if True the activation function is not applied to the output of the last layer.
activation (Callable) – optional, the activation function.
init (str | Callable) – optional, specifies the initialization of the linear weights. Possible string values are:
'default': the default haiku initialization method is used.
'ferminet': the initialization method of the ferminet package is used.
'deeperwin': the initialization method of the deeperwin package is used.
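Since these are haiku modules, they must be constructed and called inside a hk.transform. The following sketch of the MLP assumes that the module is called on a single feature array and that the listed parameters are accepted as keyword arguments; the chosen values are illustrative only:

```python
import haiku as hk
import jax
import jax.numpy as jnp
from deepqmc.hkext import MLP

def forward(x):
    mlp = MLP(
        out_dim=16,
        name='example_mlp',
        hidden_layers=('log', 3),  # three layers with logarithmically changing widths
        last_linear=True,          # no activation after the last layer
        activation=jax.nn.silu,
        init='ferminet',           # FermiNet-style weight initialization
    )
    return mlp(x)

model = hk.without_apply_rng(hk.transform(forward))
x = jnp.ones(32)                   # dummy input feature vector
params = model.init(jax.random.PRNGKey(0), x)
print(model.apply(params, x).shape)  # expected: (16,)
```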
- class deepqmc.hkext.ResidualConnection(*, normalize)[source]#
Represent a residual connection between pytrees.
The residual connection is only added if inp and update have the same shape.
- Parameters:
normalize (bool) – if True the sum of inp and update is normalized with sqrt(2).
Training and evaluation#
- class deepqmc.types.TrainState(sampler, params, opt)[source]#
Represent the current state of the training.
- opt#
Alias for field number 2
- params#
Alias for field number 1
- sampler#
Alias for field number 0
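TrainState is a NamedTuple, so its entries can be accessed by field name or by position, as the field-number aliases above indicate. A small illustration with placeholder values:

```python
from deepqmc.types import TrainState

# The three fields hold the sampler state, the ansatz parameters and the
# optimizer state; strings are used here purely as placeholders.
state = TrainState(sampler='sampler_state', params='params', opt='opt_state')
assert state.params == state[1]
sampler_state, params, opt_state = state  # tuple unpacking also works
```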
- deepqmc.train.train(hamil, ansatz, opt, sampler_factory, steps, seed, electron_batch_size, molecule_batch_size=1, electronic_states=1, mols=None, workdir=None, train_state=None, init_step=0, max_restarts=3, max_eq_steps=1000, eq_allow_early_stopping=True, pretrain_steps=None, pretrain_kwargs=None, chkpt_constructor=None, metric_logger_constructor=None, h5_logger_constructor=None, merge_keys=None, loss_function_factory=None, observable_monitors=None)[source]#
Train or evaluate a JAX wave function model.
It initializes and equilibrates the MCMC sampling of the wave function ansatz, then optimizes or samples it using the variational principle. It optionally saves checkpoints and rewinds the training/evaluation if an error is encountered. If an optimizer is supplied, the Ansatz is optimized; otherwise the Ansatz is only sampled.
- Parameters:
hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.
ansatz (Ansatz) – the wave function Ansatz.
opt (kfac_jax or optax optimizers | None) – the optimizer. Possible values are:
- kfac_jax.Optimizer: the partially initialized KFAC optimizer is used.
- an optax optimizer instance: the supplied optax optimizer is used.
- None: no optimizer is used, e.g. when only the evaluation of the Ansatz is performed.
sampler_factory (Callable) – a function that returns a Sampler instance.
steps (int) – number of optimization steps.
seed (int) – the seed used for PRNG.
electron_batch_size (int) – the number of electron samples considered in a batch.
molecule_batch_size (int) – optional, the number of molecules considered in a batch. Only needed for transferable training.
electronic_states (int) – optional, the number of electronic states to consider.
mols (list[Molecule]) – optional, a sequence of molecules to consider for transferable training. If None, the default molecule from hamil is used.
workdir (str) – optional, path where results should be saved.
train_state (TrainState) – optional, training checkpoint to restore training or run evaluation.
init_step (int) – optional, initial step index, useful if calculation is restarted from checkpoint saved on disk.
max_restarts (int) – optional, the maximum number of times the training is retried before a NaNError is raised.
max_eq_steps (int) – optional, maximum number of equilibration steps, if equilibration is not detected earlier.
eq_allow_early_stopping (bool) – default True, whether to allow the equilibration to stop early when the equilibration criterion has been met.
pretrain_steps (int) – optional, the number of pretraining steps with respect to the baseline wave function obtained with pyscf.
pretrain_kwargs (dict) – optional, extra arguments for pretraining.
chkpt_constructor (Type[CheckpointStore]) – optional, an object that saves training checkpoints to the working directory.
metric_logger_constructor (Type[MetricLogger]) – optional, an object that consumes metric logging information. If not specified, the default TensorboardMetricLogger is used to create tensorboard logs.
h5_logger_constructor (Type[H5Logger]) – optional, an object that consumes metric logging information. If not specified, the default H5Logger is used to write comprehensive training statistics to an h5 file.
is used to write comprehensive training statistics to an h5file.merge_keys (list[str]) – optional, list of strings for selecting parameters to be shared across electronic states. Matching merge keys with (substrings of) parameter keys.
loss_function_factory (LossFunctionFactory) – optional, a callable returning a suitable loss function for the optimization.
observable_monitors (list[ObservableMonitor]) – optional, list of observable monitors to be evaluated during training or evaluation.
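The opt argument above accepts either a partially initialized kfac_jax optimizer, an optax optimizer instance, or None. A short sketch of the two simplest documented options, using standard optax calls:

```python
import optax

# An optax optimizer instance is passed to train() as-is ...
opt = optax.chain(
    optax.clip_by_global_norm(1.0),  # illustrative gradient clipping
    optax.adam(learning_rate=1e-3),
)

# ... while passing None runs the evaluation-only mode, in which the Ansatz
# is sampled but not optimized.
opt = None
```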
Sampling#
- class deepqmc.sampling.base.ElectronSampler(*args, **kwargs)[source]#
Protocol for ElectronSampler objects.
ElectronSampler objects implement Markov chain samplers for the electron positions. The samplers are assumed to implement a batch of walkers for a single electronic state on a single molecule and may be vmapped to fit the respective context they are used in. Electron samplers can be combined with chain().
- init(rng, params, n, R)[source]#
Initializes the sampler state.
- Parameters:
- Returns:
the sampler state holding electron positions and data about the sampler trajectory.
- Return type:
- sample(rng, state, params, R)[source]#
Propagates the sampler state.
- Parameters:
rng (KeyArray) – an rng key for the proposal of electron positions.
state (SamplerState) – the state of the sampler from the previous step.
params (Params) – the parameters of the wave function that is being sampled.
R (jax.Array) – the nuclei positions of the molecular configuration.
- Returns:
tuple[SamplerState, PhysicalConfiguration, Stats]: the new sampler state, a physical configuration and statistics about the sampling trajectory.
- update(state, params, R)[source]#
Updates the sampler state.
The sampler state is updated to account for changes in the wave function due to a parameter update.
- Parameters:
state (SamplerState) – the state of the sampler before parameter update.
params (Params) – the new parameters of the wave function.
R (jax.Array) – the nuclei positions of the molecular configuration.
- Returns:
the updated sampler state holding electron positions and data about the sampler trajectory.
- Return type:
- class deepqmc.sampling.DecorrSampler(*, length)[source]#
Insert decorrelating steps into chained samplers.
This sampler cannot be used as the last element of a sampler chain.
- Parameters:
length (int) – samples are taken only every length MCMC steps, that is, length \(-1\) decorrelating steps are inserted.
- class deepqmc.sampling.LangevinSampler(hamil, wf, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#
Metropolis adjusted Langevin Monte Carlo sampler.
Derived from MetropolisSampler.
- Parameters:
hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.
wf – the apply method of the haiku transformed ansatz object.
tau (float) – optional, the proposal step size scaling factor. Adjusted during every step if target_acceptance is specified.
target_acceptance (float) – optional, if specified the proposal step size will be scaled such that the ratio of accepted proposal steps approaches target_acceptance.
max_age (int) – optional, if specified the next proposed step will always be accepted for a walker that hasn’t moved in the last max_age steps.
- class deepqmc.sampling.MetropolisSampler(hamil, wf, *, tau=1.0, target_acceptance=0.57, max_age=None)[source]#
Metropolis–Hastings Monte Carlo sampler.
The sample() method of this class returns electron coordinate samples from the distribution defined by the square of the sampled wave function.
- Parameters:
hamil (MolecularHamiltonian) – the Hamiltonian of the physical system.
wf – the apply method of the haiku transformed ansatz object.
tau (float) – optional, the proposal step size scaling factor. Adjusted during every step if target_acceptance is specified.
target_acceptance (float) – optional, if specified the proposal step size will be scaled such that the ratio of accepted proposal steps approaches target_acceptance.
max_age (int) – optional, if specified the next proposed step will always be accepted for a walker that hasn’t moved in the last max_age steps.
- class deepqmc.sampling.MoleculeIdxSampler(rng, n_mols, batch_size, shuffle=False)[source]#
Sample molecule indexes for transferable training.
- class deepqmc.sampling.MultiNuclearGeometrySampler(elec_sampler, nuc_sampler, warp_elec_fn, update_nuc_period, elec_equilibration_steps)[source]#
This sampler samples from multiple nuclear geometries in parallel.
- Parameters:
elec_sampler (ElectronSampler) – the electron sampler to use.
nuc_sampler (NucleiSampler) – the nuclei sampler to use.
warp_elec_fn (ElectronWarp) – the function that warps the electrons to the new nuclear geometry.
update_nuc_period (int) – optional, the number of steps between nuclear updates.
elec_equilibration_steps (int) – optional, the number of steps to equilibrate the electronic state after a nuclear update.
- class deepqmc.sampling.ResampledSampler(*, period=None, threshold=None)[source]#
Add resampling to chained samplers.
This sampler cannot be used as the last element of a sampler chain. The resampling is performed by accumulating weights on each MCMC walker in each step. Based on a fixed resampling period period and/or a threshold threshold on the normalized effective sample size, the walker positions are sampled according to the multinomial distribution defined by these weights, and the weights are reset to one. Either period or threshold has to be specified.
- deepqmc.sampling.chain(*samplers)[source]#
Combine multiple sampler types to create advanced sampling schemes.
For example chain(DecorrSampler(10), MetropolisSampler(hamil, tau=1.)) will create a MetropolisSampler, where the samples are taken from every 10th MCMC step. The last element of the sampler chain has to be either a MetropolisSampler or a LangevinSampler.
- Parameters:
samplers (ElectronSampler) – one or more sampler instances to combine.
- Returns:
the combined sampler.
- Return type:
- deepqmc.sampling.combine_samplers(samplers, hamil, wf)[source]#
Combine samplers to create more advanced sampling schemes.
- Parameters:
samplers (list[ElectronSampler]) – one or more sampler instances to combine.
hamil (MolecularHamiltonian) – the molecular Hamiltonian.
wf (ParametrizedWaveFunction) – the wave function to sample.
Optimizers#
- class deepqmc.optimizer.Optimizer(loss_and_grad_fn, merge_keys=None)[source]#
Protocol for Optimizer objects.
- init(rng, params, batch)[source]#
Initialize the optimizer state.
- Parameters:
rng (KeyArray) – the RNG key used to initialize random components of the optimizer state.
params (Params) – the parameters of the wave function ansatz/ansatzes to be optimized during training.
batch (Batch) – a tuple containing a physical configuration, a set of sample weights and auxiliary data.
- Returns:
the initial state of the optimizer
- Return type:
OptState
- step(rng, params, opt_state, batch)[source]#
Perform an optimization step.
- Parameters:
rng (KeyArray) – the RNG key for the optimizer update.
params (Params) – the current parameters of the wave function ansatz/ansatzes.
opt_state (OptState) – the current state of the optimizer.
batch (Batch) – a tuple containing a physical configuration, a set of sample weights and auxiliary data.
- Returns:
tuple[Params, OptState, Energy, jax.Array | None, Stats]: the new model parameters, an updated optimizer state, the energies obtained during the evaluation of the loss function, if applicable the wave function ratios obtained during the evaluation of the loss function, and further statistics.
Wave functions#
- class deepqmc.types.Psi(sign, log)[source]#
Represent wave function values.
The sign and log of the absolute value of the wave function are stored.
- log#
Alias for field number 1
- sign#
Alias for field number 0
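Because the wave function value is stored in sign/log form, the actual value is recovered as sign * exp(log); a small illustration:

```python
import jax.numpy as jnp
from deepqmc.types import Psi

# Store psi = -0.01 as its sign and the log of its absolute value
psi = Psi(sign=jnp.array(-1.0), log=jnp.log(jnp.array(0.01)))
value = psi.sign * jnp.exp(psi.log)  # recovers -0.01
```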
- deepqmc.types.ParametrizedWaveFunction#
alias of Callable[[MutableMapping, pytree_dataclass], Psi]
- class deepqmc.types.Ansatz(*args, **kwargs)[source]#
Protocol for ansatz objects.
Ansatz objects represent a parametrized wave function Ansatz. New types of Ansatzes should implement this protocol to be compatible with the DeepQMC software suite. It is assumed that Ansatzes take as input a PhysicalConfiguration for a single sample of electron and nuclei configuration. To handle batches of samples, e.g. during training, the Ansatz is vmap-ed automatically by DeepQMC.
- apply(params, phys_conf, return_mos=False)[source]#
Evaluate the Ansatz.
- Parameters:
params (Params) – the current parameters with which to evaluate the Ansatz.
phys_conf (PhysicalConfiguration) – a single sample on which to evaluate the Ansatz.
return_mos (bool) – whether to return the many-body orbitals instead of the wave function.
- Returns:
the value of the wave function.
- Return type:
- init(rng, phys_conf)[source]#
Initialize the parameters of the Ansatz.
- Parameters:
rng (KeyArray) – the RNG key used to generate the initial parameters.
phys_conf (PhysicalConfiguration) – a dummy input to the network of a single electron and nuclei configuration. The value of this can be anything; only its shape information is read.
- Returns:
the initial parameters of the Ansatz.
- Return type:
Params
- class deepqmc.wf.NeuralNetworkWaveFunction(hamil, *, omni_factory, envelope, backflow_op, n_determinants, full_determinant, cusp_electrons, cusp_nuclei, backflow_transform, conf_coeff)[source]#
Implements the neural network wave function.
Configuration files to obtain the PauliNet [HermannNC20], FermiNet [PfauPRR20], DeepErwin [Gerard22] and PsiFormer [Glehn22] architectures are provided. For a detailed description of the implemented architecture see [Schaetzle23].
- Parameters:
hamil (MolecularHamiltonian) – the Hamiltonian of the system.
omni_factory (Callable) – creates the omni net.
envelope (ExponentialEnvelopes) – the orbital envelopes.
backflow_op (Callable) – specifies how the backflow is applied to the orbitals.
n_determinants (int) – specifies the number of determinants.
full_determinant (bool) – if False, the determinants are factorized into spin-up and spin-down parts.
cusp_electrons (Callable) – constructor of the electronic cusp module.
cusp_nuclei (Callable) – constructor of the nuclear cusp module.
backflow_transform (str) – describes the backflow transformation. Possible values:
'mult': the backflow is a multiplicative factor
'add': the backflow is an additive term
'both': the backflow consists of a multiplicative factor and an additive term
conf_coeff (Callable) – returns a function that combines the determinants to obtain the WF value.
Graph neural networks#
A graph neural network is the most important component of the neural network wave function ansatz. This module implements a general GNN framework that can be configured to obtain a variety of different ansatzes.
Graphs#
This submodule implements the basic functionality for working with graphs.
- deepqmc.gnn.graph.GraphEdgeBuilder(mask_self)[source]#
Create a function that builds graph edges.
- Parameters:
mask_self (bool) – whether to mask edges that begin and end in the same node.
- deepqmc.gnn.graph.GraphUpdate(aggregate_edges_for_nodes_fn, update_nodes_fn=None, update_edges_fn=None)[source]#
Create a function that updates a graph.
The update function is tailored to be used in GNNs.
- deepqmc.gnn.graph.MolecularGraphEdgeBuilder(n_nuc, n_up, n_down, edge_types, *, self_interaction)[source]#
Create a function that builds many types of molecular edges.
- Parameters:
n_nuc (int) – number of nuclei.
n_up (int) – number of spin-up electrons.
n_down (int) – number of spin-down electrons.
edge_types (list[str]) – list of edge type names to build. Possible names are:
'nn': nuclei->nuclei edges
'ne': nuclei->electrons edges
'en': electrons->nuclei edges
'same': edges between same-spin electrons
'anti': edges between opposite-spin electrons
'up': edges going from spin-up electrons to all electrons
'down': edges going from spin-down electrons to all electrons
self_interaction (bool) – whether edges between a particle and itself are considered
- class deepqmc.gnn.edge_features.CombinedEdgeFeature(*, features)[source]#
Combine multiple edge features.
- Parameters:
features (list[EdgeFeature]) – a list of edge feature objects to combine.
- class deepqmc.gnn.edge_features.DifferenceEdgeFeature(*, log_rescale=False)[source]#
Return the difference vector as the edge features.
- Parameters:
log_rescale (bool) – whether to rescale the features by \(\log(1 + d) / d\) where \(d\) is the length of the edge.
- class deepqmc.gnn.edge_features.DistancePowerEdgeFeature(*, powers, eps=None, log_rescale=False)[source]#
Return powers of the distance as edge features.
- class deepqmc.gnn.edge_features.EdgeFeature(*args, **kwargs)[source]#
Base class for all edge features.
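The edge feature classes above can be composed; the snippet below only constructs the objects, with an illustrative choice of powers, and does not show how the GNN consumes them:

```python
from deepqmc.gnn.edge_features import (
    CombinedEdgeFeature,
    DifferenceEdgeFeature,
    DistancePowerEdgeFeature,
)

# Combine the rescaled difference vector with an inverse-distance feature
edge_feature = CombinedEdgeFeature(
    features=[
        DifferenceEdgeFeature(log_rescale=True),
        DistancePowerEdgeFeature(powers=[-1.0]),
    ]
)
```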
Electron GNN#
This submodule provides the ElectronGNN architecture for defining neural network parametrized functions acting on graphs of electrons and nuclei.
- class deepqmc.gnn.electron_gnn.ElectronEmbedding(n_nuc, n_up, n_down, embedding_dim, n_elec_types, elec_types, *, positional_embeddings, use_spin, project_to_embedding_dim)[source]#
Create initial embeddings for electrons.
- Parameters:
n_nuc (int) – the number of nuclei.
n_up (int) – the number of spin up electrons.
n_down (int) – the number of spin down electrons.
embedding_dim (int) – the desired length of the embedding vectors.
n_elec_types (int) – the number of electron types to differentiate. Usual values are:
1: treat all electrons as indistinguishable. Note that electrons with different spins can still become distinguishable during the later embedding update steps of the GNN.
2: treat spin up and spin down electrons as distinguishable already in the initial embeddings.
elec_types (jax.Array) – an integer array with length equal to the number of electrons, with entries between 0 and n_elec_types. Specifies the type for each electron.
positional_embeddings (dict) – optional, if not None, a dict with edge types as keys and edge features as values. Specifies the edge types and edge features to use when constructing the positional initial electron embeddings.
use_spin (bool) – only relevant if positional_embeddings is not False; if True, concatenate the spin of the given electron after the positional embedding features.
project_to_embedding_dim (bool) – only relevant if positional_embeddings is not False; if True, use a linear layer to project the initial embeddings to have length embedding_dim.
- class deepqmc.gnn.electron_gnn.ElectronGNN(hamil, embedding_dim, *, n_interactions, edge_features, self_interaction, two_particle_stream_dim, nuclei_embedding, electron_embedding, layer_factory, ghost_coords=None)[source]#
A neural network acting on graphs defined by electrons and nuclei.
Derived from GraphNeuralNetwork.
- Parameters:
hamil (MolecularHamiltonian) – the Hamiltonian of the system on which the graph is defined.
embedding_dim (int) – the length of the electron embedding vectors.
n_interactions (int) – number of message passing interactions.
edge_features (dict) – a dict of functions for each edge type, embedding the interparticle differences. Valid keys are:
'ne': for nucleus-electron edges
'nn': for nucleus-nucleus edges
'same': for same spin electron-electron edges
'anti': for opposite spin electron-electron edges
'up': for edges going from spin up electrons to all electrons
'down': for edges going from spin down electrons to all electrons
self_interaction (bool) – whether to consider edges where the sender and receiver electrons are the same.
two_particle_stream_dim (int) – the feature dimension of the two particle streams. Only active if deep_features are used.
nuclei_embedding (Type[NucleiEmbedding]) – optional, the instance responsible for creating the initial nuclear embeddings. Set to None if nuclear embeddings are not needed.
electron_embedding (Type[ElectronEmbedding]) – the instance that creates the initial electron embeddings.
layer_factory (Type[ElectronGNNLayer]) – a callable that generates a layer of the GNN.
ghost_coords (jax.Array) – optional, specifies the coordinates of one or more ghost atoms, useful for breaking spatial symmetries of the nuclear geometry.
- class deepqmc.gnn.electron_gnn.ElectronGNNLayer(n_interactions, ilayer, n_nuc, n_up, n_down, embedding_dim, edge_types, self_interaction, node_data, two_particle_stream_dim, *, electron_residual, nucleus_residual, two_particle_residual, deep_features, update_features, update_rule, subnet_factory=None, subnet_factory_by_lbl=None)[source]#
The message passing layer of ElectronGNN.
Derived from MessagePassingLayer.
- Parameters:
n_interactions (int) – the number of message passing interactions.
ilayer (int) – the index of this layer (0 <= ilayer < n_interactions).
n_nuc (int) – the number of nuclei.
n_up (int) – the number of spin up electrons.
n_down (int) – the number of spin down electrons.
embedding_dim (int) – the length of the electron embedding vectors.
edge_types (Tuple[str]) – the types of edges to consider.
self_interaction (bool) – whether to consider edges where the sender and receiver electrons are the same.
node_data (Dict[str, Any]) – a dictionary containing information about the nodes of the graph.
two_particle_stream_dim (int) – the feature dimension of the two particle streams.
electron_residual – whether a residual connection is used when updating the electron embeddings, either False, or an instance of ResidualConnection.
nucleus_residual – whether a residual connection is used when updating the nucleus embeddings, either False, or an instance of ResidualConnection.
two_particle_residual – whether a residual connection is used when updating the two particle embeddings, either False, or an instance of ResidualConnection.
deep_features – if False, the edge features are not updated throughout the GNN layers; if shared, then in each layer a single MLP (u) is used to update all edge types; if separate, then in each layer separate MLPs are used to update the different edge types.
update_features (list[UpdateFeature]) – a list of partially initialized update feature classes to use when computing the update features of the one particle embeddings. For more details see the documentation of deepqmc.gnn.update_features.
update_rule (str) – how to combine the update features for the update of the one particle embeddings. Possible values:
'concatenate': run concatenated features through MLP
'featurewise': apply different MLP to each feature channel and sum
'featurewise_shared': apply the same MLP across feature channels
'sum': sum features before sending through an MLP
Note that 'sum' and 'featurewise_shared' imply features of the same size.
subnet_factory (Callable) – optional, a function that constructs the subnetworks of the GNN layer.
subnet_factory_by_lbl (dict) – optional, a dictionary of functions that construct subnetworks of the GNN layer. If both this and subnet_factory are specified, the values of subnet_factory_by_lbl take precedence. If some keys are missing, the default value of subnet_factory will be used in their place. Possible keys are: w, h, g or u.
- class deepqmc.gnn.electron_gnn.NucleiEmbedding(n_up, n_down, charges, n_atom_types, *, embedding_dim, atom_type_embedding, subnet_type, edge_features)[source]#
Create initial embeddings for nuclei.
- Parameters:
n_up (int) – the number of spin up electrons.
n_down (int) – the number of spin down electrons.
charges (jax.Array) – the nuclear charges of the molecule.
n_atom_types (int) – the number of different atom types in the molecule.
embedding_dim (int) – the length of the output embedding vector.
atom_type_embedding (bool) – if True, initial embeddings are the same for atoms of the same type (nuclear charge), otherwise they are different for all nuclei.
subnet_type (str) – the type of subnetwork to use for the embedding generation: 'mlp': an MLP is used; 'embed': a haiku.Embed block is used.
edge_features (EdgeFeature) – optional, the edge features to use when constructing the initial nuclear embeddings.
Update Features#
This submodule implements some common ways to compute update features for the node embeddings from the current node and edge embeddings. Instances of the classes below are callable; they take as input the current node and edge representations, and output a list of update features to be used for updating the node representations.
- class deepqmc.gnn.update_features.ConvolutionElectronUpdateFeature(*args, edge_types, normalize, w_factory, h_factory, w_for_ne=True)[source]#
The convolution of node and edge embeddings as an update feature.
Returns the convolution of the node and edge embeddings for various edge types as separate update features.
- Parameters:
n_up (int) – number of spin up electrons
n_down (int) – number of spin down electrons
two_particle_stream_dim (int) – dimension of the two-particle stream
node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.
normalize (bool) – whether to normalize the sum by the number of senders
w_factory (Callable) – factory function for the \(w\) matrix
h_factory (Callable) – factory function for the \(h\) matrix
w_for_ne (bool) – whether to use the \(w\) matrix for the \(ne\) edge type
- class deepqmc.gnn.update_features.EdgeSumElectronUpdateFeature(*args, edge_types, normalize)[source]#
The (normalized) sum of the edge embeddings as an update feature.
Returns the (normalized) sum of the edge embeddings for various edge types as separate update features.
- Parameters:
n_up (int) – number of spin up electrons
n_down (int) – number of spin down electrons
two_particle_stream_dim (int) – dimension of the two-particle stream
node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.
normalize (bool) – whether to normalize the sum by the number of senders
- class deepqmc.gnn.update_features.NodeAttentionElectronUpdateFeature(*args, num_heads, mlp_factory, attention_residual, mlp_residual)[source]#
Create a single update feature by attending over the nodes.
Returns the Psiformer update feature based on attention over the nodes.
- Parameters:
n_up (int) – number of spin up electrons
n_down (int) – number of spin down electrons
two_particle_stream_dim (int) – dimension of the two-particle stream
node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.
num_heads (int) – number of attention heads
attention_residual (Optional[Residual]) – optional residual connection after the attention layer
mlp_residual (Optional[Residual]) – optional residual connection after the MLP layer
- class deepqmc.gnn.update_features.NodeSumElectronUpdateFeature(*args, node_types, normalize)[source]#
The (normalized) sum of the node embeddings as an update feature.
Returns the (normalized) sum of the electron embeddings from the previous layer as a single update feature.
- Parameters:
n_up (int) – number of spin up electrons
n_down (int) – number of spin down electrons
two_particle_stream_dim (int) – dimension of the two-particle stream
node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.
normalize (bool) – whether to normalize the sum by the number of nodes
- class deepqmc.gnn.update_features.ResidualElectronUpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#
Residual update feature.
Returns the unchanged electron embeddings from the previous layer as a single update feature.
- class deepqmc.gnn.update_features.UpdateFeature(n_up, n_down, two_particle_stream_dim, node_edge_mapping)[source]#
Base class for all update features.
- Parameters:
n_up (int) – number of spin up electrons
n_down (int) – number of spin down electrons
two_particle_stream_dim (int) – dimension of the two-particle stream
node_edge_mapping (NodeEdgeMapping) – mapping between the various node and edge types.
Excited electronic states#
This section documents the API used to treat electronic excited states.
- class deepqmc.sampling.combined_samplers.MultiElectronicStateSampler(sampler, n_state)[source]#
Sample from multiple electronic states in parallel.
This sampler applies vmap to an underlying ElectronSampler to sample from multiple electronic states in parallel.
- Parameters:
sampler (ElectronSampler) – the electron sampler to use.
n_state (int) – the number of electronic states to sample from.
- class deepqmc.loss.overlap.OverlapGradientScaleFactory(*args, **kwargs)[source]#
Callable that computes the scaling factor of the overlap gradient.
- deepqmc.loss.overlap.compute_mean_overlap(psi_ratio, weight)[source]#
Compute an estimate of the overlap matrix from WF ratios.
- Parameters:
psi_ratio (jax.Array) – the WF ratios \(\text{ratio}[i,\,j,\,:]=\frac{\Psi_i(r\sim\Psi^2_j)}{\Psi_j(r\sim\Psi^2_j)}\), shape: [n_wfs, n_wfs, elec_batch_size].
weight (Weight) – the sample weights, shape [n_wfs, elec_batch_size].
- Returns:
tuple of the symmetric overlap matrix estimate, shaped [mol_batch_size, n_wfs, n_wfs], and overlap statistics.
- Return type:
- deepqmc.loss.overlap.compute_mean_overlap_tangent(psi_ratio, weight, log_psi_tangent, ratio_gradient_mask, overlap, scale_factory, data)[source]#
Compute the tangent of the overlap matrix with respect to the Ansatz parameters.
- Parameters:
psi_ratio (jax.Array) – the ratio of WF values.
weight (Weight) – the weight of each sample.
log_psi_tangent (jax.Array) – the jvp of the WF values with respect to the parameters of the Ansatz.
ratio_gradient_mask (jax.Array) – a samplewise boolean mask to apply to the gradients.
overlap (jax.Array) – the overlap matrix estimate.
scale_factory (OverlapGradientScaleFactory) – function that computes the scaling factor of the overlap gradient.
data (DataDict) – input data passed to the scale_factory function.
- Returns:
the jvp of the sum of the upper triangle of the overlap matrix with respect to the Ansatz parameters.
- Return type:
- deepqmc.loss.overlap.compute_psi_ratio(ansatz, params, phys_conf)[source]#
Compute the ratio of all wave functions for a batch of samples.
- Parameters:
ansatz (Ansatz) – the ansatz object.
params (Params) – the current parameters of the Ansatz.
phys_conf (PhysicalConfiguration) – the input to the Ansatz, shape: [mol_batch_size, electronic_states, electron_batch_size, ...].
- Returns:
the tuple of the WF ratios and overlap statistics.
- Return type:
- deepqmc.loss.overlap.compute_single_sample_psi_ratios(psi, mean_log_psi)[source]#
Compute all possible ratios between the WFs for a single sample.
The mean of each WF value is subtracted before computing exponentials to avoid over/underflow.
- Parameters:
- Returns:
the WF ratios \(R[i,\,j]=\frac{\Psi_i(r\sim\Psi^2_j)}{\Psi_j(r\sim\Psi^2_j)}\), shape [electronic_states, electronic_states].
- Return type:
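The log-domain trick described above can be illustrated with a small standalone computation; this is a sketch of the idea, not the library implementation, and assumes the WF values are given as sign and log arrays as in Psi:

```python
import jax.numpy as jnp

def pairwise_ratios(sign_psi, log_psi):
    """Compute R[i, j] = Psi_i(r_j) / Psi_j(r_j) from sign/log WF values."""
    # Subtracting the mean log value before exponentiating avoids
    # over/underflow; the shift cancels in the ratio.
    psi = sign_psi * jnp.exp(log_psi - log_psi.mean())
    return psi / jnp.diagonal(psi)[None, :]

sign = jnp.ones((2, 2))
log = jnp.array([[0.0, -1.0], [1.0, 0.5]])
print(pairwise_ratios(sign, log))  # diagonal entries are exactly 1
```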
- deepqmc.loss.overlap.compute_wave_function_values(ansatz, params, phys_conf)[source]#
Compute the value of all WFs at samples drawn from all WFs.
- Parameters:
params (dict) – PyTree of WF parameters, with leading axis over the different WFs. Shape: [n_wfs, ...].
phys_conf (PhysicalConfiguration) – input physical configuration samples, with leading axis over the different WFs the samples were drawn from. Shape: [n_wfs, elec_batch_size, ...].
- Returns:
the WF values \(\Psi[i, \, j, \, :] = \Psi_i({\bf r} \sim \Psi^2_j)\), shape: [n_wfs, n_wfs, elec_batch_size].
- Return type:
- deepqmc.loss.overlap.scale_by_energy_gap(data, min_gap_scale_factor=0.1)[source]#
Scale the overlap gradient by the energy gap between the two states.
- deepqmc.loss.overlap.scale_by_energy_std(data, min_gap_scale_factor=0.01)[source]#
Scale the overlap gradient by the std. dev. of the states’ energies.
- deepqmc.loss.overlap.scale_by_max_gap_std(data, min_gap_scale_factor=0.1)[source]#
Scale the overlap gradient by the max of the energy gap and std. dev.
- deepqmc.loss.overlap.symmetrize_overlap_with_clipped_geometric_mean(x)[source]#
Symmetrize the overlap using the clipped geometric mean of it and its transpose.
Useful for computing an estimate of the overlap matrix using Monte Carlo samples from all WFs. Given input \(x_{ij}\) this function computes:
\[y_{ij}=\text{sign}(x_{ij}) \sqrt{\text{clamp}(0, \, x_{ij} \cdot x_{ji}, \, 1)}\]
Note that if the signs of \(x_{ij}\) and \(x_{ji}\) differ, the clamping makes sure that \(y_{ij}\) will be zero. Otherwise the two signs will agree, and we can use either one of them to compute the sign of \(y_{ij}\).
- Parameters:
x – the non-symmetric overlap values: \(x_{ij}=\frac1N\sum_{{\bf r}_j\sim\Psi^2_j}\frac{\Psi_i({\bf r}_j)} {\Psi_j({\bf r}_j)}\)
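The formula above can be reproduced directly; the snippet below is an illustrative re-implementation, not the library function itself:

```python
import jax.numpy as jnp

def clipped_geometric_mean_symmetrize(x):
    """Compute y_ij = sign(x_ij) * sqrt(clamp(0, x_ij * x_ji, 1))."""
    prod = jnp.clip(x * x.T, 0.0, 1.0)
    return jnp.sign(x) * jnp.sqrt(prod)

x = jnp.array([[1.0, 0.3], [-0.2, 1.0]])
# The (0, 1) and (1, 0) products are negative, so both entries are clipped to zero.
print(clipped_geometric_mean_symmetrize(x))
```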
Custom data types and type aliases#
In order to facilitate the use of the API and enable type checking, DeepQMC implements a range of custom types and type aliases.
A combination of electron and nuclei positions is assigned the PhysicalConfiguration type.
- class deepqmc.types.PhysicalConfiguration#
Represent physical configurations of electrons and nuclei.
It currently contains the nuclear and electronic coordinates, along with mol_idx, which specifies which nuclear configuration a given sample was obtained from.
Wave function parameters are denoted with the Params type.
- deepqmc.types.Params#
alias of MutableMapping
Auxiliary data needed for the evaluation of the loss comes as a DataDict type.
Statistics obtained during training or evaluation use the Stats type.
Data for the evaluation of the training loss is bundled as a Batch type.