\(\renewcommand{\AA}{\text{Å}}\)
fitpod command
Syntax
fitpod Ta_param.pod Ta_data.pod
fitpod = style name of this command
Ta_param.pod = an input file that describes proper orthogonal descriptors (PODs)
Ta_data.pod = an input file that specifies DFT data used to fit a POD potential
Examples
fitpod Ta_param.pod Ta_data.pod
Description
New in version 22Dec2022.
Fit a machine-learning interatomic potential (ML-IAP) based on proper orthogonal descriptors (POD). Two input files are required for this command. The first input file describes a POD potential parameter settings, while the second input file specifies the DFT data used for the fitting procedure.
The table below has one-line descriptions of all the keywords that can
be used in the first input file (i.e. Ta_param.pod
in the example
above):
Keyword |
Default |
Type |
Description |
---|---|---|---|
species |
(none) |
STRING |
Chemical symbols for all elements in the system and have to match XYZ training files. |
pbc |
1 1 1 |
INT |
three integer constants specify boundary conditions |
rin |
1.0 |
REAL |
a real number specifies the inner cut-off radius |
rcut |
5.0 |
REAL |
a real number specifies the outer cut-off radius |
bessel_polynomial_degree |
3 |
INT |
the maximum degree of Bessel polynomials |
inverse_polynomial_degree |
6 |
INT |
the maximum degree of inverse radial basis functions |
onebody |
1 |
BOOL |
turns on/off one-body potential |
twobody_number_radial_basis_functions |
6 |
INT |
number of radial basis functions for two-body potential |
threebody_number_radial_basis_functions |
5 |
INT |
number of radial basis functions for three-body potential |
threebody_number_angular_basis_functions |
5 |
INT |
number of angular basis functions for three-body potential |
fourbody_snap_twojmax |
0 |
INT |
band limit for SNAP bispectrum components (0,2,4,6,8… allowed) |
fourbody_snap_chemflag |
0 |
BOOL |
turns on/off the explicit multi-element variant of the SNAP bispectrum components |
quadratic_pod_potential |
0 |
BOOL |
turns on/off quadratic POD potential |
All keywords except species have default values. If a keyword is not
set in the input file, its default value is used. The next table
describes all keywords that can be used in the second input file
(i.e. Ta_data.pod
in the example above):
Keyword |
Default |
Type |
Description |
---|---|---|---|
file_format |
extxyz |
STRING |
only the extended xyz format (extxyz) is currently supported |
file_extension |
xyz |
STRING |
extension of the data files |
path_to_training_data_set |
(none) |
STRING |
specifies the path to training data files in double quotes |
path_to_test_data_set |
“” |
STRING |
specifies the path to test data files in double quotes |
fraction_training_data_set |
1.0 |
REAL |
a real number (<= 1.0) specifies the fraction of the training set used to fit POD |
randomize_training_data_set |
0 |
BOOL |
turns on/off randomization of the training set |
fitting_weight_energy |
100.0 |
REAL |
a real constant specifies the weight for energy in the least-squares fit |
fitting_weight_force |
1.0 |
REAL |
a real constant specifies the weight for force in the least-squares fit |
fitting_regularization_parameter |
1.0e-10 |
REAL |
a real constant specifies the regularization parameter in the least-squares fit |
error_analysis_for_training_data_set |
0 |
BOOL |
turns on/off error analysis for the training data set |
error_analysis_for_test_data_set |
0 |
BOOL |
turns on/off error analysis for the test data set |
basename_for_output_files |
pod |
STRING |
a basename string added to the output files |
precision_for_pod_coefficients |
8 |
INT |
number of digits after the decimal points for numbers in the coefficient file |
All keywords except path_to_training_data_set have default values. If a keyword is not set in the input file, its default value is used. After successful training, a number of output files are produced, if enabled:
<basename>_training_errors.pod
reports the errors in energy and forces for the training data set<basename>_training_analysis.pod
reports detailed errors for all training configurations<basename>_test_errors.pod
reports errors for the test data set<basename>_test_analysis.pod
reports detailed errors for all test configurations<basename>_coefficients.pod
contains the coefficients of the POD potential
After training the POD potential, Ta_param.pod
and <basename>_coefficients.pod
are the two files needed to use the POD potential in LAMMPS. See
pair_style pod for using the POD potential. Examples
about training and using POD potentials are found in the directory
lammps/examples/PACKAGES/pod.
Parameterized Potential Energy Surface
We consider a multi-element system of N atoms with \(N_{\rm e}\) unique elements. We denote by \(\boldsymbol r_n\) and \(Z_n\) position vector and type of an atom n in the system, respectively. Note that we have \(Z_n \in \{1, \ldots, N_{\rm e} \}\), \(\boldsymbol R = (\boldsymbol r_1, \boldsymbol r_2, \ldots, \boldsymbol r_N) \in \mathbb{R}^{3N}\), and \(\boldsymbol Z = (Z_1, Z_2, \ldots, Z_N) \in \mathbb{N}^{N}\). The potential energy surface (PES) of the system can be expressed as a many-body expansion of the form
where \(V^{(1)}\) is the one-body potential often used for representing external field or energy of isolated elements, and the higher-body potentials \(V^{(2)}, V^{(3)}, \ldots\) are symmetric, uniquely defined, and zero if two or more indices take identical values. The superscript on each potential denotes its body order. Each q-body potential \(V^{(q)}\) depends on \(\boldsymbol \mu^{(q)}\) which are sets of parameters to fit the PES. Note that \(\boldsymbol \mu\) is a collection of all potential parameters \(\boldsymbol \mu^{(1)}\), \(\boldsymbol \mu^{(2)}\), \(\boldsymbol \mu^{(3)}\), etc, and that \(\boldsymbol \eta\) is a set of hyper-parameters such as inner cut-off radius \(r_{\rm in}\) and outer cut-off radius \(r_{\rm cut}\).
Interatomic potentials rely on parameters to learn relationship between atomic environments and interactions. Since interatomic potentials are approximations by nature, their parameters need to be set to some reference values or fitted against data by necessity. Typically, potential fitting finds optimal parameters, \(\boldsymbol \mu^*\), to minimize a certain loss function of the predicted quantities and data. Since the fitted potential depends on the data set used to fit it, different data sets will yield different optimal parameters and thus different fitted potentials. When fitting the same functional form on Q different data sets, we would obtain Q different optimized potentials, \(E(\boldsymbol R,\boldsymbol Z, \boldsymbol \eta, \boldsymbol \mu_q^*), 1 \le q \le Q\). Consequently, there exist many different sets of optimized parameters for empirical interatomic potentials.
Instead of optimizing the potential parameters, inspired by the reduced basis method (Grepl) for parameterized partial differential equations, we view the parameterized PES as a parametric manifold of potential energies
where \(\Omega^{\boldsymbol \mu}\) is a parameter domain in which \(\boldsymbol \mu\) resides. The parametric manifold \(\mathcal{M}\) contains potential energy surfaces for all values of \(\boldsymbol \mu \in \Omega^{\boldsymbol \mu}\). Therefore, the parametric manifold yields a much richer and more transferable atomic representation than any particular individual PES \(E(\boldsymbol R, \boldsymbol Z, \boldsymbol \eta, \boldsymbol \mu^*)\).
We propose specific forms of the parameterized potentials for one-body, two-body, and three-body interactions. We apply the Karhunen-Loeve expansion to snapshots of the parameterized potentials to obtain sets of orthogonal basis functions. These basis functions are aggregated according to the chemical elements of atoms, thus leading to multi-element proper orthogonal descriptors.
Proper Orthogonal Descriptors
Proper orthogonal descriptors are finger prints characterizing the radial and angular distribution of a system of atoms. The detailed mathematical definition is given in the paper by Nguyen and Rohskopf (Nguyen).
The descriptors for the one-body interaction are used to capture energy of isolated elements and defined as follows
for \(1 \le i \le N, 1 \le p \le N_{\rm e}\). The number of one-body descriptors per atom is equal to the number of elements. The one-body descriptors are independent of atom positions, but dependent on atom types. The one-body descriptors are active only when the keyword onebody is set to 1.
We adopt the usual assumption that the direct interaction between two atoms vanishes smoothly when their distance is greater than the outer cutoff distance \(r_{\rm cut}\). Furthermore, we assume that two atoms can not get closer than the inner cutoff distance \(r_{\rm in}\) due to the Pauli repulsion principle. Let \(r \in (r_{\rm in}, r_{\rm cut})\), we introduce the following parameterized radial functions
where the scaled distance function \(x\) is defined below to enrich the two-body manifold
We introduce the following function as a convex combination of the two functions
We see that \(\psi\) is a function of distance \(r\), cut-off distances \(r_{\rm in}\) and \(r_{\rm cut}\), and parameters \(\alpha, \beta, \gamma, \kappa\). Together these parameters allow the function \(\psi\) to characterize a diverse spectrum of two-body interactions within the cut-off interval \((r_{\rm in}, r_{\rm cut})\).
Next, we introduce the following parameterized potential
where \(\eta_1 = r_{\rm in}, \eta_2 = r_{\rm cut}, \mu_1^{(2)} = \alpha, \mu_2^{(2)} = \beta, \mu_3^{(2)} = \gamma\), and \(\mu_4^{(2)} = \kappa\). Here the cut-off function \(f_{\rm c}(r_{ij}, \boldsymbol \eta)\) proposed in [refs] is used to ensure the smooth vanishing of the potential and its derivative for \(r_{ij} \ge r_{\rm cut}\):
Based on the parameterized potential, we form a set of snapshots as follows. We assume that we are given \(N_{\rm s}\) parameter tuples \(\boldsymbol \mu^{(2)}_\ell, 1 \le \ell \le N_{\rm s}\). We introduce the following set of snapshots on \((r_{\rm in}, r_{\rm cut})\):
To ensure adequate sampling of the PES for different parameters, we choose \(N_{\rm s}\) parameter points \(\boldsymbol \mu^{(2)}_\ell = (\alpha_\ell, \beta_\ell, \gamma_\ell, \kappa_\ell), 1 \le \ell \le N_{\rm s}\) as follows. The parameters \(\alpha \in [1, N_\alpha]\) and \(\gamma \in [1, N_\gamma]\) are integers, where \(N_\alpha\) and \(N_\gamma\) are the highest degrees for \(\alpha\) and \(\gamma\), respectively. We next choose \(N_\beta\) different values of \(\beta\) in the interval \([\beta_{\min}, \beta_{\max}]\), where \(\beta_{\min} = 0\) and \(\beta_{\max} = 4\). The parameter \(\kappa\) can be set either 0 or 1. Hence, the total number of parameter points is \(N_{\rm s} = N_\alpha N_\beta + N_\gamma\). Although \(N_\alpha, N_\beta, N_\gamma\) can be chosen conservatively large, we find that \(N_\alpha = 6, N_\beta = 3, N_\gamma = 8\) are adequate for most problems. Note that \(N_\alpha\) and \(N_\gamma\) correspond to bessel_polynomial_degree and inverse_polynomial_degree, respectively.
We employ the Karhunen-Loeve (KL) expansion to generate an orthogonal basis set which is known to be optimal for representation of the snapshot family \(\{\xi_\ell\}_{\ell=1}^{N_{\rm s}}\). The two-body orthogonal basis functions are computed as follows
where the matrix \(\boldsymbol A \in \mathbb{R}^{N_{\rm s} \times N_{\rm s}}\) consists of eigenvectors of the eigenvalue problem
with the entries of \(\boldsymbol C \in \mathbb{R}^{N_{\rm s} \times N_{\rm s}}\) being given by
Note that the eigenvalues \(\lambda_\ell, 1 \le \ell \le N_{\rm s}\), are ordered such that \(\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_{N_{\rm s}}\), and that the matrix \(\boldsymbol A\) is pe-computed and stored for any given \(\boldsymbol \eta\). Owing to the rapid convergence of the KL expansion, only a small number of orthogonal basis functions is needed to obtain accurate approximation. The value of \(N_{\rm 2b}\) corresponds to twobody_number_radial_basis_functions.
The two-body proper orthogonal descriptors at each atom i are computed by summing the orthogonal basis functions over the neighbors of atom i and numerating on the atom types as follows
for \(1 \le i \le N, 1 \le m \le N_{\rm 2b}, 1 \le q, p \le N_{\rm e}\). Here \(l(p,q)\) is a symmetric index mapping such that
The number of two-body descriptors per atom is thus \(N_{\rm 2b} N_{\rm e}(N_{\rm e}+1)/2\).
It is important to note that the orthogonal basis functions do not depend on the atomic numbers \(Z_i\) and \(Z_j\). Therefore, the cost of evaluating the basis functions and their derivatives with respect to \(r_{ij}\) is independent of the number of elements \(N_{\rm e}\). Consequently, even though the two-body proper orthogonal descriptors depend on \(\boldsymbol Z\), their computational complexity is independent of \(N_{\rm e}\).
In order to provide proper orthogonal descriptors for three-body interactions, we need to introduce a three-body parameterized potential. In particular, the three-body potential is defined as a product of radial and angular functions as follows
where \(\sigma\) is the periodic multiplicity, \(\zeta\) is the equilibrium angle, \(\boldsymbol \mu^{(3)} = (\alpha, \beta, \gamma, \kappa, \sigma, \zeta)\). The three-body potential provides an angular fingerprint of the atomic environment through the bond angles \(\theta_{ijk}\) formed with each pair of neighbors \(j\) and \(k\). Compared to the two-body potential, the three-body potential has two extra parameters \((\sigma, \zeta)\) associated with the angular component.
Let \(\boldsymbol \varrho = (\alpha, \beta, \gamma, \kappa)\). We assume that we are given \(L_{\rm r}\) parameter tuples \(\boldsymbol \varrho_\ell, 1 \le \ell \le L_{\rm r}\). We introduce the following set of snapshots on \((r_{\min}, r_{\max})\):
We apply the Karhunen-Loeve (KL) expansion to this set of snapshots to obtain orthogonal basis functions as follows
where the matrix \(\boldsymbol A \in \mathbb{R}^{L_{\rm r} \times L_{\rm r}}\) consists of eigenvectors of the eigenvalue problem. For the parameterized angular function, we consider angular basis functions
where \(N_{\rm a}\) is the number of angular basis functions. The orthogonal basis functions for the parameterized potential are computed as follows
for \(1 \le m \le N_{\rm r}, 1 \le n \le N_{\rm a}\). The number of three-body orthogonal basis functions is equal to \(N_{\rm 3b} = N_{\rm r} N_{\rm a}\) and independent of the number of elements. The value of \(N_{\rm r}\) corresponds to threebody_number_radial_basis_functions, while that of \(N_{\rm a}\) to threebody_number_angular_basis_functions.
The three-body proper orthogonal descriptors at each atom i are obtained by summing over the neighbors j and k of atom i as
for \(1 \le i \le N, 1 \le m \le N_{\rm r}, 1 \le n \le N_{\rm a}, 1 \le q, p, s \le N_{\rm e}\), where
The number of three-body descriptors per atom is thus \(N_{\rm 3b} N_{\rm e}^2(N_{\rm e}+1)/2\). While the number of three-body PODs is cubic function of the number of elements, the computational complexity of the three-body PODs is independent of the number of elements.
Four-Body SNAP Descriptors
In addition to the proper orthogonal descriptors described above, we also employ the spectral neighbor analysis potential (SNAP) descriptors. SNAP uses bispectrum components to characterize the local neighborhood of each atom in a very general way. The mathematical definition of the bispectrum calculation and its derivatives w.r.t. atom positions is described in compute snap. In SNAP, the total energy is decomposed into a sum over atom energies. The energy of atom i is expressed as a weighted sum over bispectrum components.
where the SNAP descriptors are related to the bispectrum components by
Here \(B_{ik}\) is the k-th bispectrum component of atom i. The number of bispectrum components \(N_{\rm 4b}\) depends on the value of fourbody_snap_twojmax \(= 2 J_{\rm max}\) and fourbody_snap_chemflag. If fourbody_snap_chemflag = 0 then \(N_{\rm 4b} = (J_{\rm max}+1)(J_{\rm max}+2)(J_{\rm max}+1.5)/3\). If fourbody_snap_chemflag = 1 then \(N_{\rm 4b} = N_{\rm e}^3 (J_{\rm max}+1)(J_{\rm max}+2)(J_{\rm max}+1.5)/3\). The bispectrum calculation is described in more detail in compute sna/atom.
Linear Proper Orthogonal Descriptor Potentials
The proper orthogonal descriptors and SNAP descriptors are used to define the atomic energies in the following expansion
where \(D^{(1)}_{ip}, D^{(2)}_{iml}, D^{(3)}_{imn\ell}, D^{(4)}_{ikp}\) are the one-body, two-body, three-body, four-body descriptors, respectively, and \(c^{(1)}_p, c^{(2)}_{ml}, c^{(3)}_{mn\ell}, c^{(4)}_{kp}\) are their respective expansion coefficients. In a more compact notation that implies summation over descriptor indices the atomic energies can be written as
where \(N_{\rm d}^{(2)} = N_{\rm 2b} N_{\rm e} (N_{\rm e}+1)/2\), \(N_{\rm d}^{(3)} = N_{\rm 3b} N_{\rm e}^2 (N_{\rm e}+1)/2\), and \(N_{\rm d}^{(4)} = N_{\rm 4b} N_{\rm e}\) are the number of two-body, three-body, and four-body descriptors, respectively.
The potential energy is then obtained by summing local atomic energies \(E_i\) for all atoms \(i\) in the system
Because the descriptors are one-body, two-body, and three-body terms, the resulting POD potential is a three-body PES. We can express the potential energy as a linear combination of the global descriptors as follows
where the global descriptors are given by
Hence, we obtain the atomic forces as
where \(\nabla d_m^{(2)}\), \(\nabla d_m^{(3)}\) and \(\nabla d_m^{(4)}\) are derivatives of the two-body three-body, and four-body global descriptors with respect to atom positions, respectively. Note that since the first-body global descriptors are constant, their derivatives are zero.
Quadratic Proper Orthogonal Descriptor Potentials
We recall two-body PODs \(D^{(2)}_{ik}, 1 \le k \le N_{\rm d}^{(2)}\), and three-body PODs \(D^{(3)}_{im}, 1 \le m \le N_{\rm d}^{(3)}\), with \(N_{\rm d}^{(2)} = N_{\rm 2b} N_{\rm e} (N_{\rm e}+1)/2\) and \(N_{\rm d}^{(3)} = N_{\rm 3b} N_{\rm e}^2 (N_{\rm e}+1)/2\) being the number of descriptors per atom for the two-body PODs and three-body PODs, respectively. We employ them to define a new set of atomic descriptors as follows
for \(1 \le i \le N, 1 \le k \le N_{\rm d}^{(2)}, 1 \le m \le N_{\rm d}^{(3)}\). The new descriptors are four-body because they involve central atom \(i\) together with three neighbors \(j, k\) and \(l\). The total number of new descriptors per atom is equal to
The new global descriptors are calculated as
for \(1 \le k \le N_{\rm d}^{(2)}, 1 \le m \le N_{\rm d}^{(3)}\). Hence, the gradient of the new global descriptors with respect to atom positions is calculated as
The quadratic POD potential is defined as a linear combination of the original and new global descriptors as follows
It thus follows that
which is simplified to
where
The quadratic POD potential results in the following atomic forces
It can be shown that
The calculation of the atomic forces for the quadratic POD potential only requires the extra calculation of \(b_k^{(2)}\) and \(b_m^{(3)}\) which can be negligible. As a result, the quadratic POD potential does not increase the computational complexity.
Training
POD potentials are trained using the least-squares regression against density functional theory (DFT) data. Let \(J\) be the number of training configurations, with \(N_j\) being the number of atoms in the j-th configuration. Let \(\{E^{\star}_j\}_{j=1}^{J}\) and \(\{\boldsymbol F^{\star}_j\}_{j=1}^{J}\) be the DFT energies and forces for \(J\) configurations. Next, we calculate the global descriptors and their derivatives for all training configurations. Let \(d_{jm}, 1 \le m \le M\), be the global descriptors associated with the j-th configuration, where \(M\) is the number of global descriptors. We then form a matrix \(\boldsymbol A \in \mathbb{R}^{J \times M}\) with entries \(A_{jm} = d_{jm}/ N_j\) for \(j=1,\ldots,J\) and \(m=1,\ldots,M\). Moreover, we form a matrix \(\boldsymbol B \in \mathbb{R}^{\mathcal{N} \times M}\) by stacking the derivatives of the global descriptors for all training configurations from top to bottom, where \(\mathcal{N} = 3\sum_{j=1}^{J} N_j\).
The coefficient vector \(\boldsymbol c\) of the POD potential is found by solving the following least-squares problem
where \(w_E\) and \(w_F\) are weights for the energy (fitting_weight_energy) and force (fitting_weight_force), respectively; and \(w_R\) is the regularization parameter (fitting_regularization_parameter). Here \(\bar{\boldsymbol E}^{\star} \in \mathbb{R}^{J}\) is a vector of with entries \(\bar{E}^{\star}_j = E^{\star}_j/N_j\) and \(\boldsymbol F^{\star}\) is a vector of \(\mathcal{N}\) entries obtained by stacking \(\{\boldsymbol F^{\star}_j\}_{j=1}^{J}\) from top to bottom.
The training procedure is the same for both the linear and quadratic POD potentials. However, since the quadratic POD potential has a significantly larger number of the global descriptors, it is more expensive to train the linear POD potential. This is because the training of the quadratic POD potential still requires us to calculate and store the quadratic global descriptors and their gradient. Furthermore, the quadratic POD potential may require more training data in order to prevent over-fitting. In order to reduce the computational cost of fitting the quadratic POD potential and avoid over-fitting, we can use subsets of two-body and three-body PODs for constructing the new descriptors.
Restrictions
This command is part of the ML-POD package. It is only enabled if LAMMPS was built with that package. See the Build package page for more info.
Default
The keyword defaults are also given in the description of the input files.
(Grepl) Grepl, Maday, Nguyen, and Patera, ESAIM: Mathematical Modelling and Numerical Analysis 41(3), 575-605, (2007).
(Nguyen) Nguyen and Rohskopf, arXiv preprint arXiv:2209.02362 (2022).