Neural Network Potential#

Overview#

Tinker-GPU implements neural network potentials (NNPs) as native force-field components, following the Behler-Parrinello framework, with ANI-style descriptors. Each atom’s energy contribution is predicted by a species-specific feed-forward neural network whose input is an Atomic Environment Vector (AEV), a fixed-length descriptor of the local chemical environment. The total energy is a sum over all atoms in the designated groups:

\[U = \sum_i E_i(\mathbf{G}_i),\]

where \(\mathbf{G}_i\) is the AEV of atom i and \(E_i\) is the output of the element-specific network.

Two NNP term types are currently available:

  • NN Valence (NNVAL) — replaces classical bonded interactions (bonds, angles, torsions, etc.) for a designated group of atoms with a valence-type NNP.

  • NN Metal (NNMET) — adds an intermolecular NNP correction for metal ions in a designated group.

Both terms are integrated into the same resource-management, neighbor-list, energy, and gradient pipelines used by classical force-field terms.

Atomic Environment Vector#

The AEV consists of radial and angular symmetry functions.

Radial term. For each covered species s and radial shift index m, the contribution summed over neighbors j of species s is

\[G^{\mathrm rad}_{m,s} = \frac{1}{4}\sum_{j \in s} \exp\!\left[-\eta^{\mathrm rad} \left(r_{ij} - R_{m}\right)^2\right] f_c(r_{ij};\, R_{\mathrm cutoff}^{\mathrm rad}),\]

where \(r_{ij}\) is the interatomic distance, \(\eta^{\mathrm rad}\) is the Gaussian width, and \(R_m\) are evenly spaced radial shifts in \([R_0^{\mathrm rad},\,R_{\mathrm cutoff}^{\mathrm rad}]\) with \(M\) divisions.

Angular term. For each covered species pair (s1, s2), angular shift index p, and radial shift index q,

\[\begin{split}G^{\mathrm ang}_{p,q,s_1 s_2} = 2 \sum_{\substack{j \in s_1,\, k \in s_2 \\ j \neq k}} \!\left(\frac{1 + \cos(\theta_{ijk} - \theta_p)}{2}\right)^{\!\zeta^{\mathrm ang}} \exp\!\left[-\eta^{\mathrm ang} \!\left(\bar{r}_{ijk} - R_q\right)^{\!2}\right] f_c(r_{ij};\, R_{\mathrm cutoff}^{\mathrm ang})\, f_c(r_{ik};\, R_{\mathrm cutoff}^{\mathrm ang}),\end{split}\]

where \(\bar{r}_{ijk} = (r_{ij}+r_{ik})/2\), \(\zeta^{\mathrm ang}\) is the angular sharpness exponent, \(\eta^{\mathrm ang}\) is the angular radial Gaussian width, and \(R_q\) are evenly spaced in \([R_{0}^{\mathrm ang},\, R_{\mathrm cutoff}^{\mathrm ang}]\) with \(Q\) divisions. The angular shifts \(\theta_p\) are evenly spaced in \([0, \pi)\) with \(P\) divisions.

Cutoff function. Both radial and angular terms use a smooth cosine cutoff:

\[f_c(r;\, R) = \frac{1}{2}\cos\!\left(\frac{\pi r}{R}\right) + \frac{1}{2}.\]

AEV length. With \(N\) covered species, the total descriptor length per atom is

\[l_{\rm AEV} = N M + \frac{N(N+1)}{2} Q P.\]

A topological cutoff can optionally restrict neighbors entering the AEV to those within a given bond distance, independent of the spatial cutoff.

Network Architecture#

Each element species has an independent feed-forward network (MLP) composed of alternating linear (fully-connected) and celu (CELU activation) layers, with a final linear layer producing a scalar atomic energy. Taking the AEV \(\mathbf{G}_i\) as input \(\mathbf{a}^{(0)} = \mathbf{G}_i\), the forward pass for \(L\) hidden layers is

\[\begin{split}\mathbf{z}^{(l)} &= \mathbf{W}^{(l)} \mathbf{a}^{(l-1)} + \mathbf{b}^{(l)}, \quad l = 1, \dots, L, \\ \mathbf{a}^{(l)} &= \text{CELU}\!\left(\mathbf{z}^{(l)}\right), \quad l = 1, \dots, L-1, \\ E_i &= \mathbf{W}^{(L+1)} \mathbf{a}^{(L)} + b^{(L+1)},\end{split}\]

where \(\mathbf{W}^{(l)}\) and \(\mathbf{b}^{(l)}\) are the weight matrix and bias vector of the l-th linear layer, both stored in the parameter file. The final output \(E_i\) is a scalar atomic energy contribution.

The CELU activation function applied element-wise is

\[\begin{split}\text{CELU}(x) = \begin{cases} x & x \geq 0, \\ \alpha \left(e^{x/\alpha} - 1\right) & x < 0, \end{cases}\end{split}\]

where \(\alpha\) is a fixed parameter set in the parameter file.

Implementation Details#

Model definition and activation. NNP models are defined in blocks read from the parameter file (.prm) or keyword file (.key). Each block begins with nnp <type> and is followed by aev, nn, linear, and celu lines that specify the descriptor and network architecture. The nnterm keyword maps a potential type to atom groups and enables NNVAL and/or NNMET at runtime.

The NN spatial cutoff is set automatically as

\[r_{\rm cut}^{\rm NN} = \max_{p \in \mathrm{NNPs}} R_{\mathrm cutoff}^{(p)},\]

and stored in nncut for use by the neighbor-list infrastructure.

Species decomposition. For a selected group of atoms, atoms are sorted by atomic number. Subnetworks not needed by any present species are removed before device allocation (remove_nn_unneeded), keeping the AEV array contiguous in device memory while minimizing unnecessary computation.

NN-specific neighbor lists. When NNVAL or NNMET is active, Tinker9 allocates a dedicated spatial list (nnspatial_v2_unit) with nblist4nn=true. Unlike the standard symmetric pair lists used for classical terms, AEV descriptors require directed, per-center neighborhoods. A kernel (ref_nblist_cu) filters spatial neighbors by group membership, covered species, optional topological mask, and radial/angular cutoffs, producing two separate lists: nblist_rad and nblist_ang, used respectively by the radial and angular parts of the AEV kernel. The NN neighbor list is refreshed at the same frequency as other neighbor lists.

Angular stabilization. To regularize angular derivatives near collinear configurations, the AEV kernel applies the substitution

\[\theta_{ijk} \leftarrow \arccos(0.95\cos\theta_{ijk})\]

before evaluating the angular symmetry functions.

Forward pass and energy accumulation. Linear layers are computed via GPU matrix multiplication (genMatMul_cu); CELU activation and its derivative are evaluated in dedicated CUDA kernels. Per-atom scalar energies from all subnetworks are accumulated into the global energy buffer via addToEneBuf_cu, scaled by the global factor nnlambda.

Gradient computation. Forces are obtained by backpropagating from the scalar atomic-energy outputs through the network layers to yield \(\partial E / \partial \mathbf{G}\), then applying analytic AEV derivatives with respect to Cartesian coordinates in aev_cu. The same nnlambda factor scales all force contributions.

NN Valence (NNVAL). Classical bonded terms are computed in evalence_cu for the canonical force field. For each bonded interaction, a device predicate (atom_use_nn) checks whether any atom in the interaction belongs to an NN valence group. If true, that classical interaction is skipped and replaced by the NNP evaluation. This substitution is applied to bond stretching, angle bending, stretch-bend, Urey-Bradley, out-of-plane bending, improper dihedral, improper torsion, torsional angle, pi-orbital torsion, stretch-torsion, angle-torsion, and torsion-torsion terms. Skip counters are tracked on the host so that reported interaction counts in analysis output remain interpretable.

NN Metal (NNMET). NNMET reuses the same AEV and network machinery but accumulates energies and forces into dedicated intermolecular buffers (eng_buf_nnintermol, gx_nnintermol, etc.), dispatched through ennintermol_cu. It is reported as a separate energy component (NN Metal) in analysis output.

Note

Energy and gradient contributions are fully implemented for both NNVAL and NNMET. Virial accumulation for NN terms is not yet implemented.