Documentation

Revision as of 13:58, 2 November 2023 by Tigany Zarrouk (talk | contribs) (→‎Calculation mode)

TurboGAP is a program and associated collection of routines designed for carrying out atomistic calculations based on machine learning interatomic potentials. This page deals with the technical aspects of using TurboGAP; to learn more about the underlying theory, check the GAP theory page. Since it is often easier to learn by example, make sure to take a look at the tutorials to familiarize yourself with TurboGAP.

Calculation mode

There are two basic modes for running a TurboGAP calculation, turbogap predict and turbogap md (a third mode, turbogap mc, is described further down). They are invoked by typing turbogap predict or turbogap md on the command line or in a bash script (e.g., to run MD in parallel on 8 CPU cores: mpirun -np 8 turbogap md). Both execution modes require an input file with TurboGAP options, a gap_files directory with the GAP potential to be used in the calculation, and an XYZ file in ASE's extended XYZ format with atomic positions, lattice vectors and chemical species information (for MD, atomic velocities are also read from this file when present).
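As an illustration of the invocation described above, here is a minimal Python sketch that assembles the command line for a given mode and number of MPI ranks. The helper name turbogap_command and its arguments are hypothetical (they are not part of TurboGAP); only the turbogap executable, the mode names and the mpirun -np form come from this page.

```python
import shlex

# Mode names as documented on this page.
MODES = ("predict", "md", "mc")


def turbogap_command(mode, n_ranks=1, launcher="mpirun"):
    """Return the argument list for a TurboGAP run (hypothetical helper).

    With n_ranks > 1 the command is prefixed with '<launcher> -np <n_ranks>'.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {MODES}")
    if n_ranks < 1:
        raise ValueError("n_ranks must be a positive integer")
    argv = ["turbogap", mode]
    if n_ranks > 1:
        argv = [launcher, "-np", str(n_ranks)] + argv
    return argv


# Print the command instead of running it, so the sketch has no side effects.
print(shlex.join(turbogap_command("md", n_ranks=8)))
```

The returned list could be passed to subprocess.run(argv, check=True) from the working directory that holds the input file, the gap_files directory and the XYZ file.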

turbogap predict

turbogap predict performs a single-point calculation (i.e., the atomic positions are not updated during the simulation) of the total energy, local energies, forces and virial pressure. When available for the specific potential, it can also perform a Hirshfeld volume prediction. If the atoms file contains more than one configuration, in the form of concatenated individual atomic structures, TurboGAP will perform predictions for all of them.

turbogap md

turbogap md performs molecular dynamics (default) or energy minimization according to the options specified in the input file. Currently only Velocity-Verlet MD and gradient descent energy minimization are supported. We expect to add support for Monte Carlo and other simulation protocols in the near future. To choose between different methods to propagate the atomic positions, take a look at the optimize keyword. If there is more than one atomic structure in the XYZ file, turbogap md will use the first one as the starting point. Note how this differs from the behavior of turbogap predict, where single-point calculations are performed for all the structures in the XYZ file.
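To check in advance how many structures an XYZ file holds, and therefore what each mode will do with it, one can split the file into frames. The following is a minimal sketch that assumes every frame consists of an atom-count line, one comment line and then one line per atom; split_frames is a hypothetical helper, not a TurboGAP routine.

```python
def split_frames(text):
    """Split concatenated XYZ text into frames, each a list of lines."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i])
        frame = lines[i:i + n_atoms + 2]  # count line + comment line + atoms
        if len(frame) != n_atoms + 2:
            raise ValueError("truncated frame at line %d" % (i + 1))
        frames.append(frame)
        i += n_atoms + 2
    return frames


two_frames = """\
1
Lattice="10 0 0 0 10 0 0 0 10" Properties=species:S:1:pos:R:3
H 0.0 0.0 0.0
2
Lattice="10 0 0 0 10 0 0 0 10" Properties=species:S:1:pos:R:3
H 0.0 0.0 0.0
H 0.0 0.0 0.74
"""

frames = split_frames(two_frames)
print("structures in file:", len(frames))        # predict handles all of them
print("atoms in first frame:", frames[0][0])     # md starts from this one
```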

turbogap mc

turbogap mc performs (Grand-Canonical) Monte-Carlo simulations in the NVT, NPT, μVT or μPT ensembles. Hybrid MC (using molecular dynamics to produce a trial move) is supported, as is relaxation after specific trial moves. The user can specify a large number of move types. For reference on the specification see Monte-Carlo. The outputs are mc.log (the log file), mc_all.xyz (all of the accepted MC steps), mc_trial.xyz (the current trial move) and mc_current.xyz (the current accepted step). The .xyz files are written every N steps, as set by write_xyz = N.

Files

Input file (input)

The input file contains the keywords that tell TurboGAP how to perform the single-point or MD calculation requested by the user. A minimal input file (without MD options) contains only information about the structure XYZ file, the location of the potential, and chemical species. An example looks like this:

! Species-specific info
atoms_file = 'atoms.xyz'
pot_file = 'gap_files/cho.gap'
n_species = 3
species = H C O
masses = 1.01 12.01 16.00 ! this is optional for single point, for MD TurboGAP will try to get them from a database if not provided 
e0 = 0. 0. 0. ! this is optional, to specify per-species energy offsets

For a single-point turbogap predict calculation, something like the above is all that is needed. For running MD and other specialized simulations one needs to additionally specify the appropriate keywords. Check MD options for a complete list.
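When many calculations need to be set up, the minimal input file shown above can be generated from a script. The sketch below only emits the keywords that appear in the example (atoms_file, pot_file, n_species, species, masses, e0); the function render_input is a hypothetical helper and the number formatting is an assumption.

```python
def render_input(atoms_file, pot_file, species, masses=None, e0=None):
    """Return the text of a minimal TurboGAP input file (hypothetical helper)."""
    lines = [
        "! Species-specific info",
        f"atoms_file = '{atoms_file}'",
        f"pot_file = '{pot_file}'",
        f"n_species = {len(species)}",
        "species = " + " ".join(species),
    ]
    # masses and e0 are optional, one value per species when given.
    if masses is not None:
        if len(masses) != len(species):
            raise ValueError("need one mass per species")
        lines.append("masses = " + " ".join(f"{m:.2f}" for m in masses))
    if e0 is not None:
        if len(e0) != len(species):
            raise ValueError("need one energy offset per species")
        lines.append("e0 = " + " ".join(str(e) for e in e0))
    return "\n".join(lines) + "\n"


text = render_input(
    "atoms.xyz", "gap_files/cho.gap", ["H", "C", "O"],
    masses=[1.01, 12.01, 16.00],
)
print(text, end="")
```

Writing the returned string to a file named input in the working directory reproduces the example, minus the inline comments and the optional e0 line.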

Atoms file (*.xyz)

The atoms file is an atomic structure file in ASE's extended XYZ format. TurboGAP (currently) works exclusively in periodic boundary conditions; this must be taken into consideration when simulating molecular systems or surfaces (i.e., the simulation cell must include an appropriate amount of vacuum). The format of the XYZ file must conform to the following:

Number_of_atoms
Comment line including Lattice="ax ay az bx by bz cx cy cz" and Properties=species:S:1:pos:R:3[:vel:R:3]
Atom_name_1   posx posy posz (velx vely velz)
Atom_name_2   posx posy posz (velx vely velz)
...
Atom_name_nat posx posy posz (velx vely velz)

where the velocity information is needed for MD (TurboGAP will generate random velocities if not provided). The positions must be in units of Angstrom, the velocities in Angstrom/fs and the masses in amu. TurboGAP XYZ reading adheres strictly to extXYZ format, with "species" (S:1), "pos" (R:3), "vel" (R:3), "fix_atom" (S:3, with values F or T allowed) and "mass" (R:1) read from the Properties attribute. "positions", "velocities", "fix_atoms" and "masses" are used as synonyms for "pos", "vel", "fix_atom" and "mass", respectively.
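The layout above can be produced with a few lines of plain Python. This is a minimal sketch that writes only the species, pos and (optionally) vel columns; format_extxyz is a hypothetical helper, and the number of decimals is an arbitrary choice. Positions are in Angstrom and velocities in Angstrom/fs, as stated above.

```python
def format_extxyz(species, positions, lattice, velocities=None):
    """Return one extended-XYZ frame as a string (hypothetical helper).

    lattice is a 3x3 nested sequence holding the a, b and c vectors as rows.
    """
    n = len(species)
    if len(positions) != n:
        raise ValueError("need one position per atom")
    if velocities is not None and len(velocities) != n:
        raise ValueError("need one velocity per atom")

    lattice_str = " ".join(f"{x:g}" for row in lattice for x in row)
    props = "species:S:1:pos:R:3"
    if velocities is not None:
        props += ":vel:R:3"

    lines = [str(n), f'Lattice="{lattice_str}" Properties={props}']
    for i, name in enumerate(species):
        cols = list(positions[i])
        if velocities is not None:
            cols += list(velocities[i])
        lines.append(name + " " + " ".join(f"{c:.8f}" for c in cols))
    return "\n".join(lines) + "\n"


# An H2 molecule in a 10 Angstrom cubic box, which leaves vacuum around it.
box = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
frame = format_extxyz(["H", "H"], [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)], box)
print(frame, end="")
```

If ASE is installed, its own extended XYZ writer is the more robust route; the sketch is only meant to make the column layout explicit.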

Potential directory (gap_files/)

The GAP potential files are usually put into a subdirectory under your working directory named gap_files. This subdirectory contains a set of files generated with QUIP's gap_fit program, with XML extension, as well as a number of other files. The XML files are often enough to run a GAP calculation with QUIP, with the notable exception of potentials with vdW corrections, which might need some preprocessing before they can be used with QUIP. The *.gap file tells TurboGAP how to use the different files to run a GAP calculation. When using these files with TurboGAP you do not need to worry about preprocessing; they're ready to go.

Output files

Besides standard output (basic messages, progress bar, etc.) that you get printed to stdout, TurboGAP produces one or two output files, depending on whether you are running a static calculation (turbogap predict) or molecular dynamics (turbogap md). Output file trajectory_out.xyz is always written out, and it contains atomic positions, predicted energy and forces, etc. Output file thermo.log is only written when doing MD, and it contains basic thermodynamic information (energy, temperature, pressure, etc.). One can control the frequency with which each file is written (for MD only) with write_xyz and write_thermo, respectively. For finer control over which properties are written and which are not, refer to writeouts.

Parallel support

Parallel support in TurboGAP is provided specifically via MPI. To build the TurboGAP code with MPI support you need an MPI-enabled Fortran compiler. TurboGAP is routinely tested and works reliably with the gfortran MPI wrapper, usually called mpif90. It should also be possible to build TurboGAP with Intel's mpifort, but we do not usually test the code with it. Note that the BLAS/LAPACK libraries used by TurboGAP should be compiled with the same compiler suite used to build TurboGAP, to ensure compatibility. Also note that OpenMP support may be available through BLAS/LAPACK. In that case, hybrid MPI/OpenMP TurboGAP execution can be achieved, but be mindful that OpenMP acceleration can only be exploited for energy and force evaluation, not descriptor construction. We recommend running TurboGAP with MPI-only parallelization; in our tests, MPI performance was superior to hybrid MPI/OpenMP. Since the system architecture and the details of the potential might affect performance, run your own tests to evaluate whether you gain any speedup from BLAS/LAPACK's threading support.
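One way to organize such a test is to enumerate every split of a fixed core count into MPI ranks and threads per rank, then time the same short run for each split. The sketch below only builds the list of runs. The helper names are hypothetical, and it assumes that the threaded BLAS/LAPACK in use honors the OMP_NUM_THREADS environment variable (some libraries use their own variable instead).

```python
def layouts(n_cores):
    """All (mpi_ranks, threads_per_rank) pairs that use exactly n_cores cores."""
    return [(n_cores // t, t) for t in range(1, n_cores + 1) if n_cores % t == 0]


def benchmark_runs(n_cores, mode="md"):
    """Return (extra_env, argv) pairs, one per layout (hypothetical helper)."""
    runs = []
    for ranks, threads in layouts(n_cores):
        # Assumption: the BLAS/LAPACK thread count follows OMP_NUM_THREADS.
        extra_env = {"OMP_NUM_THREADS": str(threads)}
        argv = ["mpirun", "-np", str(ranks), "turbogap", mode]
        runs.append((extra_env, argv))
    return runs


for extra_env, argv in benchmark_runs(8):
    print(extra_env["OMP_NUM_THREADS"], "thread(s) per rank:", " ".join(argv))
```

Each entry can then be timed, for example with time.perf_counter() around subprocess.run(argv, env={**os.environ, **extra_env}, check=True). The first entry is the MPI-only layout recommended above.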